Open aleksblendwerk opened 1 month ago
@aleksblendwerk - first of all, thanks for giving my library a try and reporting the bug!
Is PHP reporting any error? Is the gzip'ed XML properly formatted? What's the exit code of that script when you run it?
@aleksblendwerk - first of all, thanks for giving my library a try and reporting the bug!
You're welcome!
Is PHP reporting any error? Is the gzip'ed XML properly formatted? What's the exit code of that script when you run it?
PHP doesn't report any error and the process just exits normally, exit code 0. A timestamp I echo after the fclose is also printed.
The XML should be fine, I successfully loaded it using PHP's built-in XMLReader.
One thing I noticed in the given XML file is that within the label nodes it might contain a sublabels node with child nodes called label again. Maybe that's a case you haven't encountered with your parser before.
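For anyone who wants to check the nesting theory, here is a minimal sketch using only the built-in XMLReader. It counts label elements per nesting depth, so top-level labels and the ones inside sublabels show up separately. The inline sample is a hypothetical miniature: only the element names label and sublabels come from this thread, while the ids, names and the helper function countLabelsByDepth are made up for illustration.

```php
<?php
// Hypothetical miniature of the labels dump. Only the element names
// "label" and "sublabels" come from the thread; all values are invented.
$sample = <<<XML
<labels>
  <label><id>1</id><name>Parent A</name>
    <sublabels><label id="2">Child A1</label><label id="3">Child A2</label></sublabels>
  </label>
  <label><id>4</id><name>Parent B</name></label>
</labels>
XML;

/**
 * Walk the whole document and count <label> elements per nesting depth.
 * Returns an array mapping depth => number of <label> elements found there.
 */
function countLabelsByDepth(XMLReader $reader): array
{
    $counts = [];
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'label') {
            $counts[$reader->depth] = ($counts[$reader->depth] ?? 0) + 1;
        }
    }
    $reader->close();

    return $counts;
}

$reader = new XMLReader();
$reader->XML($sample);
// For the real dump, assuming it was downloaded next to this script and the
// zlib extension is available, the gzip file could be streamed instead:
// $reader->open('compress.zlib://discogs_20240701_labels.xml.gz');

$counts = countLabelsByDepth($reader);
print_r($counts);
```

If a parser under test reports a node count close to the top-level figure only, or stops near the first nested label, that would point at the sublabels case. This is a diagnostic aid under the assumptions above, not a statement about how the library handles nesting.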
One thing I noticed in the given XML file is that within the label nodes it might contain a sublabels node with child nodes called label again. Maybe that's a case you haven't encountered with your parser before.
Might be. Can you submit the XML you're trying to parse? Or at least a small sample that can be used to reproduce the problem?
Might be. Can you submit the XML you're trying to parse? Or at least a small sample that can be used to reproduce the problem?
It is the file I linked in the initial post:
https://discogs-data-dumps.s3-us-west-2.amazonaws.com/data/2024/discogs_20240701_labels.xml.gz
As for providing a small sample to reproduce it, that would probably require me to dig in too deep right now.
Hi there,
As I am currently looking to speed up my database import code for Discogs' dump files, I just tried your library with this file: https://discogs-data-dumps.s3-us-west-2.amazonaws.com/data/2024/discogs_20240701_labels.xml.gz. I might be using it wrong, but parsing seems to stop after a couple thousand nodes.
This is more or less my code:
The output ends with
Somehow parsing suddenly ends at about 1% into the file.
I haven't investigated this further yet and will look elsewhere for now, but I thought I'd report it.