philipmat / discogs-xml2db

Imports the discogs.com monthly XML dumps into databases
Apache License 2.0

Hexadecimal issue with experimental C# Parser #126

Open sdaemi opened 4 years ago

sdaemi commented 4 years ago

Hi there

Apologies in advance for any inaccuracies, I'm new to this :)

I'm trying to parse the discogs_20110107_releases.xml file, but at about 16% I get the following error:

Unhandled exception. System.Xml.XmlException: '', hexadecimal value 0x07, is an invalid character. Line 5635965, position 485.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.Throw(String res, String[] args)
   at System.Xml.XmlTextReaderImpl.ParseText(Int32& startPos, Int32& endPos, Int32& outOrChars)
   at System.Xml.XmlTextReaderImpl.ParseText()
   at System.Xml.XmlTextReaderImpl.ParseElementContent()
   at System.Xml.XmlWriter.WriteNode(XmlReader reader, Boolean defattr)
   at System.Xml.XmlReader.ReadOuterXmlAsync()
   at discogs.Parser1.ParseStreamAsync(Stream stream)
   at discogs.Parser1.ParseFileAsync(String fileName)
   at discogs.Program.ParseAsync[T](String fileName, RunOptions options)
   at discogs.Program.ParseFile(String fileName, RunOptions options)
   at discogs.Program.Main(String[] args)
   at discogs.Program.(String[] args)
Abort trap: 6

This seems to happen with a few other dumps as well.

Any help on this would be much appreciated! Thanks!

philipmat commented 4 years ago

Thank you for the report. Two questions:

  1. What operating system and
  2. Does it happen with the most recent release file (202009)?

sdaemi commented 3 years ago

Hi,

Sorry for the late reply. It works fine with the latest dump, thanks for the great tool!
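
For anyone who still needs one of the older dumps: the exception above is raised because 0x07 (the BEL control character) is outside the character range that XML 1.0 allows, so a strict parser rejects the file. One possible workaround, sketched here in Python and not part of discogs-xml2db, is to write a cleaned copy of the dump with such characters removed and then parse the copy. The names `strip_invalid_xml_chars` and `clean_dump` are hypothetical, and the sketch assumes the dump has already been decompressed and is UTF-8 encoded. Note that it silently drops characters, so the cleaned text differs slightly from the original.

```python
import re

# Everything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with characters that XML 1.0 forbids removed."""
    return _INVALID_XML_CHARS.sub("", text)


def clean_dump(src_path: str, dst_path: str, encoding: str = "utf-8") -> None:
    """Copy src_path to dst_path line by line, dropping invalid XML characters.

    Hypothetical helper: the paths and the UTF-8 default are assumptions.
    newline="" keeps the original line endings, so line numbers in later
    parser errors still match the source file.
    """
    with open(src_path, "r", encoding=encoding, errors="replace", newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            dst.write(strip_invalid_xml_chars(line))
```

A call such as `clean_dump("discogs_20110107_releases.xml", "releases_clean.xml")` would produce a copy that the parser can then be pointed at; the output file name is only an example.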