weka511 / nlp

My experiments with Natural Language Processing. I've created a few programs to try out concepts.
GNU General Public License v3.0

Parsing errors in blogs.zip #37

Closed weka511 closed 1 year ago

weka511 commented 1 year ago

E.g.

ExpatError: blogs/1000866.female.17.Student.Libra.xml 83 103 not well-formed (invalid token): line 103, column 225
ExpatError: blogs/1004904.male.23.Arts.Capricorn.xml 83 437 not well-formed (invalid token): line 437, column 24
ExpatError: blogs/1005076.female.25.Arts.Cancer.xml 83 309 not well-formed (invalid token): line 309, column 345
ExpatError: blogs/1005545.male.25.Engineering.Sagittarius.xml 83 655 not well-formed (invalid token): line 655, column 796
ExpatError: blogs/1007188.male.48.Religion.Libra.xml 83 165 not well-formed (invalid token): line 165, column 269
ExpatError: blogs/100812.female.26.Architecture.Aries.xml 83 130 not well-formed (invalid token): line 130, column 342
ExpatError: blogs/1008329.female.16.Student.Pisces.xml 83 127 not well-formed (invalid token): line 127, column 568
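For anyone wanting to reproduce a list like this, here is a minimal sketch that scans a zip archive and records every XML member that Expat rejects. It is not the code from this repository: the function name `find_parse_errors` and the choice to parse with `xml.parsers.expat` directly are assumptions made for illustration, and the sketch only locates the failures, it does not try to repair them.

```python
import zipfile
from xml.parsers.expat import ExpatError, ParserCreate


def find_parse_errors(zip_path):
    """Yield (member_name, line, column, message) for each XML member
    of the archive that Expat cannot parse.

    Hypothetical helper for illustration; not part of the repository.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.endswith('.xml'):
                continue
            # A parser object can only be used for one document,
            # so create a fresh one per member.
            parser = ParserCreate()
            try:
                parser.Parse(archive.read(name), True)
            except ExpatError as err:
                # lineno and offset locate the first offending token.
                yield name, err.lineno, err.offset, str(err)


if __name__ == '__main__':
    for name, line, column, message in find_parse_errors('blogs.zip'):
        print(f'ExpatError: {name} {message}')
```

Opening a reported file at the given line and column is usually the quickest way to see which character or entity the parser objected to.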