ssut / py-hanspell

A Python library for Korean spell checking. (Uses the Naver spell checker)
MIT License
331 stars 117 forks

ParseError #19

Open bbq12340 opened 3 years ago

bbq12340 commented 3 years ago

Traceback (most recent call last):
  File "main.py", line 47, in <module>
    df = process_data(df)
  File "main.py", line 19, in process_data
    processed_reviews = process_reviews(cleaned_reviews)
  File "/Users/bbq12340/dev/sentimentAnalysis/process.py", line 25, in process_reviews
    review = spell_checker.check(review).as_dict()['checked']
  File "/Users/bbq12340/dev/sentimentAnalysis/env/lib/python3.8/site-packages/hanspell/spell_checker.py", line 68, in check
    'checked': _remove_tags(html),
  File "/Users/bbq12340/dev/sentimentAnalysis/env/lib/python3.8/site-packages/hanspell/spell_checker.py", line 27, in _remove_tags
    result = ''.join(ET.fromstring(text).itertext())
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/xml/etree/ElementTree.py", line 1320, in XML
    parser.feed(text)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 226

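The error is raised while the library parses the spell checker's response. The library should guard against this so that a single malformed response does not abort the whole run.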
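For illustration, here is a minimal sketch of how the tag-stripping step could tolerate input that is not well-formed XML. It is not the library's actual code: `remove_tags_tolerant` and the `<content>` wrapper element are hypothetical names, and the assumption that the failure comes from something like a bare `&` or a control character in the returned markup is a guess based only on the "invalid token" message above.

```python
import html
import re
import xml.etree.ElementTree as ET

# Crude pattern for anything that looks like a tag; it would also eat
# text such as "1 < 2 and 3 > 1", so it is only used as a fallback.
_TAG_RE = re.compile(r"<[^>]+>")


def remove_tags_tolerant(text):
    """Return `text` with markup removed, even if it is not well-formed XML.

    Hypothetical helper, not part of hanspell.
    """
    # Wrap in a single root element so fragments with several top-level
    # nodes can be parsed. "content" is an arbitrary name.
    wrapped = "<content>{}</content>".format(text)
    try:
        return "".join(ET.fromstring(wrapped).itertext())
    except ET.ParseError:
        # Fallback: strip tag-like spans with a regex, then decode
        # any HTML entities that remain.
        return html.unescape(_TAG_RE.sub("", text))


print(remove_tags_tolerant("<em>a</em> b"))
print(remove_tags_tolerant("Tom & <em>Jerry</em>"))
```

Until something along these lines lands in the library, a caller-side workaround is to wrap the call from the traceback, `spell_checker.check(review).as_dict()['checked']`, in `try`/`except xml.etree.ElementTree.ParseError` and keep the unchecked `review` when parsing fails.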

oneonlee commented 3 years ago

I'm getting the same error too.

JINHEE-KANG commented 2 years ago

I'm getting the same error as well.