Closed Honesty-of-the-Cavernous-Tissue closed 3 months ago
I just edited your comment to replace the URL with the raw data, but I still cannot reproduce the bug with XML output. Are you using any particular options?
Sorry, I found out it seems to be related to the Python version: my environment is 3.12.0, and there's no error in 3.9.18.
My bad, the bug occurs when Trafilatura is used from Python; the CLI suppresses the error.
trafilatura: 1.12.1
Raised by: https://raw.githubusercontent.com/Honesty-of-the-Cavernous-Tissue/trafilatura/master/tests/test.html
ValueError: invalid literal for int() with base 10: '' from: https://github.com/adbar/trafilatura/blob/14c79c062bc331632de7a164477b45522b2150d0/trafilatura/xml.py#L321
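For reference, this exact message is what Python produces when `int()` receives an empty string. The sketch below only reproduces that error class in isolation. It does not show the actual code at `xml.py#L321`, and `to_int_or_default` and `error_message_for` are hypothetical helpers for illustration, not part of Trafilatura.

```python
def error_message_for(value):
    """Return the ValueError message int() raises for value, or None if it parses."""
    try:
        int(value)
    except ValueError as exc:
        return str(exc)
    return None


def to_int_or_default(value, default=0):
    """Hypothetical helper: parse value as int, falling back to default
    when the input is empty, None, or otherwise non-numeric."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


if __name__ == "__main__":
    # An empty string triggers the same message as in the traceback.
    print(error_message_for(""))
    # A guarded conversion avoids the exception (assumed fallback of 0).
    print(to_int_or_default(""), to_int_or_default("3"))
```

Whether a fallback value is appropriate depends on what the empty string represents in the parsed document, so treat the default of 0 as an assumption rather than a proposed fix.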