Open wangguo1230 opened 3 days ago
Is there an existing issue for the same bug?
- [x] I have checked the existing issues.
Branch name
main
Commit ID
Other environment information
OS type: Windows 11
Actual behavior
An error occurred while parsing the PDF:
Traceback (most recent call last):
  File "D:\pythonprojects\ragflow\deepdoc\parser\pdf_parser.py", line 1175, in <module>
    test = ragflow("C:\Users\wangg\Desktop\3.pdf")
  File "D:\pythonprojects\ragflow\deepdoc\parser\pdf_parser.py", line 1031, in __call__
    self._concat_downward()
  File "D:\pythonprojects\ragflow\deepdoc\parser\pdf_parser.py", line 516, in _concat_downward
    dfs(boxes[0], 1)
  File "D:\pythonprojects\ragflow\deepdoc\parser\pdf_parser.py", line 507, in dfs
    fea = self._updown_concat_features(up, down)
  File "D:\pythonprojects\ragflow\deepdoc\parser\pdf_parser.py", line 113, in _updown_concat_features
    up["text"][-1] + down["text"][0]) else "")
IndexError: string index out of range
Expected behavior
No response
Steps to reproduce
if __name__ == "__main__":
    ragflow = RAGFlowPdfParser()
    test = ragflow("3.pdf")
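For anyone triaging this: the failing expression in the traceback indexes `up["text"][-1]` and `down["text"][0]`, and indexing an empty string raises exactly this `IndexError`. The snippet below is a minimal sketch of that failure mode and of one possible guard. It assumes one of the two boxes has empty text, which the traceback suggests but this issue does not confirm. `edge_chars` is a hypothetical helper for illustration, not a function in RAGFlow and not necessarily how the project would fix it.

```python
def edge_chars(up_text: str, down_text: str) -> str:
    """Join the last character of up_text with the first of down_text.

    Hypothetical helper. Slicing (text[-1:], text[:1]) returns "" for an
    empty string, whereas indexing (text[-1], text[0]) raises IndexError.
    """
    return up_text[-1:] + down_text[:1]


def index_error_message(text: str) -> str:
    """Return the IndexError message produced by text[0], or "" if none."""
    try:
        text[0]
    except IndexError as exc:
        return str(exc)
    return ""


if __name__ == "__main__":
    # Indexing an empty string reproduces the message from the traceback.
    print(index_error_message(""))
    # The slice-based version tolerates empty text on either side.
    print(repr(edge_chars("abc", "def")))
    print(repr(edge_chars("abc", "")))
```

With slicing, an empty box contributes an empty string instead of raising, so the caller can decide how to treat a missing boundary character. Whether an empty box should be skipped earlier in the pipeline is a separate question for the maintainers.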
Additional information
Uploading 3.pdf…
PDF link is incorrect.
Sorry, it is this one: 6.pdf @Feiue
Commit ID
0cb588f7