Filimoa / open-parse

Improved file parsing for LLM’s
https://filimoa.github.io/open-parse/
MIT License
2.55k stars 100 forks

TypeError: sequence item 13: expected str instance, NoneType found #31

Closed mingzhang798 closed 7 months ago

mingzhang798 commented 7 months ago

Description

When I use markdown mode, I get the following error:

Traceback (most recent call last):
  File "C:/Program Files/JetBrains/PyCharm 2023.3.5/plugins/python/helpers/pydev/pydevd.py", line 1534, in _exec
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2023.3.5\plugins\python\helpers\pydev_pydev_imps_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:\Project\pdf_parse_Project\pdf_open_parse.py", line 34, in <module>
    custom_10k = parser.parse(path)
  File "C:\Users\anaconda3\envs\open_parse\lib\site-packages\openparse\doc_parser.py", line 106, in parse
    table_elems = tables.ingest(doc, table_args_obj, verbose=self._verbose)
  File "C:\Users\anaconda3\envs\open_parse\lib\site-packages\openparse\tables\parse.py", line 221, in ingest
    return _ingest_with_pymupdf(doc, parsing_args, verbose)
  File "C:\Users\anaconda3\envs\open_parse\lib\site-packages\openparse\tables\parse.py", line 59, in _ingest_with_pymupdf
    text = pymupdf.output_to_markdown(headers, lines)
  File "C:\Users\anaconda3\envs\open_parse\lib\site-packages\openparse\tables\pymupdf\parse.py", line 25, in output_to_markdown
    markdown_output = "| " + " | ".join(headers) + " |\n"
TypeError: sequence item 13: expected str instance, NoneType found

Example Code

No response

Filimoa commented 7 months ago

Fixed with #32
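
For readers hitting the same traceback: the failing line passes the table headers straight to `str.join`, which raises `TypeError` as soon as one header cell is `None`. The sketch below reproduces that failure on a small list and shows one possible guard. It is an illustration only and not necessarily the change made in #32; the helper name `headers_to_markdown_row` and the sample headers are hypothetical.

```python
from typing import Optional, Sequence


def headers_to_markdown_row(headers: Sequence[Optional[str]]) -> str:
    """Build a Markdown table header row, treating None cells as empty."""
    # Hypothetical helper: map None to "" so str.join only ever sees strings.
    cells = ["" if h is None else str(h) for h in headers]
    return "| " + " | ".join(cells) + " |\n"


# Sample data (made up): a header list with one missing cell.
headers = ["Year", None, "Revenue"]

# The unguarded join, as in the traceback, fails on the None entry.
try:
    "| " + " | ".join(headers) + " |\n"
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# The guarded version renders the missing header as an empty column.
row = headers_to_markdown_row(headers)
```

Rendering a missing header as an empty cell keeps the column count intact, so the data rows below it still line up.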