Closed ic-xu closed 6 months ago
I get an exception as follows:
/python3.10/site-packages/openparse/tables/pymupdf/parse.py", line 25, in output_to_markdown
    markdown_output = "| " + " | ".join(headers) + " |\n"
TypeError: sequence item 2: expected str instance, NoneType found
When parsing PDF tables, the output format is set to
table_args={ "parsing_algorithm": "pymupdf", "table_output_format": "markdown" }
After some analysis, I think the cause is the following. When the headers of the table contain None entries, for example:
headers = ['(See Note 11)', '', None, None]
Then execute the following code
markdown_output = "| " + " | ".join(headers) + " |\n"
markdown_output += "|---" * len(headers) + "|\n"
This raises the TypeError shown above, because str.join() only accepts strings and the list contains None.
So my solution is to replace each None with ' ' before joining, which avoids the error.
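A minimal sketch of that workaround, for illustration only. The helper names `sanitize_cells` and `headers_to_markdown` are hypothetical and are not part of openparse; the two formatting lines are copied from the snippet above, and the real `output_to_markdown` may handle more than the header row.

```python
from typing import List, Optional


def sanitize_cells(cells: List[Optional[str]]) -> List[str]:
    """Replace None cells with a single space so str.join() accepts them."""
    # Hypothetical helper, not an openparse function.
    return [" " if cell is None else str(cell) for cell in cells]


def headers_to_markdown(headers: List[Optional[str]]) -> str:
    """Build the markdown header row and separator row from a header list."""
    # Hypothetical stand-in for the header part of output_to_markdown.
    safe_headers = sanitize_cells(headers)
    markdown_output = "| " + " | ".join(safe_headers) + " |\n"
    markdown_output += "|---" * len(safe_headers) + "|\n"
    return markdown_output


# The header list from this issue no longer raises TypeError.
headers = ['(See Note 11)', '', None, None]
print(headers_to_markdown(headers))
```

The same replacement would presumably be needed for the body rows if they can also contain None, but I have not checked that.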