opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
https://opendatalab.com/OpenSourceTools
GNU Affero General Public License v3.0
11.37k stars 851 forks

Why are tables output as LaTeX, and how do I specify Markdown format? #434

Open HSIAOKUOWEI opened 1 month ago

HSIAOKUOWEI commented 1 month ago

Description of the bug

\begin{tabular}{|c|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|}\hline \multicolumn{2}{|c|}{} & \thead{Hsin-Chu} & \thead{Chu-Nan} & \thead{Chu-Nan} & \thead{Chu-Nan} & \thead{Chu-Nan} & \thead{Tong-Luo} & \thead{Tong-Luo} & \thead{Suzcho} & \thead{Suzhou} & \thead{Suzhou}\\multicolumn{2}{|c|}{} & \thead{Headqua\ der} & \thead{Fab 1} & \thead{Fab 2} & \thead{Fab 3} & \thead{Fab 4} & \thead{Fab 5} & \thead{Fab1} & \thead{Fab2} & \thead{Fab1} & \thead{Fab2} & \thead{Fab3}\\hline \thead{Total Floor} & \thead{Space} (m) & \thead{28,100} & \thead{82,500} & \thead{48,600} & \thead{65,000} & \thead{60,700} & \thead{31,200} & \thead{59,300} & \thead{58,700} & \thead{43,600} & \thead{29,000} & \thead{39,700} \\hline \thead{Clean\ Room} & \thead{Space} (m) & \thead{8,800} & \thead{34,600} & \thead{25,800} & \thead{30,100} & \thead{35,200} & \thead{19,500} & \thead{26,700} & \thead{26,000} & \thead{21,200} & \thead{19,100} & \thead{5,100} \\cline{2-14}& \thead{Cleanness} & \multicolumn{14}{c|}{\thead{Class 10/100/100/10K}}\\hline \end{tabular}

How to reproduce the bug

$ \begin{tabular}{|c|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|}\hline \multicolumn{2}{|c|}{} & \thead{Hsin-Chu} & \thead{Chu-Nan} & \thead{Chu-Nan} & \thead{Chu-Nan} & \thead{Chu-Nan} & \thead{Tong-Luo} & \thead{Tong-Luo} & \thead{Suzcho} & \thead{Suzhou} & \thead{Suzhou}\\multicolumn{2}{|c|}{} & \thead{Headqua\ der} & \thead{Fab 1} & \thead{Fab 2} & \thead{Fab 3} & \thead{Fab 4} & \thead{Fab 5} & \thead{Fab1} & \thead{Fab2} & \thead{Fab1} & \thead{Fab2} & \thead{Fab3}\\hline \thead{Total Floor} & \thead{Space} (m) & \thead{28,100} & \thead{82,500} & \thead{48,600} & \thead{65,000} & \thead{60,700} & \thead{31,200} & \thead{59,300} & \thead{58,700} & \thead{43,600} & \thead{29,000} & \thead{39,700} \\hline \thead{Clean\ Room} & \thead{Space} (m) & \thead{8,800} & \thead{34,600} & \thead{25,800} & \thead{30,100} & \thead{35,200} & \thead{19,500} & \thead{26,700} & \thead{26,000} & \thead{21,200} & \thead{19,100} & \thead{5,100} \\cline{2-14}& \thead{Cleanness} & \multicolumn{14}{c|}{\thead{Class 10/100/100/10K}}\\hline \end{tabular} $

Operating system

Windows

Python version

3.10

Software version (magic-pdf --version)

0.7.x

Device mode

cuda

drunkpig commented 1 month ago

Tables are currently output as LaTeX. We will support HTML output next week. Markdown tables will not be supported because Markdown cannot represent merged cells.
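To illustrate the point about cell merging, here is a minimal sketch, not MinerU's implementation. It assumes a hypothetical in-memory table where each cell is a `(text, colspan)` pair, using a small made-up grid loosely modeled on the table in the report. The names `rows`, `to_html`, and `to_markdown` are invented for this example. HTML can carry the merge through the `colspan` attribute, while a Markdown pipe table can only pad the merged cell with empty cells, so the merge information is lost.

```python
from html import escape

# Hypothetical table model for illustration only: each cell is (text, colspan).
rows = [
    [("", 2), ("Hsin-Chu", 1), ("Chu-Nan", 2)],
    [("Total Floor Space", 2), ("28,100", 1), ("82,500", 1), ("48,600", 1)],
]


def to_html(rows):
    """Render the rows as an HTML table, keeping merges via colspan."""
    lines = ["<table>"]
    for row in rows:
        cells = []
        for text, colspan in row:
            attr = f' colspan="{colspan}"' if colspan > 1 else ""
            cells.append(f"<td{attr}>{escape(text)}</td>")
        lines.append("  <tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


def to_markdown(rows):
    """Render the rows as a pipe table.

    Pipe tables have no colspan syntax, so each merged cell is padded
    with empty cells and the fact that it was merged is lost.
    """
    out = []
    for i, row in enumerate(rows):
        cells = []
        for text, colspan in row:
            cells.append(text)
            cells.extend([""] * (colspan - 1))
        out.append("| " + " | ".join(cells) + " |")
        if i == 0:
            # Separator row required after the header row.
            out.append("|" + " --- |" * len(cells))
    return "\n".join(out)


if __name__ == "__main__":
    print(to_html(rows))
    print()
    print(to_markdown(rows))
```

In the HTML output the "Chu-Nan" header still spans two columns. In the Markdown output it occupies one column followed by an empty one, which a reader cannot distinguish from a genuinely empty cell.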