VikParuchuri / marker

Convert PDF to markdown quickly with high accuracy
https://www.datalab.to
GNU General Public License v3.0
17.74k stars 1.02k forks

Images #195

Open famda opened 5 months ago

famda commented 5 months ago

Hi, first of all, let me say that this is awesome work!

I'm experimenting with marker on some challenging PDF documents, and the overall results are great. However, I have some questions regarding images.

Is it possible to embed images in the final Markdown document?

I noticed that when images are next to each other, they are joined into a single image. Is it possible to keep them separate?

Thanks in advance.

famda commented 5 months ago

Another issue I noticed is that, in some cases, images are split in half. Is there any way to avoid this?

drunkpig commented 4 months ago

@famda try MinerU (https://github.com/opendatalab/MinerU). This tool has a powerful layout detection model and a strong post-processing pipeline.
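On the first question (embedding images in the final Markdown), one possible workaround is a post-processing step that replaces local image links with base64 data URIs. The sketch below is not a marker feature and does not use marker's API. It assumes the converter has written the images as separate files next to the Markdown file and references them with standard `![alt](path)` syntax. The function name `embed_images` and the file names in the usage note are hypothetical.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Matches Markdown image references such as ![alt text](figure.png).
IMAGE_REF = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def embed_images(markdown: str, image_dir: Path) -> str:
    """Replace local image links with inline base64 data URIs.

    Hypothetical helper: assumes image paths in the Markdown are
    relative to image_dir.
    """

    def to_data_uri(match: re.Match) -> str:
        alt, target = match.group(1), match.group(2)
        # Leave remote URLs and already-embedded images untouched.
        if target.startswith(("http://", "https://", "data:")):
            return match.group(0)
        path = image_dir / target
        # Leave references to missing files untouched as well.
        if not path.is_file():
            return match.group(0)
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        payload = base64.b64encode(path.read_bytes()).decode("ascii")
        return f"![{alt}](data:{mime};base64,{payload})"

    return IMAGE_REF.sub(to_data_uri, markdown)
```

For example, with a hypothetical output folder `out/` containing `doc.md` and its images, `Path("out/doc_embedded.md").write_text(embed_images(Path("out/doc.md").read_text(), Path("out")))` would produce a single self-contained file. Embedded images make the Markdown file much larger, and some renderers do not display data URIs, so it is worth checking the target viewer first.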