VikParuchuri / marker

Convert PDF to markdown quickly with high accuracy
https://www.datalab.to
GNU General Public License v3.0
14.15k stars 720 forks

Hi, how can I convert a PDF so that each page image has its own output md? #139

Open MonolithFoundation opened 1 month ago

MonolithFoundation commented 1 month ago

Which literally means: convert each page to a single piece of text, so that the image and text pair can be used to train a model.
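To illustrate the kind of output being asked for, here is a minimal sketch of collecting image and text pairs once per-page files exist. It assumes each page has already been rendered to an image and converted to markdown, and that both files share a name (for example `page_0001.png` and `page_0001.md`). That layout, the file names, and the `build_pairs` helper are all hypothetical and not part of marker.

```python
import json
from pathlib import Path


def build_pairs(pages_dir, manifest_path):
    """Pair each page image with the markdown file that shares its stem.

    Hypothetical helper: assumes files like page_0001.png / page_0001.md
    already exist in pages_dir. It does not call marker itself.
    """
    pages_dir = Path(pages_dir)
    pairs = []
    for image_path in sorted(pages_dir.glob("*.png")):
        md_path = image_path.with_suffix(".md")
        if not md_path.exists():
            continue  # skip pages that have no converted text yet
        pairs.append({
            "image": image_path.name,
            "text": md_path.read_text(encoding="utf-8"),
        })
    # Write one JSON object per line so a training loader can stream it.
    with open(manifest_path, "w", encoding="utf-8") as f:
        for pair in pairs:
            f.write(json.dumps(pair, ensure_ascii=False) + "\n")
    return pairs
```

Calling `build_pairs("pages", "pairs.jsonl")` would then give one JSON line per page, holding the image file name and that page's text.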