DS4SD / docling

Get your documents ready for gen AI
https://ds4sd.github.io/docling
MIT License

Add LaTeX and mathpix-markdown-it as outputs #343

Open sirus20x6 opened 1 week ago

sirus20x6 commented 1 week ago

Requested feature

Add LaTeX and mathpix-markdown-it formats as available outputs.

nougat uses Mathpix Markdown, and it creates beautiful formulas because it's powered by LaTeX.

Example paper I had on hand: The Impact of Initialization on LoRA Finetuning Dynamics

Formula from the paper: [image]

The docling version, in Markdown: [image]

And this is from nougat: [image]

znsoftm commented 1 week ago

Good idea. It is necessary for researchers.

PeterStaar-IBM commented 6 days ago

@sirus20x6 If you want this functionality, please open a PR in https://github.com/DS4SD/docling-core that complements the existing export_to_markdown (see here).
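For anyone picking this up, here is a rough sketch of what such exporters might look like. It is not docling-core code: `Item`, `export_to_mmd`, and `export_to_latex` are hypothetical names made up for illustration, and the sketch assumes formulas are already available as LaTeX source and that `$$ ... $$` display-math delimiters are acceptable for the Mathpix-style Markdown output. A real PR would need to work against the document model that export_to_markdown already uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    """Hypothetical stand-in for a document element (not a docling-core type)."""
    kind: str     # "text" or "formula"
    content: str  # plain text, or LaTeX source when kind == "formula"


# Characters that must be escaped in LaTeX text mode.
_LATEX_ESCAPES = str.maketrans({
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
})


def export_to_mmd(items: List[Item]) -> str:
    """Render items as Markdown, wrapping formulas in $$ ... $$ (assumed delimiter)."""
    blocks = []
    for item in items:
        if item.kind == "formula":
            blocks.append("$$\n" + item.content + "\n$$")
        else:
            blocks.append(item.content)
    return "\n\n".join(blocks)


def export_to_latex(items: List[Item]) -> str:
    """Render items as a LaTeX body: escaped text plus \\[ ... \\] display math."""
    blocks = []
    for item in items:
        if item.kind == "formula":
            # Formula content is assumed to be valid LaTeX, so it is not escaped.
            blocks.append("\\[\n" + item.content + "\n\\]")
        else:
            blocks.append(item.content.translate(_LATEX_ESCAPES))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    doc = [Item("text", "Energy & mass"), Item("formula", "E = mc^2")]
    print(export_to_mmd(doc))
    print(export_to_latex(doc))
```

The formula text is passed through untouched in both exporters, while ordinary text is escaped only for the LaTeX output, since characters such as `&` and `_` are special there.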