PaperMage currently extracts section headings but does not extract the text that belongs to each section, even though it already produces sentences and paragraphs that could be associated with them.
Find a way to render a PDF in a natural, "hierarchical" reading order that allows us to annotate per-section metadata.
This can be done either with PaperMage plus heuristics or with a completely separate tool, such as watr-works or grobid.
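
As a rough illustration of the "PaperMage + heuristics" option, the sketch below groups paragraphs under the most recent heading to form a section tree. It assumes the blocks have already been extracted and sorted into reading order by some upstream tool; `Block`, `Section`, and `build_hierarchy` are hypothetical names made up for this example and are not part of PaperMage, watr-works, or grobid.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Block:
    """Hypothetical extracted block, already in reading order."""
    kind: str       # "heading" or "paragraph"
    text: str
    level: int = 1  # heading depth (1 = top level); ignored for paragraphs


@dataclass
class Section:
    """Hypothetical node in the section tree."""
    title: Optional[str]
    level: int
    paragraphs: List[str] = field(default_factory=list)
    children: List["Section"] = field(default_factory=list)


def build_hierarchy(blocks: List[Block]) -> Section:
    """Attach each paragraph to the nearest preceding heading."""
    root = Section(title=None, level=0)  # holds text that precedes any heading
    stack = [root]
    for block in blocks:
        if block.kind == "heading":
            # Close sections at the same or a deeper level than the new heading.
            while len(stack) > 1 and stack[-1].level >= block.level:
                stack.pop()
            section = Section(title=block.text, level=block.level)
            stack[-1].children.append(section)
            stack.append(section)
        else:
            stack[-1].paragraphs.append(block.text)
    return root


blocks = [
    Block("paragraph", "Abstract text"),
    Block("heading", "1 Introduction", level=1),
    Block("paragraph", "Intro paragraph"),
    Block("heading", "1.1 Background", level=2),
    Block("paragraph", "Background paragraph"),
    Block("heading", "2 Methods", level=1),
    Block("paragraph", "Methods paragraph"),
]
root = build_hierarchy(blocks)
```

Per-section metadata could then be attached to each `Section` node. The hard parts of the task, namely getting a reliable reading order and inferring heading levels, are assumed to be solved upstream and are not addressed by this sketch.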
Tasks: