allenai / papermage

library supporting NLP and CV research on scientific papers
https://papermage.org
Apache License 2.0

Merging documents / Standardizing protocols #55

Open kyleclo opened 1 year ago

kyleclo commented 1 year ago

Rasterizers, DocMetadataExtractors, Parsers, etc. should all emit Documents.

We should also have some sort of Doc.update() or merge() functionality to combine Documents.
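
To make the proposal concrete, here is a rough sketch of what a merge() could look like. Everything in it is an assumption for discussion: the `Document` class below is a toy stand-in (not the real papermage class), and the field names `symbols`, `metadata`, and `layers`, as well as the conflict rules, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Document:
    """Toy stand-in for discussion only; NOT the real papermage Document."""

    symbols: Optional[str] = None  # raw text, if this producer extracted any
    metadata: Dict[str, Any] = field(default_factory=dict)
    layers: Dict[str, List[Any]] = field(default_factory=dict)

    def merge(self, other: "Document") -> "Document":
        """Return a new Document combining self and other.

        Hypothetical rules: differing symbols or duplicate layer names raise;
        on a metadata key clash, the value from `other` wins.
        """
        if (
            self.symbols is not None
            and other.symbols is not None
            and self.symbols != other.symbols
        ):
            raise ValueError("Cannot merge documents with different symbols")

        clashing = set(self.layers) & set(other.layers)
        if clashing:
            raise ValueError(f"Layer(s) present in both documents: {sorted(clashing)}")

        return Document(
            symbols=self.symbols if self.symbols is not None else other.symbols,
            metadata={**self.metadata, **other.metadata},
            layers={**self.layers, **other.layers},
        )


# Each producer emits its own partial Document (values here are made up).
parsed = Document(symbols="Hello world", layers={"tokens": [(0, 5), (6, 11)]})
rasterized = Document(layers={"images": ["page-0.png"]})
meta = Document(metadata={"title": "Hello"})

doc = parsed.merge(rasterized).merge(meta)
```

Returning a new Document instead of mutating in place keeps each producer's output reusable; an in-place Doc.update() could follow the same conflict rules if mutation is preferred.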