Currently, unified-doc has a .file method that supports outputting the source content in various file formats:
null: source/original file
.txt: a file containing only the textContent of the document.
.html: HTML version of the document.
In the future, support for .pdf and .docx file outputs could be possible once the unified ecosystem matures and offers the relevant hast support.
It's worthwhile to think of a file type that works seamlessly in the unified-doc ecosystem (and the wider unified ecosystem). A first pass at this spec includes:
Stores the hast tree
Stores important file information and metadata (e.g. filename, mimeType)
Optionally stores annotations (based on the Annotation interface)
Optionally stores the source content (this would almost double the file size, but I'm not sure what the best practices are here).
???
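To make the list above more concrete, here is a rough sketch of what such a file could look like if it were stored as JSON. Everything in it is an assumption for discussion: the UniFile name, its fields, the UniAnnotation stand-in (the real Annotation interface may differ), and the serializeUni/parseUni helpers are hypothetical and not part of unified-doc. The hast types are declared locally only so the sketch is self-contained.

```typescript
// Hypothetical sketch of a ".uni" file shape. None of these names exist in unified-doc.

// Minimal hast-like node types, declared locally to keep the sketch self-contained.
interface HastText { type: 'text'; value: string }
interface HastElement {
  type: 'element';
  tagName: string;
  properties?: Record<string, unknown>;
  children: HastNode[];
}
type HastNode = HastText | HastElement;
interface HastRoot { type: 'root'; children: HastNode[] }

// Stand-in for the Annotation interface (fields are assumed for illustration).
interface UniAnnotation { id: string; start: number; end: number }

interface UniFile {
  hast: HastRoot;                 // the parsed document tree
  filename: string;               // file information and metadata
  mimeType: string;
  annotations?: UniAnnotation[];  // optional annotations
  source?: string;                // optional original content (increases file size)
}

// Serialize to a JSON string (assumes JSON as the on-disk encoding).
function serializeUni(file: UniFile): string {
  return JSON.stringify(file);
}

// Parse a JSON string and check that the required fields are present.
function parseUni(content: string): UniFile {
  const data = JSON.parse(content) as Partial<UniFile> | null;
  if (
    !data ||
    !data.hast ||
    data.hast.type !== 'root' ||
    typeof data.filename !== 'string' ||
    typeof data.mimeType !== 'string'
  ) {
    throw new Error('Invalid .uni content');
  }
  return data as UniFile;
}
```

Keeping annotations and source optional means a minimal file only carries the tree plus metadata, and the storage cost of embedding the source becomes a per-file decision.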
In the unified-doc ecosystem, if we have this file type specced, we can support it natively by simply reading the hast content, which is interoperable with unified-doc APIs. This would let us search, annotate, and convert files easily, without the need for format-specific parsers and compilers.
A backend system and data store can choose to store files in .uni format, and bulk-process files of varying types with unified document APIs.
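As an illustration of why a stored tree removes the need for per-format parsers, the sketch below searches a batch of stored documents by walking their trees directly. It is a simplified assumption of how this might work, not the unified-doc search API: StoredDoc, textContent, and searchDocs are hypothetical names, and the search is a plain substring match over the concatenated text nodes.

```typescript
// Hypothetical sketch: searching stored trees without any format-specific parser.

// A loose tree node shape that covers hast root, element, and text nodes.
type TreeNode = { type: string; value?: string; children?: TreeNode[] };

// A stored document: the tree plus the filename it came from (assumed shape).
interface StoredDoc { filename: string; hast: TreeNode }

interface SearchHit { filename: string; offsets: number[] }

// Concatenate the values of all text nodes, depth first.
function textContent(node: TreeNode): string {
  if (node.type === 'text') return node.value || '';
  return (node.children || []).map(textContent).join('');
}

// Return, per document, the character offsets at which the query occurs.
function searchDocs(docs: StoredDoc[], query: string): SearchHit[] {
  if (query === '') return []; // an empty query would match at every offset
  const hits: SearchHit[] = [];
  for (const doc of docs) {
    const text = textContent(doc.hast);
    const offsets: number[] = [];
    let index = text.indexOf(query);
    while (index !== -1) {
      offsets.push(index);
      index = text.indexOf(query, index + 1);
    }
    if (offsets.length > 0) hits.push({ filename: doc.filename, offsets });
  }
  return hits;
}
```

Because the offsets are relative to the text content, the same code works regardless of whether a document started out as Markdown, HTML, or anything else that was parsed into a tree before being stored.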