.uni file type and spec

Currently, unified-doc has a .file method that supports outputting the source content in various file formats:

null: source/original file
.txt: a file containing only the textContent of the document.
.html: HTML version of the document.

In the future, .pdf and .docx outputs could become possible once the unified ecosystem matures and offers hast-based support for those formats.

It's worthwhile to think about a file type that works seamlessly in unified-doc (and in the wider unified ecosystem). A first pass at the spec:

Stores the hast tree
Stores important file information and metadata (e.g. filename, mimeType)
Optionally stores annotations (based on the Annotation interface)
Optionally stores the source content (this would almost double the file size, and I'm not sure what the best practices are here).
???
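
To make the list above more concrete, here is a minimal sketch of what a .uni payload could look like if it were stored as JSON. Everything in it is an assumption for discussion, not a settled spec: the `UniFile` shape, the `version` field, and the `serializeUni`/`parseUni` helpers are hypothetical names, the hast node types are simplified stand-ins for the real hast definitions, and the annotation shape is a guess at the minimum the Annotation interface would need.

```typescript
// Simplified hast-like node types, declared inline so the sketch is self-contained.
// Real hast also has comment and doctype nodes, which are omitted here.
interface HastText { type: "text"; value: string }
interface HastElement {
  type: "element";
  tagName: string;
  properties?: Record<string, unknown>;
  children: HastNode[];
}
interface HastRoot { type: "root"; children: HastNode[] }
type HastNode = HastText | HastElement;

// Assumed minimal annotation shape; the actual Annotation interface may differ.
interface UniAnnotation { id: string; start: number; end: number }

// Hypothetical .uni payload covering the items in the list above.
interface UniFile {
  version: 1;                     // assumed field, to allow the spec to evolve
  filename: string;               // file information
  mimeType: string;               // file information
  hast: HastRoot;                 // the hast tree
  annotations?: UniAnnotation[];  // optional annotations
  source?: string;                // optional original content
}

// Serialize a payload to a JSON string.
function serializeUni(file: UniFile): string {
  return JSON.stringify(file);
}

// Parse a JSON string and check the required fields before trusting it.
function parseUni(raw: string): UniFile {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Invalid .uni payload: expected a JSON object");
  }
  const candidate = data as Partial<UniFile>;
  if (candidate.version !== 1) {
    throw new Error("Invalid .uni payload: unsupported version");
  }
  if (typeof candidate.filename !== "string" || typeof candidate.mimeType !== "string") {
    throw new Error("Invalid .uni payload: missing filename or mimeType");
  }
  if (!candidate.hast || candidate.hast.type !== "root" || !Array.isArray(candidate.hast.children)) {
    throw new Error("Invalid .uni payload: missing hast root");
  }
  return candidate as UniFile;
}

// Example payload for a small markdown note, without the optional source content.
const example: UniFile = {
  version: 1,
  filename: "note.md",
  mimeType: "text/markdown",
  hast: {
    type: "root",
    children: [
      { type: "element", tagName: "p", children: [{ type: "text", value: "hello" }] },
    ],
  },
  annotations: [{ id: "a1", start: 0, end: 5 }],
};
```

Leaving `source` optional keeps the file-size question open: a writer can omit it when the hast tree alone is enough, and include it when the original bytes must be recoverable.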

In the unified-doc ecosystem, if we have this file type specced, we can support it natively by simply reading the hast content, which is interoperable with unified-doc APIs. That would let us search, annotate, and convert files very easily, without needing format-specific parsers and compilers.
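
As a rough illustration of "simply reading the hast content", the sketch below takes a stored tree, derives its text content, and searches it without running any parser. It assumes a JSON payload with a `hast` field, which is a hypothetical layout and not part of any spec yet. The `textContent` and `search` helpers are written inline for illustration and are not unified-doc APIs, and the node types are simplified stand-ins for the real hast definitions.

```typescript
// Simplified hast-like nodes; real hast also includes comment and doctype nodes.
interface TextNode { type: "text"; value: string }
interface ParentNode {
  type: "root" | "element";
  tagName?: string;
  children: TreeNode[];
}
type TreeNode = TextNode | ParentNode;

// Concatenate the values of all text nodes in document order.
function textContent(node: TreeNode): string {
  if (node.type === "text") {
    return node.value;
  }
  return node.children.map(textContent).join("");
}

// Find every non-overlapping occurrence of `query`, as offsets into the text content.
function search(text: string, query: string): Array<{ start: number; end: number }> {
  const hits: Array<{ start: number; end: number }> = [];
  if (query.length === 0) {
    return hits;
  }
  let from = text.indexOf(query);
  while (from !== -1) {
    hits.push({ start: from, end: from + query.length });
    from = text.indexOf(query, from + query.length);
  }
  return hits;
}

// A stored payload (hypothetical layout): <p>one two <strong>two</strong></p>
const raw = JSON.stringify({
  filename: "example.html",
  mimeType: "text/html",
  hast: {
    type: "root",
    children: [
      {
        type: "element",
        tagName: "p",
        children: [
          { type: "text", value: "one two " },
          { type: "element", tagName: "strong", children: [{ type: "text", value: "two" }] },
        ],
      },
    ],
  },
});

// No HTML or markdown parser is involved: the tree is read straight from the file.
const stored = JSON.parse(raw) as { hast: ParentNode };
const text = textContent(stored.hast);
const hits = search(text, "two");
```

Because the offsets are relative to the text content, they have the same `start`/`end` form that an annotation would need, regardless of whether the original file was HTML, markdown, or something else.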

A backend system or data store could then choose to store files in .uni format and bulk-process files of varying original types with the unified-doc APIs.
