usnistgov / oscal-tools

Tools for the OSCAL project
https://pages.nist.gov/oscal-tools/

Demo/application concept: OSCAL HTML Reducer #68

Status: Closed (wendellpiez closed this 1 year ago)

wendellpiez commented 1 year ago

Users who have HTML may find it useful to reduce their content to the OSCAL-compatible subset of HTML.

This necessarily implies data loss, if by 'data' we include the HTML markup that would be washed out. But all the page content (the tagged copy) could be preserved, so that any page without dynamic transclusions (that is, any page whose copy is all on the page) could be 'reduced' this way, leaving the copy ordered, intact, and for the most part legible.

Basically, headers, simple list items, paragraphs and simple tables would be kept. Everything else would be either cleared (dropping the tags but leaving the contents) or cast into one of these basic (allowed) elements.
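To make the keep / clear / cast idea concrete, here is a minimal sketch in Python using only the standard library. It is not how the demo is implemented. The `KEEP`, `CAST` and `DROP` sets are assumptions made for illustration (the actual OSCAL prose subset may allow more or fewer elements), and dropping the contents of `script` and `style` is an extra assumption, on the grounds that they are not page copy.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allow-list: headers, paragraphs, simple lists, simple tables.
KEEP = {"h1", "h2", "h3", "h4", "h5", "h6", "p",
        "ul", "ol", "li", "table", "tr", "th", "td"}
# Hypothetical casts of disallowed elements into allowed ones.
CAST = {"pre": "p", "dt": "p", "dd": "p"}
# Assumption: these carry no copy, so their contents are dropped too.
DROP = {"script", "style"}


class Reducer(HTMLParser):
    """Keeps allowed tags (without attributes), unwraps everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a DROP element

    def _target(self, tag):
        # Returns the tag to emit, or None if the element is unwrapped.
        return tag if tag in KEEP else CAST.get(tag)

    def handle_starttag(self, tag, attrs):
        if tag in DROP:
            self.skip_depth += 1
            return
        target = self._target(tag)
        if target and not self.skip_depth:
            self.out.append(f"<{target}>")  # attributes are washed out

    def handle_endtag(self, tag):
        if tag in DROP:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        target = self._target(tag)
        if target and not self.skip_depth:
            self.out.append(f"</{target}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def reduce_html(source: str) -> str:
    parser = Reducer()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)


sample = ('<div class="x"><h1 id="t">Title</h1>'
          '<p>Some <span style="c">copy</span> &amp; more</p>'
          '<script>alert(1)</script></div>')
print(reduce_html(sample))
# <h1>Title</h1><p>Some copy &amp; more</p>
```

The sketch does not check that the result is well formed (for example, a cast could produce a `p` inside a `p`), so a real tool would need a validation or repair pass after this step.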

Users would have to decide whether the outputs were sufficiently faithful to the source for reuse in OSCAL.

This could work with either a literal content paste (into a form field) or a local file load.

The result would be valid in an OSCAL context. With a local Save As, a user could keep this for reuse (copying into OSCAL).

An optional 'paranoid' mode could show a traceback of the modifications made, at whatever level of detail is wanted.
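One way such a traceback might look, as a rough Python sketch: a parser that records, for every start tag, whether the element would be unwrapped or would lose its attributes, along with its position in the source. The names `ALLOWED`, `AuditingParser` and `audit` are hypothetical, and the allow-list is an assumption, not the actual OSCAL subset.

```python
from html.parser import HTMLParser

# Assumed allow-list, for illustration only.
ALLOWED = {"h1", "h2", "h3", "h4", "h5", "h6", "p",
           "ul", "ol", "li", "table", "tr", "th", "td"}


class AuditingParser(HTMLParser):
    """Collects a log of the changes a reducer would make."""

    def __init__(self):
        super().__init__()
        self.log = []  # entries: (line, column, action, detail)

    def handle_starttag(self, tag, attrs):
        line, col = self.getpos()
        if tag not in ALLOWED:
            self.log.append((line, col, "unwrapped", tag))
        elif attrs:
            names = ",".join(name for name, _ in attrs)
            self.log.append((line, col, "stripped-attrs", f"{tag}: {names}"))


def audit(source: str):
    parser = AuditingParser()
    parser.feed(source)
    parser.close()
    return parser.log


for entry in audit('<p class="a">x <span>y</span></p>'):
    print(entry)
```

How much of this log to show (everything, only unwrapped elements, only a count) would be the "extent wanted" setting.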

(Another option could support structural induction into parts using header levels?)
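For what it's worth, induction into parts from header levels can be sketched with a simple stack. The Python below assumes the reduced content has already been flattened into `(kind, text)` pairs such as `("h2", "Scope")` or `("p", "...")`. That input shape and the dictionary layout of a part are assumptions made for illustration, not an OSCAL data model.

```python
def induce_parts(blocks):
    """Nests a flat block sequence into parts, one part per header."""
    root = {"level": 0, "title": None, "prose": [], "parts": []}
    stack = [root]  # stack[-1] is the part currently being filled
    for kind, text in blocks:
        if kind[0] == "h" and kind[1:].isdigit():
            level = int(kind[1:])
            # Close every open part at the same or a deeper level.
            while stack[-1]["level"] >= level:
                stack.pop()
            part = {"level": level, "title": text, "prose": [], "parts": []}
            stack[-1]["parts"].append(part)
            stack.append(part)
        else:
            # Non-header blocks belong to the innermost open part.
            stack[-1]["prose"].append(text)
    return root


blocks = [("h1", "A"), ("p", "a1"), ("h2", "A.1"), ("p", "a11"), ("h1", "B")]
tree = induce_parts(blocks)
print([part["title"] for part in tree["parts"]])
# ['A', 'B']
```

Skipped levels (an `h3` directly under an `h1`) simply nest one step deeper here. Whether that is the right behavior would be a design decision.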

A 'Markdown' mode could write out the Markdown representation instead of the XML-tagged representation, for copying out or saving.
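A Markdown mode could be a second serializer over the same reduced content. The following Python sketch covers headers, paragraphs and list items only. Tables, inline markup and escaping of Markdown-significant characters are left out, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class MarkdownWriter(HTMLParser):
    """Writes headers, paragraphs and list items as Markdown blocks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []         # finished blocks and their separators
        self.buf = []           # text of the block being collected
        self.prefix = ""        # e.g. "## " or "- "
        self.is_item = False    # current block is a list item
        self.prev_item = False  # previous written block was a list item
        self.lists = []         # stack of open "ul" / "ol"

    def _flush(self):
        text = " ".join("".join(self.buf).split())  # normalize whitespace
        if text:
            if self.parts:
                tight = self.is_item and self.prev_item
                self.parts.append("\n" if tight else "\n\n")
            self.parts.append(self.prefix + text)
            self.prev_item = self.is_item
        self.buf, self.prefix, self.is_item = [], "", False

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._flush()
            self.lists.append(tag)
        elif tag in HEADINGS:
            self._flush()
            self.prefix = "#" * int(tag[1]) + " "
        elif tag == "li":
            self._flush()
            # Indent by the marker width of each enclosing list.
            indent = "".join("   " if t == "ol" else "  "
                             for t in self.lists[:-1])
            marker = "1. " if self.lists and self.lists[-1] == "ol" else "- "
            self.prefix = indent + marker
            self.is_item = True

    def handle_endtag(self, tag):
        if tag in ("ul", "ol"):
            self._flush()
            if self.lists:
                self.lists.pop()
        elif tag in HEADINGS or tag in ("p", "li"):
            self._flush()

    def handle_data(self, data):
        self.buf.append(data)


def to_markdown(source: str) -> str:
    writer = MarkdownWriter()
    writer.feed(source)
    writer.close()
    writer._flush()  # write out any trailing text
    return "".join(writer.parts)


print(to_markdown("<h2>Scope</h2><p>Some  copy.</p>"
                  "<ul><li>one</li><li>two</li></ul>"))
```

For the input shown, this prints a `## Scope` header, the paragraph, and a two-item bulleted list, each separated by a blank line.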

NB: this approaches the same problem as usnistgov/OSCAL#547, but from the opposite direction.

wendellpiez commented 1 year ago

A conceptual demo for this is here: https://pages.nist.gov/xslt-blender/html-reducer/

The problem remains of how anyone who might want such a thing can find it.