# @ARCHIVUM ontologia/cor.hxltm.yml
# @DESCRIPTIONEM HXL Trānslātiōnem Memoriam (HXLTM)
# @LICENTIAM Dominium publicum
formatum:
# (...)
HXLTM-ASA:
__meta:
archivum_extensionem:
- .asa.hxltm.json
- .asa.hxltm.yml
normam:
- https://hdp.etica.ai/hxltm/archivum/#HXLTM-ASA
descriptionem: |
_[eng-Latn]
The HXLTM-ASA is a not strictly documented Abstract Syntax Tree
of a data conversion operation.
This format, unlike the HXLTM permanent storage, is not
meant to be used by end users. In fact, JSON (or other
formats, like YAML) is more a tool for users debugging the initial
reference implementation hxltmcli, or for developers using JSON
as a more advanced input than the end-user permanent storage.
Warning: The HXLTM-ASA is not meant to be a strictly documented format
even if HXLTM eventually gets used by a large public. If necessary,
a special format could be created, but this would require feedback
from the community or some work already done by implementers.
[eng-Latn]_
Trivia:
- abstractum, https://en.wiktionary.org/wiki/abstractus#Latin
- syntaxim, https://en.wiktionary.org/wiki/syntaxis#Latin
- arborem, https://en.wiktionary.org/wiki/arbor#Latin
- conceptum de Abstractum Syntaxim Arborem
- https://www.wikidata.org/wiki/Q127380
nomen:
eng-Latn: 'HXLTM Abstractum Syntaxim Arborem'
situs_interretialis:
referens_officinale:
- https://hdp.etica.ai/hxltm
- https://github.com/EticaAI/HXL-Data-Science-file-formats/issues/223
- https://github.com/EticaAI/HXL-Data-Science-file-formats/labels/HXLTM
The idea of creating a format that uses HXL to store both translation memories (not just the XLIFF format) and glossaries, in particular terminology, is hard. The difficulty lies not so much in the code implementation as in the complexity of the problem it tries to abstract.
Even if it is mostly for internal usage (i.e. not strictly documented for external use), instead of directly 'converting' HXLated data (aka CSVs) to other formats (in particular the XML ones), we're already drafting what could be called an Abstract Syntax Tree (https://en.wikipedia.org/wiki/Abstract_syntax_tree). It can be a simple one, but at least we're not passing raw CSV pointers to converters.
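As a rough illustration of the difference between handing converters raw CSV positions and handing them a tree, here is a minimal Python sketch. It is not the actual HXLTM-ASA layout (which is intentionally not strictly documented): the sample rows, the `build_tree` helper, and the `concepts`/`terms` keys are all hypothetical names chosen for this example, and the hashtags are only illustrative.

```python
import csv
import io

# Hypothetical HXLated CSV: the second row carries the HXL hashtags.
# Column headers, hashtags, and values are illustrative assumptions.
RAW = """id,English,Portuguese
#item+conceptum+codicem,#item+rem+i_eng+is_latn,#item+rem+i_por+is_latn
C1,hospital,hospital
C2,water,água
"""


def build_tree(text):
    """Turn HXLated CSV text into a small nested structure.

    Hypothetical helper: the output keys ("concepts", "id", "terms")
    are assumptions, not the real HXLTM-ASA schema.
    """
    rows = list(csv.reader(io.StringIO(text)))
    hashtags = rows[1]  # row 0 is the human-readable header
    concepts = []
    for row in rows[2:]:
        concept = {"id": None, "terms": {}}
        for tag, cell in zip(hashtags, row):
            if "+i_" in tag:
                # "+i_eng" style attribute -> language code "eng"
                lang = tag.split("+i_")[1].split("+")[0]
                concept["terms"][lang] = cell
            else:
                concept["id"] = cell
        concepts.append(concept)
    return {"concepts": concepts}


tree = build_tree(RAW)
```

With a structure like this, an exporter for an XML format would iterate over `tree["concepts"]` and look terms up by language code, instead of needing to know which CSV column index holds which language.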
Comparison to other linguistic Abstract Syntax
See also:
Abstract Syntax as Interlingua: Scaling Up the Grammatical Framework from Controlled Languages to Robust Pipelines
It turns out that ideas about abstract linguistic content have existed for a long time, but what could be called 'HXLTM ASA' works more at the container level (where it is useful for converting between file types) than at the term level (where it would be used to understand what a term is in order to translate concepts).
So even if HXLTM ASA becomes usable by external tools, we will not try to micromanage it. One thing we could do, however, is intentionally make it easy for others to convert to whatever format they want, and not be strict about what HXLTM ASA is, so that anyone who wants to inject even more detail at the term level can do so.
On Grammatical Framework
The Grammatical Framework (which is cited a lot in Abstract Syntax as Interlingua) seems to be the state of the art in how to build a way to understand sentences in different natural languages. I, Rocha, do not plan to go deep on this, since the short- to medium-term interest is more about how to store terminology and translation memories. Given that even the minimal implementation to support TBX export can already take time, the best I could do is make it easier (if interest exists years later) for people to use HXLTM dialects to store linguistic data while still having decent portability to other data formats.