mbakeranalecta / sam

Semantic Authoring Markdown

encoding vs language attributes #122

Closed mbakeranalecta closed 7 years ago

mbakeranalecta commented 7 years ago

We have attributes for human languages, each written as a standard language code preceded by an exclamation point:

(!fr)

We also have attributes for computer or data languages and encodings, each written as just a language name:

(python)

The current language spec calls the former a language attribute and the latter an encoding attribute. Colloquially, they are both called languages.
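To make the two forms concrete, here is a minimal sketch of how a reader of the markup might tell them apart. It is not taken from the SAM parser: the regular expression, the `Attribute` type, and the `classify` function are hypothetical names invented for illustration, and the set of characters allowed in a name is an assumption.

```python
import re
from typing import NamedTuple

# Hypothetical pattern for the two forms shown above:
# "(!fr)" for a human language, "(python)" for an encoding.
# The characters permitted in a name are an assumption.
ATTRIBUTE = re.compile(r"\((?P<bang>!)?(?P<name>[A-Za-z][\w.+-]*)\)")


class Attribute(NamedTuple):
    kind: str  # "language" or "encoding", following the spec's terms
    name: str


def classify(text: str) -> Attribute:
    """Return the kind and name of a single parenthesized attribute."""
    match = ATTRIBUTE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a recognized attribute: {text!r}")
    kind = "language" if match.group("bang") else "encoding"
    return Attribute(kind, match.group("name"))


print(classify("(!fr)"))
print(classify("(python)"))
```

The only thing separating the two kinds in this sketch is the optional `!`, which is the distinction under discussion.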

There is no structure in SAM that can take both of them, so there is no reason they have to be separate, have separate syntax, or be called by different names.

However, human languages do have standardized names and computer languages do not. We can enforce the use of standard human language codes if we keep them separate. But should we? That enforcement can be done at the application layer if it is wanted. It could even be done at the schema layer. Is there any real reason to do it at the language layer?
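As a rough illustration of enforcement at the application layer, the check could be as small as the sketch below. Everything in it is an assumption: `check_language` is a hypothetical helper, the shape test only accepts two- or three-letter lowercase codes, and `KNOWN_CODES` is a tiny sample rather than the real ISO 639 list.

```python
import re

# Shape check only: two or three lowercase letters.
# Passing this does not prove the code is actually assigned.
LANGUAGE_CODE_SHAPE = re.compile(r"[a-z]{2,3}")

# Illustrative sample, NOT a complete list of ISO 639 codes.
KNOWN_CODES = {"en", "fr", "de", "es", "ja"}


def check_language(code: str, strict: bool = False) -> bool:
    """Accept a code by shape, or by membership in KNOWN_CODES when strict."""
    if LANGUAGE_CODE_SHAPE.fullmatch(code) is None:
        return False
    if strict:
        return code in KNOWN_CODES
    return True
```

An application that wants strict codes would call `check_language(code, strict=True)` on whatever the parser hands it; one that does not care would skip the call, and the language layer would not need to know either way.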

If not, we can drop the distinction and call both kinds of attribute a language. That would also reduce the overloading of the ! while leaving it available for other uses in the future.

mbakeranalecta commented 7 years ago

An argument for requiring ISO 639 language codes for language is that it allows us to output xml:lang attributes in serialization.

But as long as the human language codes and the computer language names don't overlap, this is not an impediment to dropping the ! prefix.
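A sketch of what that serialization might look like without the prefix, using Python's standard `xml.etree.ElementTree`. The helper names, the `encoding` attribute name on the output element, and the sample code set are all assumptions made for illustration; this is not the SAM serializer.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Set

# Namespace URI that ElementTree serializes with the reserved "xml" prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def attributes_for(name: str, human_codes: Set[str]) -> dict:
    """Decide, with no '!' prefix, which kind of attribute a name is."""
    if name in human_codes:
        return {"language": name}
    return {"encoding": name}


def to_xml(tag: str, text: str,
           language: Optional[str] = None,
           encoding: Optional[str] = None) -> str:
    """Serialize one element, emitting xml:lang for a human language."""
    element = ET.Element(tag)
    element.text = text
    if language is not None:
        element.set(f"{{{XML_NS}}}lang", language)
    if encoding is not None:
        element.set("encoding", encoding)  # hypothetical attribute name
    return ET.tostring(element, encoding="unicode")


codes = {"en", "fr"}  # illustrative sample, not the ISO 639 list
print(to_xml("p", "Bonjour", **attributes_for("fr", codes)))
print(to_xml("codeblock", "x = 1", **attributes_for("python", codes)))
```

In this sketch a name that appeared in both sets would always be treated as a human language, which is exactly the overlap case that would have to be ruled out.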

mbakeranalecta commented 7 years ago

I can't persuade myself that there is sufficient reason to make this change. Having a prefix to denote the ISO 639 language code makes it unambiguous, and the meaning of the codes might not be clear in context without some specific flag.