Closed by Klortho 8 years ago
For optimal re-use, the recommendation should be to not use any named entities.
If internal named entities are needed to make up for missing Unicode characters then they obviously have to be there, but with the caveat that they may not be accessible in some contexts (e.g. web browsers using their native XML parser).
We have made it an error if any named entities are used (other than &lt;, &gt;, &apos;, &quot;, and &amp;). I think that's probably best, and easiest for producers and consumers, rather than trying to deal with the technicalities of internal subsets.
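To make that rule concrete, here is a minimal Python sketch of how such a check might look. It is not the actual JATS4R validator; the function name `find_disallowed_entities` is made up for illustration. It scans the raw text with a regular expression, so it would also flag references inside comments or CDATA sections, and it only recognises ASCII entity names.

```python
import re

# The five entities predefined by the XML specification.
PREDEFINED = {"lt", "gt", "apos", "quot", "amp"}

# A named entity reference: "&", an XML-style name, then ";".
# Numeric references such as &#169; or &#xA9; start with "#" and do not match.
ENTITY_REF = re.compile(r"&([A-Za-z_:][A-Za-z0-9_.:-]*);")


def find_disallowed_entities(xml_text):
    """Return named entity references other than the five predefined ones.

    Hypothetical helper: works on raw text, not on a parsed document.
    """
    return [name for name in ENTITY_REF.findall(xml_text) if name not in PREDEFINED]
```

Calling it on `<p>&copy; 2015 A &amp; B</p>` would report `copy` and leave `amp` alone.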
See also issue #1, which is specifically about character entity references.
More generally, there are lots of different kinds of entities that can be used in XML. I think it would be nice if we could make it an error to use any external entity in any JATS4R document. Let me explain by example, including the CERs already discussed.
Character entity references
These are the things like &copy; that are defined in the JATS DTD. I'm proposing we make these a warning.
Internal entities
These are defined in the internal subset of the document type. For example
I think these should be allowed, because any good XML parser will not have any problem with them. Although, I don't know if they work in the browsers' parsers -- we should check that.
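As a quick sanity check of that claim for one parser, the sketch below feeds a document with an internal entity to Python's `xml.etree.ElementTree`, which is built on expat. The entity name `publisher` and its value are invented for the example. This only shows what one non-browser parser does; it says nothing about the browsers' parsers.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "publisher" is declared in the internal subset.
DOC = """<!DOCTYPE article [
  <!ENTITY publisher "Example Press">
]>
<article><p>Published by &publisher;.</p></article>"""


def paragraph_text(xml_text):
    """Parse the document and return the text of its first <p> child."""
    root = ET.fromstring(xml_text)
    return root.find("p").text


# The parser expands &publisher; using the declaration in the internal subset.
print(paragraph_text(DOC))
```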
External entities
I think we should disallow the use of any external entities. For example:
TBD.
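One way a checker might spot external entities is to look at the declarations in the internal subset, since an external entity has to be declared there (or in the DTD) before it can be referenced. The following is a rough sketch using the `EntityDeclHandler` callback of Python's `xml.parsers.expat`; the function name `external_entities` and the sample entity names are assumptions for illustration. It only sees declarations in the internal subset, not ones in an external DTD.

```python
import xml.parsers.expat

# Hypothetical document: one internal entity and one external entity.
DOC = """<!DOCTYPE article [
  <!ENTITY boiler "Standard text">
  <!ENTITY chapter1 SYSTEM "chapter1.xml">
]>
<article><p>&boiler;</p></article>"""


def external_entities(xml_text):
    """Return names of entities declared with an external identifier."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal entities carry a replacement value; external ones have
        # value None and a system identifier instead.
        if value is None and system_id is not None:
            found.append(name)

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return found


print(external_entities(DOC))
```

A validator could then raise an error whenever the returned list is non-empty.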