Open tcatapano opened 5 years ago
You mean the XML "alternative format", right? The primary file is an XHTML. Anyway ...
Pre-upload validation is not a problem, but we need to define the schema first. GG XML intentionally has no schema so we can easily add new features or mark new details. Keeping it valid against a closed-world schema would require filtering against a positive list of elements, and would thus withhold details from clients that we actually have annotated. I'd like to propose we have one stripped-down, schema-bound XML, and one that has all the details without having to await schema evolution. The former could as well be TaxPub, once we have that finished. Just a thought ... what do you think?
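To make the "positive list" trade-off concrete, here is a minimal sketch of what such filtering could look like. The element names in `ALLOWED` and the sample document are hypothetical and are not taken from GG XML or TaxPub; a real schema would define its own element set.

```python
import xml.etree.ElementTree as ET

# Hypothetical positive list; a real schema would define the actual element set.
ALLOWED = {"document", "treatment", "paragraph"}


def strip_unlisted(elem):
    """Recursively remove child elements whose tag is not in ALLOWED.

    Removed elements take their text and descendants with them, which is
    exactly the loss of annotated detail discussed above.
    """
    for child in list(elem):
        if child.tag in ALLOWED:
            strip_unlisted(child)
        else:
            elem.remove(child)
    return elem


# Hypothetical sample: <newDetail> stands in for a newly annotated feature.
SAMPLE = (
    "<document><treatment>"
    "<paragraph>Text</paragraph>"
    "<newDetail>extra</newDetail>"
    "</treatment></document>"
)

root = strip_unlisted(ET.fromstring(SAMPLE))
filtered = ET.tostring(root, encoding="unicode")
```

In this sketch the `newDetail` element disappears from the schema-bound output until the positive list is extended, which is the delay the two-file proposal is meant to avoid.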
@gsautter: I've validated the HTML samples provided at https://github.com/plazi/stable-treatment-html against XHTML 1.0 (Strict), https://www.w3.org/TR/xhtml1/. It's a small sample, but it does suggest that the current transformation will output compliant XHTML. The only thing that needs to be done is to add the DOCTYPE declaration at the top of the file. Use:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
I still aim to spec the JATS/TaxPub to be used as the deposition file format, but adding the DOCTYPE allows us to minimally meet the requirement of depositing a standards-compliant file.
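As a rough illustration of that step, the sketch below prepends the Strict DOCTYPE to an already serialized XHTML string. The function name `add_doctype` is made up for this example, and it assumes the transformation output is available as a string; the actual export code may handle this differently.

```python
# DOCTYPE text as quoted in the comment above.
XHTML_STRICT_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
)


def add_doctype(xhtml: str) -> str:
    """Return xhtml with the Strict DOCTYPE added, unless one is already present.

    If the text starts with an XML declaration, the DOCTYPE goes right after
    it, because the declaration has to stay first in the file.
    """
    if "<!DOCTYPE" in xhtml:
        return xhtml  # leave any existing DOCTYPE untouched
    if xhtml.lstrip().startswith("<?xml"):
        decl_end = xhtml.index("?>") + 2
        return (
            xhtml[:decl_end]
            + "\n"
            + XHTML_STRICT_DOCTYPE
            + xhtml[decl_end:].lstrip()
        )
    return XHTML_STRICT_DOCTYPE + xhtml
```

Calling `add_doctype` twice on the same text leaves it unchanged after the first call, so it should be safe to apply to files that were already processed.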
Started adding <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
now ... do we require strict?
No, we don't require Strict, but since the samples validated against it, I figured we'd go with it since it's slightly preferable to Transitional. I'd like to see how a larger sample does against Strict, however.
Here's the last example, with DOCTYPE declaration and the reference across the top: https://sandbox.zenodo.org/record/361030
Using the online validator at https://validator.w3.org/ reveals one minor, but easily fixed, error:
Missing xmlns attribute for element html. The value should be: http://www.w3.org/1999/xhtml
Thanks, check out https://sandbox.zenodo.org/record/362646
That gives an error because a namespace prefix has been declared but not used. Just need to either remove the prefix in the namespace declaration or use the prefix for all the HTML elements in the file. I vote for just removing the prefix from the declaration.
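For a quick local check before uploading, something like the sketch below could catch both namespace problems seen here. It is not a replacement for the W3C validator: it only tests whether the root `html` element actually ends up in the XHTML namespace. The function name and sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def root_in_xhtml_namespace(xhtml: str) -> bool:
    """True if the root element is <html> in the XHTML namespace.

    ElementTree expands namespaces into the tag name, so a prefix that is
    declared but never used leaves the root as plain 'html'.
    """
    root = ET.fromstring(xhtml)
    return root.tag == "{" + XHTML_NS + "}html"


# Missing xmlns attribute (the first validator error).
no_namespace = "<html><body/></html>"
# Prefix declared but not used on the elements (the second error).
unused_prefix = '<html xmlns:xhtml="http://www.w3.org/1999/xhtml"><body/></html>'
# Default namespace declared without a prefix (the suggested fix).
default_namespace = '<html xmlns="http://www.w3.org/1999/xhtml"><body/></html>'

results = [
    root_in_xhtml_namespace(no_namespace),
    root_in_xhtml_namespace(unused_prefix),
    root_in_xhtml_namespace(default_namespace),
]
```

Only the third sample, with the unprefixed `xmlns` declaration, passes the check.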
OK, got it ... this one validates now: https://sandbox.zenodo.org/record/362648
Whatever standard published schema is used for deposition upload xml, it should be: