gsautter / goldengate-qualitycontrol

Data Quality Control and Data Quality Assurance related tools for the GoldenGATE markup system.

Suggestion: structure of Taxonomic Key treatments #77

Closed brokentool closed 4 years ago

brokentool commented 4 years ago

in: JNATHIST.53.21.22.1369-1384 pages 13 and 14

This type of error (broken inner structure) is not extremely common in keys, but still. In a taxonomic key treatment, all SSSections should be "key" by default (except the Nomenclature in the beginning). The one in the example has three types of SSSection, not applicable to keys.
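To make the suggested rule concrete, here is a minimal sketch of the check, not how TreatmentStructurerOnline or any GoldenGATE gizmo actually works. The function name `check_key_treatment` and the lowercase type strings (`"nomenclature"`, `"key"`, `"description"`) are assumptions made for illustration only.

```python
def check_key_treatment(section_types):
    """Return (index, type) pairs that break the suggested rule.

    Assumed rule: in a taxonomic key treatment, the first subSubSection
    may be "nomenclature"; every other subSubSection must be "key".
    The type strings are hypothetical stand-ins for the real markup values.
    """
    problems = []
    for index, section_type in enumerate(section_types):
        if section_type == "key":
            continue  # always allowed in a key treatment
        if index == 0 and section_type == "nomenclature":
            continue  # leading nomenclature is the one exception
        problems.append((index, section_type))
    return problems
```

An empty result would mean the treatment follows the rule; anything else lists the positions that would need a second look.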

gsautter commented 4 years ago

The only error I see in that key is that the "nomenclature" subSubSection doesn't include the leading "A" of the heading ... which is a pretty clear-cut case of a broken treatment structure ... hard to tell how it ended up this way, but again, the batch gizmos work paragraph-wise, so this is highly unlikely to be the result of automated markup.

Apparently, I'll have to have a chat with the Porto Alegre folks about what the markup should be like ...

gsautter commented 4 years ago

Looking even closer, the key in question was most likely not marked by the respective gizmo, as in the latter case the key steps would be individual paragraphs ... On top of that, this key is lacking the usual numbering on the steps, alternating with dashes on the second leads, and only has dashes on the individual steps instead ... the key markup gizmo wouldn't even find this one ... I tend to think we are facing an editorial issue here as well as a processing one ... will take care of it when I am in Porto Alegre.
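Purely as an illustration of the numbering convention described above, and not the actual logic of the key markup gizmo, a rough heuristic could look like the sketch below. The names `classify_lead` and `looks_like_numbered_key` and both regular expressions are assumptions; real keys use more layouts than this covers.

```python
import re

# Hypothetical patterns: a first lead starts with a step number,
# a second lead starts with a hyphen, en dash, or em dash.
NUMBERED_LEAD = re.compile(r"^\s*\d+[.)]?\s+\S")
DASH_LEAD = re.compile(r"^\s*[-\u2013\u2014]\s*\S")


def classify_lead(paragraph):
    """Label one paragraph as 'numbered', 'dash', or 'other'."""
    if NUMBERED_LEAD.match(paragraph):
        return "numbered"
    if DASH_LEAD.match(paragraph):
        return "dash"
    return "other"


def looks_like_numbered_key(paragraphs):
    """True only if both numbered and dash-led paragraphs occur."""
    kinds = [classify_lead(p) for p in paragraphs]
    return "numbered" in kinds and "dash" in kinds
```

Under this heuristic, a key whose steps carry only dashes would not be recognized, which matches the situation described above.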

brokentool commented 4 years ago

> The only error I see in that key is that the "nomenclature" subSubSection

I think there were also Description subSubSections in the key, and more Nomenclature ones in the middle. This is according to XML view --> Run Analyzer --> TreatmentStructurerOnline. I could be wrong, I admit, but I distinctly remember seeing these this morning.

gsautter commented 4 years ago

There was indeed another error ... the last subSubSection didn't include the terminal period ... after I fixed that, the "Broken Inner Structure" error went away. Anyway, all of these are hallmarks of slightly botched manual markup, so it's something for me to sort out with the people, not some misbehavior I could fix in a gizmo.

brokentool commented 4 years ago

a different case here: JNATHIST.52.17-18.1079-1094.pdf.imf page 12

How should I tag the paragraph below the title (currently tagged as Discussion, because I could not think of anything else)? And the title itself, is it still Nomenclature?

gsautter commented 4 years ago

Are you sure JNATHIST.52.17-18.1079-1094.pdf.imf is the right document name? Was trying to get the UUID via the stats so I could take a look, but cannot seem to find it ... maybe just post a UUID, or simply a small little screenshot ...

brokentool commented 4 years ago

yea my bad, it's JNATHIST.53.17-18.1079-1094.pdf.imf

which one is the UUID?

brokentool commented 4 years ago

one more weird key here. is this still manually created? in: zootaxa.4743.2.9

[Screenshot: scrn25]

gsautter commented 4 years ago

About the last one, I suggest you run "Tools > Mark Taxonomic Keys" in the main window ... the paragraph structure looks as though this key should be tagged pretty nicely.

gsautter commented 4 years ago

JNATHIST.53.17-18.1079-1094.pdf.imf was really messed up ... some caption running across multiple pages ... I cleaned it up and did the QC afterwards ... looks like the underlying template needs a bit of work ...