plazi / GoldenGATE-Imagine

A GUI Tool For Freeing Text and Data from PDF Documents

import of reference via DOI: remove <i> in article title? #38

Open myrmoteras opened 1 year ago

myrmoteras commented 1 year ago

Is it possible to remove the <i> tags in the title of a reference inserted via the CrossRef search?

E.g. "Description of a new species of <i>Pinnotheres</i>, and redescription of <i>P. novaezelandiae</i> (Brachyura: Pinnotheridae)", via http://dx.doi.org/10.1080/03014223.1983.10423904

I fixed it here: FFDCFFE9F443B620FF9EE619FFBBFFBE

gsautter commented 1 year ago

I think we should do this, yes ... you got this metadata via the "Search" button in the GGI metadata dialog, I take it? Just so I know where to add the tag stripping ...

gsautter commented 1 year ago

The respective treatment & article (for testing): https://tb.plazi.org/GgServer/html/03E58791F444B62AFF2CEE21FDC5FA8B https://tb.plazi.org/GgServer/summary/FFDCFFE9F443B620FF9EE619FFBBFFBE

myrmoteras commented 1 year ago

I got this from the "search" using the DOI in the add metadata step in GGI.

gsautter commented 1 year ago

Done, will come with the next update.

The implemented normalization also covers removal of any invisible characters, and normalization of spaces and dashes.
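
For illustration, a minimal Python sketch of this kind of title normalization. It is not the actual GGI code, which is not shown in this thread. The function name `normalize_title`, the mapping of all dash variants to a plain hyphen, and the use of Unicode category Cf as the definition of "invisible characters" are assumptions made for the example.

```python
import re
import unicodedata

# Hypothetical helper, not the GoldenGATE Imagine implementation.

# Matches simple opening/closing markup tags such as <i>, </i>, <sub>, <br/>.
_TAG = re.compile(r"</?[A-Za-z][A-Za-z0-9]*(?:\s[^<>]*)?/?>")

# Assumption: these dash-like characters are all mapped to a plain hyphen.
_DASHES = re.compile("[\u2010\u2011\u2012\u2013\u2014\u2015\u2212]")

# Any run of Unicode whitespace (including non-breaking spaces).
_SPACES = re.compile(r"\s+")


def normalize_title(title: str) -> str:
    """Strip markup tags, invisible characters, and odd spaces/dashes."""
    # 1. Remove markup tags like <i>...</i>, keeping the enclosed text.
    text = _TAG.sub("", title)
    # 2. Drop invisible format characters (Unicode category Cf),
    #    e.g. zero-width space or soft hyphen.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # 3. Normalize dash variants to a plain hyphen.
    text = _DASHES.sub("-", text)
    # 4. Collapse whitespace runs to single spaces and trim the ends.
    return _SPACES.sub(" ", text).strip()


if __name__ == "__main__":
    raw = ("Description of a new species of <i>Pinnotheres</i>, and "
           "redescription of <i>P. novaezelandiae</i> "
           "(Brachyura: Pinnotheridae)")
    print(normalize_title(raw))
```

Applied to the title from this issue, the sketch returns the text with the `<i>` and `</i>` tags removed and everything else unchanged. Stripping tags before collapsing whitespace means any double spaces left behind by a removed tag are cleaned up in the same pass.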
