mdoering / gbif-ecat

Automatically exported from code.google.com/p/gbif-ecat

Problems with diacritic marks #98

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Taxon names that differ only in the presence/absence of diacritic marks ought 
to be treated as spelling variants rather than as separate names.

Examples:
4307394 Achelous
6458829 Acheloüs

4377003 Achnanthes plonensis
4920773 Achnanthes plönensis

4919066 Bütschliella
6009122 Butschliella

5977298 Pseudopanthera oberthuri
1954573 Pseudopanthera oberthüri

I think there are quite a few of these. There are also many cases where the 
names are obviously synonyms but the spelling difference reflects a 
transliteration of a diacritic, for example:

2630477 Achnanthes ploenensis
4893234 Buetschliella
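
To make the request concrete, here is a minimal sketch of how such variants could be folded onto one comparison key. It is not the ECAT algorithm; the function names and the TRANSLIT table are hypothetical, and folding "ae"/"oe"/"ue" is an assumption that will also merge names in which those letter pairs are original, so matches should be treated as candidates for review rather than confirmed variants.

```python
import unicodedata

# Hypothetical table: ASCII transliterations of umlauts mapped to the bare vowel.
# Assumption: this deliberately over-matches (any "oe" becomes "o").
TRANSLIT = {"ae": "a", "oe": "o", "ue": "u"}


def strip_diacritics(name: str) -> str:
    """Decompose to NFD and drop combining marks, e.g. 'ü' becomes 'u'."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def comparison_key(name: str) -> str:
    """Build a case-insensitive key shared by diacritic and transliterated forms."""
    key = strip_diacritics(name).lower()
    for digraph, base in TRANSLIT.items():
        key = key.replace(digraph, base)
    return key


if __name__ == "__main__":
    names = ["Achnanthes plonensis", "Achnanthes plönensis", "Achnanthes ploenensis"]
    # All three spellings should collapse to a single key.
    print({comparison_key(n) for n in names})
```

Names sharing a key could then be flagged as possible spelling variants of one another, with the original spellings kept unchanged in the records.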

Best
Jonathan

Original issue reported on code.google.com by jonathan...@gmail.com on 2 May 2013 at 7:30

GoogleCodeExporter commented 8 years ago
Thanks for the issue. It comes at a good time, as I have just started fixing 
our algorithm a little.

Original comment by wixner@gmail.com on 2 May 2013 at 8:42