Currently, the species lists provided occasionally contain two entries that GBIF's taxonomic classification treats as synonyms, for example Eublemma minutata and Eublemma paula. When the taxonomic keys for such species are fetched with 01-fetch_taxon_keys.py, the resulting keys CSV file retains both entries: one row is labeled "ACCEPTED" in the "status" column and the other is labeled "SYNONYM". Both rows, however, have the same taxonomic ID.
When images are subsequently downloaded for these species, 02-fetch_gbif_moth_data.py filters the keys file by each taxonomic ID and expects exactly one entry. If the keys file contains both the ACCEPTED and the SYNONYM entries for that ID, an error occurs.
The proposed solution is a species list preprocessing step that removes the synonym rows. However, the source list sometimes contains a species name that is a synonym on GBIF without also containing the primary (accepted) species name. In that case, the output of 01-fetch_taxon_keys.py has only one row for that taxonomic ID, labeled SYNONYM, and this row should not be deleted.
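The rule described above (drop a SYNONYM row only when the same taxonomic ID also has an ACCEPTED row) could be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: the function name `drop_redundant_synonyms` and the column name `taxon_key` are assumptions, only the "status" column and its ACCEPTED/SYNONYM values come from the description, and the species names and keys in the example are placeholders.

```python
from collections import defaultdict


def drop_redundant_synonyms(rows, key_col="taxon_key", status_col="status"):
    """Drop SYNONYM rows whose taxonomic ID also has an ACCEPTED row.

    `rows` is a list of dicts, e.g. as produced by csv.DictReader.
    `key_col` is a hypothetical column name; adjust it to the real keys file.
    """
    # Collect every status seen for each taxonomic ID.
    statuses_by_key = defaultdict(set)
    for row in rows:
        statuses_by_key[row[key_col]].add(row[status_col])

    # Keep a row unless it is a SYNONYM that duplicates an ACCEPTED entry.
    return [
        row
        for row in rows
        if not (
            row[status_col] == "SYNONYM"
            and "ACCEPTED" in statuses_by_key[row[key_col]]
        )
    ]


# Placeholder data: K1 has both statuses, K2 appears only as a synonym.
rows = [
    {"species": "Species A", "taxon_key": "K1", "status": "ACCEPTED"},
    {"species": "Species B", "taxon_key": "K1", "status": "SYNONYM"},
    {"species": "Species C", "taxon_key": "K2", "status": "SYNONYM"},
]
kept = drop_redundant_synonyms(rows)
print([row["species"] for row in kept])  # ['Species A', 'Species C']
```

Applying the filter to the keys file (after 01-fetch_taxon_keys.py and before 02-fetch_gbif_moth_data.py) rather than to the raw species list keeps the lone-synonym case intact, since the decision depends on the statuses GBIF returned.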