gbif / backbone-feedback


Eristalis lineata Wahlberg, 1843 is a synonym of Cheilosia morio, but should not have its own site. The pictures and distribution show that of Eristalis lineata Harris, 1776! #165

Open gbif-portal opened 1 year ago

gbif-portal commented 1 year ago

Eristalis lineata Wahlberg, 1843 is a synonym of Cheilosia morio, but should not have its own site. The pictures and distribution show that of Eristalis lineata Harris, 1776!


System: Chrome 108.0.5359 / Windows 10.0.0
Referer: https://www.gbif.org/species/1540234
Window size: width 960 - height 468
System health at time of feedback: INFO

ManonGros commented 1 year ago

It looks like most of the records are indeed Eristalis lineata Harris, 1776, but for some reason they are matched to Eristalis lineata Wahlberg, 1843. I am not sure why. Is it because of the parentheses?

| Field | Value |
| --- | --- |
| Scientific name | Eristalis lineata |
| Scientific name authorship | (Harris, 1776) |

Is this the expected behaviour, @mdoering? Should I ask the publisher to remove the parentheses?

Issue partially relates to this one: https://github.com/gbif/backbone-feedback/issues/169

mdoering commented 1 year ago

Yes, parentheses are rather important, as we compare combination authors and basionym authors separately. With the parentheses there is no combination author to compare, so it matches.

In fact both names match equally, both scoring 97. The algorithm then just picks one rather randomly, as they are both synonyms of the same accepted name, i.e. ultimately it doesn't matter which is picked. It would be different if they were not synonyms of the same accepted name.
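To illustrate why the two candidates tie, here is a minimal Python sketch of comparing basionym and combination authors separately. It is not the actual backbone matching code: `parse_authorship`, `compare_part` and `authorship_conflicts` are hypothetical names, and the real scoring (the 97 mentioned above) is not reproduced. The sketch assumes that the author inside parentheses is the basionym author, the author outside is the combination author, and a missing part is treated as "nothing to compare" rather than as a mismatch.

```python
import re
from typing import NamedTuple, Optional


class Authorship(NamedTuple):
    basionym: Optional[str]     # author given inside parentheses
    combination: Optional[str]  # author given outside parentheses


def parse_authorship(text: str) -> Authorship:
    """Split an authorship string into its basionym and combination parts."""
    m = re.match(r"^\s*(?:\((?P<bas>[^)]*)\))?\s*(?P<comb>.*?)\s*$", text)
    bas = (m.group("bas") or "").strip()
    comb = m.group("comb").strip()
    return Authorship(bas or None, comb or None)


def compare_part(a: Optional[str], b: Optional[str]) -> str:
    """Compare one part; a missing value on either side is not a mismatch."""
    if a is None or b is None:
        return "unknown"
    return "equal" if a.lower() == b.lower() else "different"


def authorship_conflicts(query: str, candidate: str) -> bool:
    """True only if a part present on both sides differs."""
    q, c = parse_authorship(query), parse_authorship(candidate)
    results = (
        compare_part(q.basionym, c.basionym),
        compare_part(q.combination, c.combination),
    )
    return "different" in results


# "(Harris, 1776)" has only a basionym author, while both candidates
# have only a combination author, so neither comparison can fail.
print(authorship_conflicts("(Harris, 1776)", "Wahlberg, 1843"))  # False
print(authorship_conflicts("(Harris, 1776)", "Harris, 1776"))    # False
print(authorship_conflicts("Harris, 1776", "Wahlberg, 1843"))    # True
```

Under these assumptions, the parenthesised query conflicts with neither candidate, which is consistent with both names receiving the same score.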

mdoering commented 1 year ago

I guess we should improve the matching to also try to match authorships ignoring parentheses, scoring just a little lower than full matches.
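A rough Python sketch of that idea, for illustration only. The score levels and function names (`split_authorship`, `flatten`, `authorship_score`) are hypothetical and only their relative order matters; the real implementation and its numeric scores may look quite different.

```python
import re

# Hypothetical score levels, highest is best.
FULL_MATCH, LOOSE_MATCH, NO_EVIDENCE, CONFLICT = 3, 2, 1, 0


def split_authorship(text):
    """Return (basionym, combination), lower-cased; a missing part is None."""
    m = re.match(r"\s*(?:\(([^)]*)\))?\s*(.*?)\s*$", text)
    basionym = (m.group(1) or "").strip().lower() or None
    combination = m.group(2).strip().lower() or None
    return basionym, combination


def flatten(text):
    """Authorship with the parentheses ignored."""
    return re.sub(r"[()]", "", text).strip().lower()


def authorship_score(query, candidate):
    pairs = zip(split_authorship(query), split_authorship(candidate))
    comparable = [(q, c) for q, c in pairs if q is not None and c is not None]
    if any(q != c for q, c in comparable):
        return CONFLICT
    if comparable:
        return FULL_MATCH
    # Nothing was comparable part by part: retry with parentheses ignored,
    # but rank an agreement found this way just below a full match.
    flat_q, flat_c = flatten(query), flatten(candidate)
    if flat_q and flat_q == flat_c:
        return LOOSE_MATCH
    return NO_EVIDENCE


candidates = ["Wahlberg, 1843", "Harris, 1776"]
best = max(candidates, key=lambda c: authorship_score("(Harris, 1776)", c))
print(best)  # Harris, 1776
```

With this fallback, "(Harris, 1776)" no longer ties between the two candidates: Harris, 1776 gets the loose match while Wahlberg, 1843 gets no supporting evidence.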

mdoering commented 1 year ago

I have implemented that, so soon we'll match both with and without parentheses to Harris, not Wahlberg.