camiplata closed this issue 7 months ago
These are not matches - they simply show where things have changed. These diffs are not always easy to read. Don't put any taxonomic meaning into them; they only show where the files differ. Quite often a name was removed and another name was added, and the two are shown as a "pair" when really there is no pairing - they just happen to fall in the same place when the names are sorted alphabetically!
Also, this is based on the regular Unix diff software, and we have no influence over how it behaves.
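To illustrate why unrelated names can end up side by side, here is a minimal sketch. It uses Python's `difflib` as a stand-in for Unix diff (not the tool's actual code), and the two name lists and the helper `change_blocks` are made up for the example. A line-based diff only knows that one block of lines was replaced by another; it has no idea whether the names in those blocks are related.

```python
import difflib

# Hypothetical name lists, sorted alphabetically as in the diff view.
old_names = sorted(["Aphroditidae", "Iphionidae", "Polynoidae"])
new_names = sorted(["Aphroditidae", "Lumbricidae", "Polynoidae"])


def change_blocks(old, new):
    """Return (tag, removed, added) for every block where the lists differ."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    blocks = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            blocks.append((tag, old[i1:i2], new[j1:j2]))
    return blocks


for tag, removed, added in change_blocks(old_names, new_names):
    # "replace" only means the two blocks occupy the same position;
    # it says nothing about the names being the same taxon.
    print(tag, removed, added)
```

Here one name was removed and an unrelated one was added, yet both land between the same neighbours, so the diff reports them as a single "replace" block. When a name is removed and nothing is added in the same position, the block is tagged "delete" and the name appears alone.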
I do think there is room for improvement of the tool and the documentation about it.
If changes are shown as pairs like this:
It immediately gives the user the impression that this is how the diff should be read for all names. If there is no pairing, the name could be shown alone, as the tool already does here:
On the other hand, if the tool shows differences between datasets, I would expect it to find better pairs. For example, the names
here with Iphionidae
and here with Annelida
should have been paired.
This is probably not a high-priority issue, but if this is a tool we are going to offer to the wider public, we will need to improve it eventually, and users will keep raising questions and issues like this one.
I agree it is unfortunate, but there is no way to fix these problems in the tool itself. The only thing we can do is document better what it does.
Some examples of incorrect name matches between two sources:
Link to the diff tool
Link to the diff tool