wergstatt / pprkrkn


Uniqueness of url #3

Closed wergstatt closed 4 years ago

wergstatt commented 4 years ago

The preferred unique identifier for a journal-publisher combination is the URL of the journal. Unfortunately, there are two cases where the URL is not unique. Since those entries are not relevant, filtering them out should be a valid strategy.

wergstatt commented 4 years ago

Currently, duplicates are filtered in Python, which is far from optimal, since the filtering should be part of the T (transform) process. However, the case shouldn't exist in the first place, so it is not worth solving the problem properly for a single irrelevant incident.

Consequences of ignoring the duplicate are minor.
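
For reference, a minimal sketch of what such a Python-side filter could look like. This is not the project's actual code: the function name `drop_non_unique_urls`, the record layout, and the example URLs are hypothetical, and it assumes that "filtering" means dropping every record whose URL occurs more than once rather than keeping one of them.

```python
from collections import Counter


def drop_non_unique_urls(records, key="url"):
    """Return only the records whose URL occurs exactly once.

    Hypothetical helper: assumes each record is a dict with a URL under `key`.
    """
    # Count how often each URL appears across all records.
    counts = Counter(record[key] for record in records)
    # Keep a record only if its URL is unique in the data set.
    return [record for record in records if counts[record[key]] == 1]


# Made-up example data; the real records are not shown in this issue.
journals = [
    {"journal": "Journal A", "publisher": "Publisher X", "url": "https://example.org/a"},
    {"journal": "Journal B", "publisher": "Publisher Y", "url": "https://example.org/b"},
    {"journal": "Journal C", "publisher": "Publisher Z", "url": "https://example.org/b"},
]

unique_journals = drop_non_unique_urls(journals)
```

Dropping all records that share a URL, instead of keeping the first one, avoids having to decide which journal-publisher combination "owns" the URL, which matches the assumption that the affected entries are irrelevant anyway.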