magnusmanske / auth2wd

Convert data from Authority Control sites into a Wikidata item structure

BNF reference #4

Closed SimonVilleneuvewp closed 23 hours ago

SimonVilleneuvewp commented 3 months ago

Hi,

The tool adds a BNF reference even if one is already present on the property value. For example, the tool adds the same BNF reference to the value of P31 here. The only difference is that the new reference contains the P268 qualifier.

We can see that the P268 value is the last part of the P854 qualifier of the first BNF reference for this value of P31. Maybe you could add a filter to the tool that adds the P268 qualifier to the first BNF reference when the P268 value matches the $1 part of the http://data.bnf.fr[...]/cb$1 URL in the P854 qualifier of that reference?
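The check proposed above could be sketched roughly as follows. This is only an illustration, not code from auth2wd or wikimisc: the function name `url_matches_bnf_id` is hypothetical, and because the middle of the URL pattern is elided above, the sketch only looks at the host prefix and the last path segment.

```rust
/// Hypothetical helper: returns true if `url` is a data.bnf.fr URL whose
/// last path segment is "cb" followed by exactly `bnf_id` (the P268 value).
/// The middle of the path is deliberately not inspected.
fn url_matches_bnf_id(url: &str, bnf_id: &str) -> bool {
    if bnf_id.is_empty() || !url.starts_with("http://data.bnf.fr/") {
        return false;
    }
    // Take the last path segment, ignoring any trailing slash.
    let last_segment = url.trim_end_matches('/').rsplit('/').next().unwrap_or("");
    // The segment must be "cb" + the ID, with nothing before or after.
    last_segment.strip_prefix("cb") == Some(bnf_id)
}
```

With a made-up URL such as `http://data.bnf.fr/placeholder/cb123456789`, calling the helper with the ID `123456789` would match, while any other ID or host would not.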

magnusmanske commented 23 hours ago

Actually this is surprisingly difficult to do. The merger code ("is this the same reference?") at https://github.com/magnusmanske/wikimisc/blob/6e1cba2ca420ba241f1584cb1cfbd8dfff649fa4/src/item_merger.rs#L190 compares the presence of external IDs. It does not know about the URL pattern.

The only way I can see that wouldn't involve larger rewrites would be to add both the external ID and the reference URL to the new reference, and then compare the URL. That would filter out the duplicate reference, but (a) in your case above, it would not add the external ID to the existing reference, and (b) I seem to remember that having both the external ID and the reference URL in one reference is frowned upon.
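To make the trade-off concrete, here is a simplified model of that approach. It is a sketch under assumptions, not the actual wikimisc merger: the `Reference` struct and the `same_reference` and `merge_references` functions are hypothetical stand-ins, with one optional external ID (such as P268) and one optional reference URL (P854) per reference.

```rust
/// Hypothetical, simplified reference: at most one external ID and one URL.
#[derive(Debug, Clone, PartialEq)]
struct Reference {
    external_id: Option<String>,   // e.g. a P268 value
    reference_url: Option<String>, // e.g. a P854 value
}

/// "Is this the same reference?" Compare external IDs when both sides
/// have one; otherwise fall back to comparing the reference URLs.
fn same_reference(existing: &Reference, new: &Reference) -> bool {
    if let (Some(a), Some(b)) = (&existing.external_id, &new.external_id) {
        return a == b;
    }
    match (&existing.reference_url, &new.reference_url) {
        (Some(a), Some(b)) => a == b,
        _ => false,
    }
}

/// Append `new` only if no existing reference is considered the same.
/// Existing references are never modified.
fn merge_references(existing: &mut Vec<Reference>, new: Reference) {
    if !existing.iter().any(|r| same_reference(r, &new)) {
        existing.push(new);
    }
}
```

In this model, a new reference carrying both the ID and the URL is dropped when an existing reference has the same URL, but the existing reference stays without its external ID, which is point (a).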

In summary, I would rather leave this alone, and someone will write a bot to clean these things up :-)