rism-digital / muscat

🗂️ A Rails application for the inventory of handwritten and printed music scores
http://muscat-project.org

Update BNF authority file links #1564

Open ahankinson opened 2 months ago

ahankinson commented 2 months ago

The BNF authority links are a bit of a mess.

Some cite the full ARK, e.g., https://muscat.rism.info/admin/institutions/40006238. This doesn't work because the link formatter expects just a number, so it turns the value into https://data.bnf.fr/ark:/12148/cbark:/12148/cb14783478b.

Some records cite just the number, e.g., https://muscat.rism.info/admin/institutions/51005989 gives 15386579.

However, this does not work either, because it is missing the check digit (4 in this case), which is part of the ARK. In other words:

ark:/12148/cb153865794 -> cb is the prefix, 15386579 is the identifier, 4 is the check digit.
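
To make the three parts concrete, here is a minimal Python sketch that splits a catalogue ARK and recomputes its check character. It assumes the BnF check character follows the NOID-style scheme (a position-weighted sum over a 29-symbol alphabet) and that `cb` identifiers have eight digits, as in the examples in this thread. The function names are hypothetical and not part of Muscat. The scheme reproduces the three catalogue ARKs quoted in this issue, but any computed value should still be confirmed against the BnF catalogue.

```python
import re

# Assumption: the 29-symbol alphabet of the NOID check-character scheme.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"

# Assumption: catalogue ARKs are "cb" + 8 digits + 1 check character.
CATALOGUE_ARK = re.compile(r"^ark:/12148/(cb)(\d{8})([0-9a-z])$")


def check_char(name: str) -> str:
    """Return the check character for a name such as 'cb15386579'."""
    # Each character's alphabet index is weighted by its 1-based position.
    total = sum(ALPHABET.index(ch) * pos for pos, ch in enumerate(name, start=1))
    return ALPHABET[total % len(ALPHABET)]


def split_catalogue_ark(ark: str) -> tuple[str, str, str]:
    """Split a full catalogue ARK into (prefix, identifier, check character)."""
    match = CATALOGUE_ARK.match(ark)
    if match is None:
        raise ValueError(f"not a BnF catalogue ARK: {ark!r}")
    return match.groups()


prefix, identifier, check = split_catalogue_ark("ark:/12148/cb153865794")
print(prefix, identifier, check)
print(check_char(prefix + identifier) == check)
```

A helper like this could pre-fill candidate ARKs for records that hold only the bare number, leaving a human to confirm each one.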

But wait! It gets better.

According to this page: https://www.bnf.fr/fr/lidentifiant-ark-archival-resource-key, the BNF have several schemes for their ARK prefixes: dp for digitized documents, cb for catalogue records, and mm for educational resources. I happen to also know that there are others, e.g., ark:/12148/btv1b105513309. The best way to untangle all of these is to go through their ARK resolver, https://ark.bnf.fr.

So, this is just a long-winded way of saying that:

Tagging @Docudoctor, since I've already roped him in to this on Muscat.

xhero commented 2 months ago

I can surely change the hardcoded link from https://data.bnf.fr/ark:/12148/cb to https://ark.bnf.fr/, but how should we go about fixing all the identifiers?
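
As an illustration of why the base URL matters, here is a short Python sketch of the two link styles. It is not Muscat's actual formatter (Muscat is a Rails application); the function names are hypothetical, and the only values taken from this thread are the two base URLs and the example ARK.

```python
OLD_BASE = "https://data.bnf.fr/ark:/12148/cb"  # hardcoded prefix named in this thread
RESOLVER_BASE = "https://ark.bnf.fr/"           # BnF ARK resolver named in this thread


def old_link(identifier: str) -> str:
    """Mimic plain concatenation onto the old prefix (hypothetical helper)."""
    return OLD_BASE + identifier


def resolver_link(identifier: str) -> str:
    """Build a resolver link; only full ARKs are accepted (hypothetical helper)."""
    if not identifier.startswith("ark:/"):
        raise ValueError(f"expected a full ARK, got {identifier!r}")
    return RESOLVER_BASE + identifier


full_ark = "ark:/12148/cb14783478b"
print(old_link(full_ark))       # reproduces the doubled prefix reported above
print(resolver_link(full_ark))
```

Rejecting values without the `ark:/` prefix, rather than guessing a prefix for them, keeps records that still need manual correction visible.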

xhero commented 2 months ago

I updated the link so it should work in the next release. @alexandermarxen and @Docudoctor could you have a look at what the path to fixing the data here would be? We can probably isolate the identifiers that do not start with "ark:", but they will need to be checked by hand.

alexandermarxen commented 2 months ago

Thank you very much! I'm afraid that there are a lot of records, especially when it comes to persons. Could I have a list of the affected records?

ahankinson commented 2 months ago

I think these are the people records that have a bnf identifier, but that identifier doesn't start with "ark". ~2,100.

ark_people.csv
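
For reference, the selection described above (records whose BnF identifier does not start with "ark") can be sketched in Python as follows. The column names `record_id` and `bnf_id` are assumptions, since the layout of the attached CSV is not shown here; the two sample rows are the institution records cited earlier in this issue.

```python
import csv
import io


def split_by_ark_prefix(rows, column="bnf_id"):
    """Separate rows whose identifier starts with 'ark:' from those that do not."""
    ok, needs_review = [], []
    for row in rows:
        value = (row.get(column) or "").strip()
        if value.startswith("ark:"):
            ok.append(row)
        else:
            needs_review.append(row)
    return ok, needs_review


# Hypothetical column names; the values come from the records cited in this issue.
sample = io.StringIO(
    "record_id,bnf_id\n"
    "40006238,ark:/12148/cb14783478b\n"
    "51005989,15386579\n"
)
ok, needs_review = split_by_ark_prefix(csv.DictReader(sample))
print([row["record_id"] for row in needs_review])
```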

alexandermarxen commented 2 months ago

Thank you very much! It's about the order of magnitude I was expecting.

BaMikusi commented 2 months ago

(FYI: Guido will be able to look into this only next week.)

Docudoctor commented 2 months ago

Hi! There are roughly 30 institution records with a BNF identifier, some with ark, some without. I can easily find them in "Any field". My question is: how should I cite the BNF identifiers?

ahankinson commented 2 months ago

You should include the full ARK, including the ark:/ prefix. So for example:

https://muscat.rism.info/admin/institutions/40006238/edit

This is cited correctly, even though the link does not work. The link will start working after the next Muscat release.

This record, however:

https://muscat.rism.info/admin/institutions/40001144

is not correct. The BnF identifier should be: ark:/12148/cb12229245w

You can see this in the BnF catalogue: https://catalogue.bnf.fr/ark:/12148/cb12229245w (look under the "Identifiant de la notice" section).

Docudoctor commented 2 months ago

I know, but I was a little bit confused, because of the change of the hardcoded link. So good luck! :)