ietf-tools / relaton-data-ids

Bibliographic data information for Internet-Drafts in Relaton format

Problem with missing draft prefix and trailing hyphen #2

Closed ronaldtse closed 2 years ago

ronaldtse commented 2 years ago

There are two entries with a missing draft prefix. In particular, the second one has duplicated hyphens at the end.

How do we resolve them?

https://github.com/ietf-ribose/relaton-data-ids/blob/d7fd11beadea199a170cdccde011d93cf4fee1e9/data/DRAFT--PALE-EMAIL-00.yaml#L2-L17

https://github.com/ietf-ribose/relaton-data-ids/blob/d7fd11beadea199a170cdccde011d93cf4fee1e9/data/DRAFT--REMI-DESPRES--IPV6-RAPID-DEPLOYMENT--00.yaml#L2-L17

andrew2net commented 2 years ago

@ronaldtse we can squeeze the repeated hyphens in filenames with filename.squeeze("-")
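
A minimal sketch of what squeezing would do to the two filenames linked above, assuming each filename is handled as a plain Ruby String. The helper name squeeze_hyphens is hypothetical and not part of the relaton code.

```ruby
# Hypothetical helper (an assumption for illustration):
# collapse every run of consecutive hyphens into a single hyphen.
def squeeze_hyphens(filename)
  filename.squeeze("-")
end

puts squeeze_hyphens("DRAFT--PALE-EMAIL-00.yaml")
# prints DRAFT-PALE-EMAIL-00.yaml
puts squeeze_hyphens("DRAFT--REMI-DESPRES--IPV6-RAPID-DEPLOYMENT--00.yaml")
# prints DRAFT-REMI-DESPRES-IPV6-RAPID-DEPLOYMENT-00.yaml
```

Note that String#squeeze collapses every run of hyphens, including the one in the middle of the name, so the resulting filename no longer matches the name the draft was published under.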

ronaldtse commented 2 years ago

@andrew2net are the double hyphens present in the original data?

andrew2net commented 2 years ago

@ronaldtse yes, they are. For example, the document you mentioned above has the ID I-D.-remi-despres--ipv6-rapid-deployment- (including the trailing hyphen). Alternatively, we can use a regex to remove the dot and the trailing hyphen, so that the ID I-D.-remi-despres--ipv6-rapid-deployment- becomes the filename i-d-remi-despres--ipv6-rapid-deployment.yaml. Is that OK?
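
The regex alternative could look roughly like the sketch below. The method name id_to_filename is hypothetical, and the lowercasing and .yaml extension are inferred from the example filename in the comment rather than taken from the actual relaton code.

```ruby
# Hypothetical helper (an assumption for illustration):
# turn a document ID into a filename by dropping the first dot
# and a single trailing hyphen, then lowercasing and adding ".yaml".
def id_to_filename(id)
  cleaned = id.sub(".", "")        # remove the dot after "I-D"
  cleaned = cleaned.sub(/-\z/, "") # remove one hyphen at the very end
  "#{cleaned.downcase}.yaml"
end

puts id_to_filename("I-D.-remi-despres--ipv6-rapid-deployment-")
# prints i-d-remi-despres--ipv6-rapid-deployment.yaml
```

Unlike squeezing, this keeps the double hyphen in the middle of the name, so only the dot and the trailing hyphen are lost.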

ronaldtse commented 2 years ago

@andrew2net I think this is a problem with the data source, so let's report it back.

ronaldtse commented 2 years ago

@rjsparks could you help confirm that the I-D.-remi-despres--ipv6-rapid-deployment- ID is as intended? Is there something we should do (clean) with it? Thanks!

rjsparks commented 2 years ago

No, you should not clean it - it was a real draft (that violated the normal conventions before we had something in place to prevent that) - it is in the repository (as you can see above) at: https://www.ietf.org/archive/id/draft--remi-despres--ipv6-rapid-deployment--00.txt

rjsparks commented 2 years ago

It's reasonable not to include these in the served corpus. (The datatracker won't serve bibxml for them directly - they only got generated in the tarball because I used a management command that went straight to the database and not through the url pattern enforcer).

ronaldtse commented 2 years ago

@rjsparks I have no qualms about making these available, so let's just keep them around until the day we wish to update the slightly deviant identifier.