iftechfoundation / ifdb-suggestion-tracker

Bugs and feature requests for a future IFDB update
10 stars 0 forks source link

Use a validation algorithm to check IFIDs #441

Open alice-blue opened 6 months ago

alice-blue commented 6 months ago

From https://intfiction.org/t/ifdb-search-anomaly/66571/12:

Finally, as you can see, there are currently no validation rules for IFDB’s IFIDs. Reading over the Treaty, I think someone could cook up a validation algorithm. I’d accept a PR for this, but I’m probably not going to do it myself.

[The following part is about cleaning up IFIDs that have already been entered into IFDB, but I'm quoting it in case it's helpful in putting together an algorithm]

If somebody wants to clean up bogus IFIDs, I think that’d make sense. Here’s how I found them.

Initially, I queried for IFIDs that didn’t contain a hyphen, like this: select ifid from ifids where ifid not like '%-%';

That turned up 320 IFIDs, most of which looked like this:

ifid 004D9E39231A1797D3A8A58ADA7C02F4 00CC0D46A7A54FD08D89FB4652DA3CDC 0182F9777087D22739413F6B7C1896FA 01A08F7D37CB02940A1D6EAA7E57E5F9 04FC82FF2020CED57E315493435C59C1

Do those need to be “fixed up”? I’m honestly unsure. They appear to be 32-character UUIDs without their hyphens.

So then I searched for all IFIDs where the length was not exactly 32 or 36 characters long, and where the IFID does not start with any of the known, declared prefixes from the Treaty of Babel.

select ifid
from ifids
where 
ifid not like 'ADRIFT-%'
and ifid not like 'ADVSYS-%'
and ifid not like 'AGT-%'
and ifid not like 'ALAN-%'
and ifid not like 'GLULX-%'
and ifid not like 'HUGO-%'
and ifid not like 'LEVEL9-%'
and ifid not like 'MAGNETIC-%'
and IFID not like 'MZ-%'
and ifid not like 'TADS2-%'
and ifid not like 'TADS3-%'
and ifid not like 'ZCODE-%'
and length(ifid) not in (32, 36);