azgs / azlibrary_database

1 stars 1 forks source link

Minedata files have unnecessary reference to UA Library in links field #44

Closed aazaff closed 4 years ago

aazaff commented 4 years ago

For example, item ADMM-1552449141761-441 has

{
    links: [
        {
            url: "https://minedata.azgs.arizona.edu/report/bunker-hill-map-claims-and-underground-workings-sheet-4-4",
            name: "AZGS old"
        },
        { 
            url: null,
            name: "UA Library"
        }
    ]
}

Should just be

{
    links: [
        {
            url: "https://minedata.azgs.arizona.edu/report/bunker-hill-map-claims-and-underground-workings-sheet-4-4",
            name: "AZGS old"
        }
    ]
}

This can probably be fixed with an en-masse SQL query.

aazaff commented 4 years ago

jsonb_set may be appropriate, but I think this may be a really complicated query to implement actually.

NoisyFlowers commented 4 years ago

I gotta be honest, these metadata updates that have to examine everything entry metadata.json are a bit terrifying.

Nonetheless, here's what I think this one should look like:

with 
    null_links as (
        select
            s.metadata_id,
            s.index
        from (
            select
                m.metadata_id,
                l.value,
                l.ordinality-1 as index
            from
                metadata.azgs m,
                jsonb_array_elements(m.json_data->'links') with ordinality l                
        ) s
        where
            s.value@>'{"url":null}'
    )

update 
    metadata.azgs m
set 
    json_data =  jsonb_set(
        json_data, 
        '{links}', 
        (json_data->'links') - n.index::integer,
        false
       )
from
    null_links n
where 
    m.metadata_id = n.metadata_id
NoisyFlowers commented 4 years ago

I ran this update on dev, limiting it to metadata_id < 1000.

Using the select from the "with", I verified the number of records this would impact was 850 before running the update. The update ran for 21 seconds and came back successful with 850 rows effected. I checked the json on a random few of those and it looks as expected.

I'll leave it at that for now, in case you want to try it yourself on other records on dev.

NoisyFlowers commented 4 years ago

This is in ed4a01cd1ddcfc5069e058a3f001a6a8e032cb10

NoisyFlowers commented 4 years ago

We've decided to hold off on deploying this. I have removed it from the 11.sql patch with 865dbec.

To be honest, I'm not completely sure these metadata updates even belong in these patches. Is it enough to just have the sql statements documented here in the issue?

aazaff commented 4 years ago

Yeah, I think that it is totally reasonable to just have the SQL statements documented int eh issue. They don't have to be patch related.

aazaff commented 4 years ago

I don't recall if we ever pulled the trigger on this, but I think we determined this wasn't really necessary anyway.