OregonDigital / OD2

Next generation of Oregon Digital ( https://oregondigital.org ) digital collections platform, built on Samvera Hyrax ( https://github.com/samvera/hyrax/ )

Location and Publication Place field labels sometimes displaying with wrong hierarchy text #2452

Open wickr opened 2 years ago

wickr commented 2 years ago

Descriptive summary

Some labels for fields using the Location controlled vocabulary are showing wrong/extra hierarchy text. Examples:

Location: Salem >> Marion County >> Oregon >> United States >> Oregon >> United States https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Salem+%3E%3E+Marion+County+%3E%3E+Oregon+%3E%3E+United+States+%3E%3E+Oregon+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields#content

Location: Silver Falls State Park >> Marion County >> Oregon >> United States >> Oregon >> United States https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Silver+Falls+State+Park+%3E%3E+Marion+County+%3E%3E+Oregon+%3E%3E+United+States+%3E%3E+Oregon+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields#content

Publication Place: Corvallis >> Benton County >> Oregon >> United States >> United States >> Oregon >> United States >> United States https://prod.oregondigital.org/concern/documents/8k71p767k#data_sources

The work in the above link has the correct label for Corvallis in Location, but the wrong label in Publication Place, despite both fields having the exact same URI.

Expected behavior

All Location and related fields using the Location CV show the appropriate label with correct hierarchy.

Related work

1124

wickr commented 2 years ago

@CGillen Metadeities explored more about cities being in multiple counties, for which I'll outline what we want below. But that doesn't seem to account for the above problems with Silver Falls State Park and Corvallis.

When there are multiple parent counties for a given city, we want to change the display:

Examples:

Portland is in 3 counties: Portland >> Clackamas/Multnomah/Washington Counties >> Oregon >> United States

Salem is in 2: Salem >> Marion/Polk Counties >> Oregon >> United States

We should be set for any occurrence of multiple counties, which is somewhat common: https://en.wikipedia.org/wiki/List_of_U.S._municipalities_in_multiple_counties
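For illustration, a minimal Ruby sketch of the display format requested above. The method name `combined_location_label` and its arguments are hypothetical and not taken from the OD2 codebase; county names are assumed to arrive without the word "County" and are joined in the order given.

```ruby
# Hypothetical helper (not from the OD2 codebase): builds a
# "place >> county part >> state >> country" label.
def combined_location_label(place, counties, state, country)
  county_part =
    case counties.size
    when 0 then nil                                   # no county level at all
    when 1 then "#{counties.first} County"            # single parent county
    else "#{counties.join('/')} Counties"             # several parent counties
    end
  # compact drops the county part when there is none
  [place, county_part, state, country].compact.join(' >> ')
end

puts combined_location_label('Portland', %w[Clackamas Multnomah Washington], 'Oregon', 'United States')
puts combined_location_label('Salem', %w[Marion Polk], 'Oregon', 'United States')
```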

sseymore commented 1 year ago

@wickr can you assist? I am seeing it corrected for the Corvallis label in Location and Place of Pub, but for the others it appears to be duplicating parts of the label.

[Screenshots: Screen Shot 2022-10-04 at 12 11 36 PM; Screen Shot 2022-10-04 at 12 15 38 PM; Screen Shot 2022-10-04 at 12 15 11 PM]

wickr commented 1 year ago

Yeah, confirmed that Corvallis is showing correctly in both fields. If there are still issues there, they'll probably be solved with a reindex. But the Silver Falls State Park example is still not correct. I reindexed one of those works just now and that didn't resolve it.

sseymore commented 1 year ago

@wickr does this need to move back to Ready?

CGillen commented 1 year ago

OK, the Blazegraph cache has Marion County's preferred label stored as "Marion County >> Oregon >> United States". We then walk further up the hierarchy to get the state + country, so those two end up in the combined label twice.

This is a problem with the Blazegraph cache. I'll clean this one up.
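A small Ruby sketch of how a cache entry like that would produce the duplicated text. Everything here is a stand-in: the ids, the `PARENTS` map and `hierarchy_label` are hypothetical and only imitate "join each ancestor's cached label with >>"; this is not the actual OD2 label-building code.

```ruby
# Hypothetical stand-ins for the label cache and the parent lookup.
CLEAN_CACHE = {
  'silver_falls' => 'Silver Falls State Park',
  'marion'       => 'Marion County',
  'oregon'       => 'Oregon',
  'us'           => 'United States'
}.freeze

PARENTS = { 'silver_falls' => 'marion', 'marion' => 'oregon', 'oregon' => 'us' }.freeze

# Walks up the parent chain and joins each cached label with " >> ".
def hierarchy_label(id, cache, parents = PARENTS)
  labels = []
  while id
    labels << cache.fetch(id)
    id = parents[id]
  end
  labels.join(' >> ')
end

# Same cache, except the county's label already contains its own hierarchy.
polluted = CLEAN_CACHE.merge('marion' => 'Marion County >> Oregon >> United States')

puts hierarchy_label('silver_falls', CLEAN_CACHE)
puts hierarchy_label('silver_falls', polluted)
```

With the polluted entry, the second label repeats "Oregon >> United States", which is the pattern reported for Silver Falls State Park.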

I'm very bad with SPARQL though, so if @wickr knows of a way to query all GeoNames objects with a skos:prefLabel attribute, specifically those whose label contains *Oregon >> United States, we should probably delete those.
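This is not the SPARQL query asked for, just a Ruby sketch of the same check run against an exported set of labels. It assumes the cached labels are available as a hash of URI to prefLabel (the `example.org` URIs and the `polluted_labels` name are made up), and that a bare prefLabel should never end in "Oregon >> United States".

```ruby
# Suffix that should only appear in a combined label, never in a bare prefLabel.
SUSPECT_SUFFIX = 'Oregon >> United States'

# Returns the cache entries whose label ends with the suspect suffix.
def polluted_labels(cache, suffix = SUSPECT_SUFFIX)
  cache.select { |_uri, label| label.end_with?(suffix) }
end

# Made-up sample data; real entries would use GeoNames URIs.
cache = {
  'https://example.org/place/1' => 'Marion County',
  'https://example.org/place/2' => 'Marion County >> Oregon >> United States',
  'https://example.org/place/3' => 'Oregon'
}

polluted_labels(cache).each_key { |uri| puts uri }
```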

CGillen commented 1 year ago

Noting this here so I can find it again later: this is a SPARQL command to delete a triple when you know the subject and the predicate. The internet tells me I should be able to just use the exact triple, but Blazegraph (BG) is not recognizing string literals.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

DELETE { <https://sws.geonames.org/5739051/> skos:prefLabel ?o }
WHERE { <https://sws.geonames.org/5739051/> skos:prefLabel ?o }
```

CGillen commented 1 year ago

Works you QA with will need to be reindexed. A re-save should do it.

sseymore commented 1 year ago

QA pass with resaving a work referenced above on prod. I just did one to QA. @wickr Will we reindex at some point before launch?

[Screenshot: Screen Shot 2022-12-05 at 12 29 26 PM]

wickr commented 1 year ago

Confirmed, things are looking good after reindexing, at least the works with 'Silver Falls State Park' and 'Salem' so far.

I don't know that we'll have time or a need to do a full reindex of everything. But we can do targeted reindexes of collections or whatever combo of metadata/facets we need. These ones mostly came up in migration QA, so if we continue to identify issues I'd like to get them resolved. And some are findable by browsing the facets and looking for outliers.

wickr commented 1 year ago

Reopening. After reindexing some of the works with old location labels, I noticed that we still don't have consistent ordering of multiple counties. This is presenting as separate facet labels.

The counties should be alphabetized for consistency. https://github.com/OregonDigital/OD2/issues/2452#issuecomment-1245965464

Examples:

[Atlanta >> DeKalb/Fulton Counties >> Georgia >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Atlanta+%3E%3E+DeKalb%2FFulton+Counties+%3E%3E+Georgia+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 3
[Atlanta >> Fulton/DeKalb Counties >> Georgia >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Atlanta+%3E%3E+Fulton%2FDeKalb+Counties+%3E%3E+Georgia+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields)
[Aurora >> Aurora Township >> DuPage/Kendall/Kane/Will Counties >> Illinois >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Aurora+%3E%3E+Aurora+Township+%3E%3E+DuPage%2FKendall%2FKane%2FWill+Counties+%3E%3E+Illinois+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 2
[Aurora >> Aurora Township >> Kane/Kendall/Will/DuPage Counties >> Illinois >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Aurora+%3E%3E+Aurora+Township+%3E%3E+Kane%2FKendall%2FWill%2FDuPage+Counties+%3E%3E+Illinois+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 3
[Aurora >> Aurora Township >> Kendall/DuPage/Will/Kane Counties >> Illinois >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Aurora+%3E%3E+Aurora+Township+%3E%3E+Kendall%2FDuPage%2FWill%2FKane+Counties+%3E%3E+Illinois+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 4
[Aurora >> Aurora Township >> Kendall/Will/DuPage/Kane Counties >> Illinois >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Aurora+%3E%3E+Aurora+Township+%3E%3E+Kendall%2FWill%2FDuPage%2FKane+Counties+%3E%3E+Illinois+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields)
[Portland >> Clackamas/Multnomah/Washington Counties >> Oregon >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Portland+%3E%3E+Clackamas%2FMultnomah%2FWashington+Counties+%3E%3E+Oregon+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 53
[Portland >> Clackamas/Washington/Multnomah Counties >> Oregon >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Portland+%3E%3E+Clackamas%2FWashington%2FMultnomah+Counties+%3E%3E+Oregon+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 13
[Portland >> Multnomah/Clackamas/Washington Counties >> Oregon >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Portland+%3E%3E+Multnomah%2FClackamas%2FWashington+Counties+%3E%3E+Oregon+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 74
[Portland >> Multnomah/Washington/Clackamas Counties >> Oregon >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Portland+%3E%3E+Multnomah%2FWashington%2FClackamas+Counties+%3E%3E+Oregon+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 55
[Portland >> Washington/Clackamas/Multnomah Counties >> Oregon >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Portland+%3E%3E+Washington%2FClackamas%2FMultnomah+Counties+%3E%3E+Oregon+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 4
[Portland >> Washington/Multnomah/Clackamas Counties >> Oregon >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Portland+%3E%3E+Washington%2FMultnomah%2FClackamas+Counties+%3E%3E+Oregon+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields)
CGillen commented 1 year ago

Make sure to reindex before testing. This should consistently sort counties alphanumerically. Just a regular Array#sort
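As a quick illustration of why a plain `Array#sort` collapses the duplicate facets, here is a Ruby sketch. `county_segment` is a hypothetical helper, not the method in the codebase; it only shows that every ordering of the same counties produces one label segment.

```ruby
# Hypothetical helper: joins county names in sorted order.
def county_segment(counties)
  "#{counties.sort.join('/')} Counties"
end

# Every ordering of Portland's three counties, as seen in the facet list.
orderings = %w[Clackamas Multnomah Washington].permutation.to_a

segments = orderings.map { |counties| county_segment(counties) }
puts segments.uniq
```

One caveat: `sort` on strings compares bytes, so an uppercase letter sorts before any lowercase letter. That does not matter for names like DeKalb/Fulton, but a county name starting with a lowercase letter would sort after all the capitalized ones.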

sseymore commented 1 year ago

Asking @wickr to QA this one when he can, just to make sure it's all good since I missed works/data last time.

wickr commented 1 year ago

QA Fail. For the most part, this is working well. I've reindexed all of the problematic Locations I could find. Major ones like 'Portland' are all cleared up.

There are a couple of exceptions, which seem to occur when there's a 'Township' in the second level, or something else pushes 'County' to the third level. In fact, the label under Data Sources is regenerated every time you load the work show page, and the county order can change from one load to the next. Example: https://prod.oregondigital.org/concern/images/df698z79h?locale=en#data_sources

[Aurora >> Aurora Township >> Kendall/DuPage/Kane/Will Counties >> Illinois >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Aurora+%3E%3E+Aurora+Township+%3E%3E+Kendall%2FDuPage%2FKane%2FWill+Counties+%3E%3E+Illinois+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 4
[Aurora >> Aurora Township >> Kendall/DuPage/Will/Kane Counties >> Illinois >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Aurora+%3E%3E+Aurora+Township+%3E%3E+Kendall%2FDuPage%2FWill%2FKane+Counties+%3E%3E+Illinois+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 5
[Aurora >> Aurora Township >> Kendall/Will/DuPage/Kane Counties >> Illinois >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Aurora+%3E%3E+Aurora+Township+%3E%3E+Kendall%2FWill%2FDuPage%2FKane+Counties+%3E%3E+Illinois+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 3
[Aurora >> Aurora Township >> Will/DuPage/Kane/Kendall Counties >> Illinois >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Aurora+%3E%3E+Aurora+Township+%3E%3E+Will%2FDuPage%2FKane%2FKendall+Counties+%3E%3E+Illinois+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 14

link: https://prod.oregondigital.org/catalog/facet/location_combined_label_sim?facet.page=11&facet.sort=index&locale=en&q%5B%5D=&search_field=all_fields&utf8=%E2%9C%93

[Kansas City >> Cass/Clay/Jackson/Kaw Township >> Platte Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Cass%2FClay%2FJackson%2FKaw+Township+%3E%3E+Platte+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 7
[Kansas City >> Cass/Clay/Kaw Township >> Jackson/Platte Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Cass%2FClay%2FKaw+Township+%3E%3E+Jackson%2FPlatte+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 21
[Kansas City >> Cass/Jackson/Kaw Township >> Clay/Platte Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Cass%2FJackson%2FKaw+Township+%3E%3E+Clay%2FPlatte+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 18
[Kansas City >> Clay/Jackson/Kaw Township >> Cass/Platte Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Clay%2FJackson%2FKaw+Township+%3E%3E+Cass%2FPlatte+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 9
[Kansas City >> Kaw Township >> Cass/Clay/Platte/Jackson Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Kaw+Township+%3E%3E+Cass%2FClay%2FPlatte%2FJackson+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 2
[Kansas City >> Kaw Township >> Clay/Platte/Cass/Jackson Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Kaw+Township+%3E%3E+Clay%2FPlatte%2FCass%2FJackson+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 6
[Kansas City >> Kaw Township >> Jackson/Platte/Cass/Clay Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Kaw+Township+%3E%3E+Jackson%2FPlatte%2FCass%2FClay+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 2
[Kansas City >> Kaw Township >> Jackson/Platte/Clay/Cass Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Kaw+Township+%3E%3E+Jackson%2FPlatte%2FClay%2FCass+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 3
[Kansas City >> Kaw Township >> Platte/Cass/Clay/Jackson Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Kaw+Township+%3E%3E+Platte%2FCass%2FClay%2FJackson+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 4
[Kansas City >> Kaw Township >> Platte/Clay/Cass/Jackson Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Kaw+Township+%3E%3E+Platte%2FClay%2FCass%2FJackson+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 4
[Kansas City >> Kaw Township >> Platte/Clay/Jackson/Cass Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Kaw+Township+%3E%3E+Platte%2FClay%2FJackson%2FCass+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 2
[Kansas City >> Kaw Township >> Platte/Jackson/Cass/Clay Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Kaw+Township+%3E%3E+Platte%2FJackson%2FCass%2FClay+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 1
[Kansas City >> Kaw Township >> Platte/Jackson/Clay/Cass Counties >> Missouri >> United States](https://prod.oregondigital.org/catalog?f%5Blocation_combined_label_sim%5D%5B%5D=Kansas+City+%3E%3E+Kaw+Township+%3E%3E+Platte%2FJackson%2FClay%2FCass+Counties+%3E%3E+Missouri+%3E%3E+United+States&locale=en&q%5B%5D=&search_field=all_fields) 1

link: https://prod.oregondigital.org/catalog/facet/location_combined_label_sim?facet.page=99&facet.sort=index&locale=en&q%5B%5D=&search_field=all_fields&utf8=%E2%9C%93
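One possible way to keep townships and counties from mixing, sketched in Ruby under some loud assumptions: each parent carries a GeoNames feature code, counties are ADM2, states are ADM1, and the township is ADM3 (that last code is a guess for the sake of the example). `Parent`, `LEVEL_ORDER` and `grouped_segments` are hypothetical names, not OD2 code.

```ruby
# Hypothetical parent record: a display name plus a GeoNames feature code.
Parent = Struct.new(:name, :feature_code)

# Assumed ordering, most local level first.
LEVEL_ORDER = %w[ADM3 ADM2 ADM1].freeze

# Groups parents by feature code, orders the groups by level,
# and sorts the names inside each group.
def grouped_segments(parents)
  parents.group_by(&:feature_code)
         .sort_by { |code, _group| LEVEL_ORDER.index(code) || LEVEL_ORDER.size }
         .map do |code, group|
           names = group.map(&:name).sort.join('/')
           if code == 'ADM2'
             "#{names} #{group.size > 1 ? 'Counties' : 'County'}"
           else
             names
           end
         end
end

# Parents in an arbitrary order, as a lookup might return them.
parents = [
  Parent.new('Platte', 'ADM2'), Parent.new('Kaw Township', 'ADM3'),
  Parent.new('Cass', 'ADM2'), Parent.new('Jackson', 'ADM2'),
  Parent.new('Missouri', 'ADM1'), Parent.new('Clay', 'ADM2')
]

puts (['Kansas City'] + grouped_segments(parents) + ['United States']).join(' >> ')
```

Because the grouping happens before the sort, the result does not depend on the order in which the parents come back.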

wickr commented 1 year ago

QA Pass. Tried the same locations on staging and it's looking good.

[Screenshot attached]

carakey commented 1 year ago

Getting instances of the same problem with Geonames, only with a river instead of a city, with multiple administrative regions in Italy as parents in the hierarchy. FWIW these regions have Geonames code ADM1 (first-order administrative division) where the counties have code ADM2 (second-order administrative division), and the river has code STM (stream).

Po River (https://sws.geonames.org/3170550/) occurs in 9 works currently in review in the International Freshwater Treaties Collection and has 5 different versions in the facet list.

KevinJonesMeta commented 1 year ago

To add to the comment above on the Po River entries: the additional ADM1 regions included are listed in GeoNames as parents outside the administrative hierarchy. Can the system ignore that parents list and use only the administrative hierarchy?

Example Works:

https://oregondigital.org/concern/documents/gb19fn88t
https://oregondigital.org/concern/documents/gb19fn052
https://oregondigital.org/concern/generics/9c67wr825
https://oregondigital.org/concern/documents/9c67x064v
https://oregondigital.org/concern/documents/gb19fn90v
https://oregondigital.org/concern/documents/gb19fm94r