bcgov / ckanext-bcgov

BC Data Catalogue source code, main ckan extension
http://catalogue.data.gov.bc.ca
GNU Affero General Public License v3.0

resources for OFI records are incorrectly all caps format. Should be lowercase #386

Mbrownshoes closed 5 years ago

Mbrownshoes commented 6 years ago

Resource format types should be lowercase, but the OFI automatic resource creation is making them ALL CAPS. Example: https://catalogue.data.gov.bc.ca/api/3/action/package_show?id=6aa12006-5c23-4e3a-9145-29454b9ed493 This leads to duplicate formats (there are two 'csv' entries visible in the format search).

[Screenshot: 2017-12-20 at 11:46:32 am]

@dkelsey

garrettH3S commented 6 years ago

@ll911 Running these database scripts will fix the environment:

UPDATE resource_revision SET format = LOWER(format);
UPDATE resource SET format = LOWER(format);

This requires Solr to be re-indexed. I'm not sure how these formats were entered into the database with different cases, since the data entry point for format information is a dropdown.

[Screenshot: 2018-01-05 at 1:49:30 pm]

Has this recently been changed? Is there another way to add format information into the database?
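The two UPDATE statements in this comment can be tried on a throwaway database before touching a real environment. The following is a minimal sketch, not the script that was actually run: it uses an in-memory SQLite database as a stand-in for the catalogue's database, the two-column tables are a simplification of the real `resource` and `resource_revision` tables, and the `WHERE` clause is an addition (not in the thread) so that only rows that actually change are counted. The function names `lowercase_formats` and `demo` are hypothetical.

```python
import sqlite3


def lowercase_formats(conn):
    """Lowercase the format column in both tables; return the number of rows changed."""
    cur = conn.cursor()
    changed = 0
    # Table names come from a fixed tuple, so interpolating them is safe here.
    for table in ("resource_revision", "resource"):
        cur.execute(
            f"UPDATE {table} SET format = LOWER(format) "
            "WHERE format <> LOWER(format)"
        )
        changed += cur.rowcount
    conn.commit()
    return changed


def demo():
    """Build simplified stand-in tables, run the fix, and report the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE resource (id TEXT, format TEXT)")
    conn.execute("CREATE TABLE resource_revision (id TEXT, format TEXT)")
    rows = [("a", "CSV"), ("b", "csv"), ("c", "WMS")]
    conn.executemany("INSERT INTO resource VALUES (?, ?)", rows)
    conn.executemany("INSERT INTO resource_revision VALUES (?, ?)", rows)

    changed = lowercase_formats(conn)
    formats = [
        r[0]
        for r in conn.execute("SELECT DISTINCT format FROM resource ORDER BY format")
    ]
    conn.close()
    return changed, formats
```

After the update, the mixed-case duplicates collapse into a single lowercase value per format, which is the state the search index would need to be rebuilt from.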

ll911 commented 6 years ago

they were added via api, not sql insert

dkelsey commented 6 years ago

@garrettH3S this is marked as Code Complete. Is it fixed?

garrettH3S commented 6 years ago

@ll911 Have you run that script in CAD?

ll911 commented 6 years ago

@dkelsey yes, you can see the config file to get the cad disqus id

ll911 commented 6 years ago

@garrettH3S your group has full access in CAD.

dkelsey commented 6 years ago

@ll911 when I navigate to the Disqus URL for CAD I get "Whoops, you're not allowed on that page."

dkelsey commented 6 years ago

I'm confused about the Disqus comments. Why are they here?

@dkelsey Because I made a comment about the Disqus API before, and we started responding here. I must have forgotten to remove your comment.

dkelsey commented 6 years ago

I re-indexed in CAD and it didn't make a difference.

dkelsey commented 6 years ago

I used the API to update the metadata for resources that had the wrong case, setting all to lower case. I re-indexed. This resolved the 'duplicate formats' issue.
This exposed another issue #478.

In addition, I encountered a dataset that had an incomplete schema. Because of this, Solr failed to update the index correctly for this record. I added the missing schema elements through the API, which resolved this issue. I suspect the other issue is due to a problem Solr has with the schema.

I looked in PROD and I see that there are resources with incorrect formats. We can update those at any time.
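The API-based cleanup described in this comment might look roughly like the sketch below. The thread does not say which API actions were used, so this is an assumption: it reads a dataset with CKAN's `package_show` action and lowercases each offending resource with `resource_patch`. The helper names (`format_patches`, `call_action`, `fix_package`) and the `base_url`, `package_id`, and `api_key` parameters are hypothetical, and a Solr re-index would still be needed afterwards.

```python
import json
import urllib.request


def format_patches(package):
    """Return resource_patch payloads for resources whose format is not lowercase."""
    patches = []
    for res in package.get("resources", []):
        fmt = res.get("format") or ""  # treat a missing or null format as empty
        if fmt != fmt.lower():
            patches.append({"id": res["id"], "format": fmt.lower()})
    return patches


def call_action(base_url, action, payload, api_key):
    """POST a JSON payload to a CKAN action endpoint and return its 'result'."""
    req = urllib.request.Request(
        f"{base_url}/api/3/action/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["result"]


def fix_package(base_url, package_id, api_key):
    """Lowercase every non-lowercase resource format in one dataset."""
    package = call_action(base_url, "package_show", {"id": package_id}, api_key)
    for patch in format_patches(package):
        call_action(base_url, "resource_patch", patch, api_key)
```

Keeping `format_patches` free of network calls makes it possible to review the planned changes for a dataset before anything is written back.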

dkelsey commented 6 years ago

Ran the script in PROD. I think a re-index is required @ll911

dkelsey commented 6 years ago

There are still 2 groupings of CSV facets.

hleckenb commented 5 years ago

@jachurchill can you look into why two CSV facets are showing up? Both are in lowercase, yet they appear as separate entries. Please identify the reason.
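One way to start that investigation is to check whether the two lowercase values are really identical strings. The thread never states the cause, so the following is only a diagnostic sketch under the assumption that stray whitespace or a similar hidden difference might be involved; `lookalike_formats` is a hypothetical helper that takes a list of raw format strings (for example, collected from resource records) and groups the ones that only look the same.

```python
from collections import defaultdict


def lookalike_formats(formats):
    """Group raw format strings that collapse to the same normalized value."""
    groups = defaultdict(set)
    for raw in formats:
        # Normalize by trimming surrounding whitespace and folding case.
        groups[raw.strip().casefold()].add(raw)
    # Keep only normalized values backed by more than one distinct raw string.
    return {key: sorted(raws) for key, raws in groups.items() if len(raws) > 1}
```

Printing the result with `repr()` makes trailing spaces visible. An empty result would suggest the difference lies elsewhere, such as in the index itself.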

jachurchill commented 5 years ago

This can probably be closed as it will be resolved with #524