Closed jeff-at-h3 closed 6 years ago
There is a field that shows up in the Additional Information section of OFI Resources:
The field is resource_storage_location under resource.
@jeff-at-h3 SOURCE_DATABASE is intended to describe whether the object is stored in the BCGW or in the NRS Operational DB.
I quickly poked around and found that resource_storage_location is coming from resource.extras.
I don't think a new column is necessary.
@dkelsey Just to clarify: within the form, the resource_storage_location field's options need to be updated (see above), and no modification to the schema is necessary.
@dkelsey Does this look correct to you?
Confirmed Resource Storage Location implemented in CAD environment. Soliciting feedback from NRS stakeholders concerning list items. List items will have acronyms spelled out and optionally followed by popular acronym in brackets, e.g., BC Geographic Warehouse (BCGW). Will follow up with stakeholder feedback.
For this release, the domain values for the pick list need to be changed to the following, in order of appearance:
"Unspecified" should be the default value.
Please advise when ready to QA.
Also, impact assessment of removing unneeded locations has not been done.
"SDE" and "SDO" would be changed to "Ministry or other database" unless there is clear indication that the BCGW is the location.
"GeoDB" would be changed to "File system".
"X-Y", "Converge" and "External" would change to "Unspecified" unless there are indicators in the record content to help make a different choice.
I am unsure if there are dependencies on the legacy values planned to be removed and unsure if removing them will cause application issues if legacy values are not recast. Advice please.
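The recasting rules described above can be sketched as a lookup table. This is only an illustration of the proposal in this comment: the function name and the `reviewed_value` parameter are hypothetical, and the "unless there are indicators" cases are modelled as a manual override because the thread does not define how those indicators would be detected.

```python
# Proposed replacements for the legacy pick-list values, taken from the
# rules stated in this comment. Values not listed here are left unchanged.
LEGACY_TO_NEW = {
    "SDE": "Ministry or other database",
    "SDO": "Ministry or other database",
    "GeoDB": "File system",
    "X-Y": "Unspecified",
    "Converge": "Unspecified",
    "External": "Unspecified",
}


def recast_storage_location(legacy_value, reviewed_value=None):
    """Return the new storage location for a legacy value.

    reviewed_value is a hypothetical manual override for records where a
    reviewer found a clear indication of a better choice (for example,
    that the BCGW is the actual location).
    """
    if reviewed_value is not None:
        return reviewed_value
    # Fall back to the original value when it is not a legacy value.
    return LEGACY_TO_NEW.get(legacy_value, legacy_value)
```

Records that still carry a value outside both the legacy table and the new list would pass through unchanged, so a migration script built on this would still need a report of leftover values.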
@cnewallbcgov It's easy enough to change the form's select values and to map the values over within the code and database. In testing the api, I found that no validation is done when setting the resource_storage_location; thus, any external system adding resources to a dataset via the api will not be given an error when adding. Such systems will keep adding resource_storage_location values that are no longer within the list.
Steps to migrate these data types:
1. Update system vocabulary.
2. Update hardcoded storage location regions within ckanext-bcgov.
3. Migrate database values (run db scripts).
4. Add an api validation layer to validate resource_storage_location input (api call is /dataset/new_resource/{id}).
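Step 4 could look roughly like the following. This is a minimal sketch, not the ckanext-bcgov implementation: the function and exception names are hypothetical, the allowed values are placeholders drawn from values mentioned in this thread rather than the approved list, and how the check would be wired into the /dataset/new_resource/{id} call is not shown.

```python
# Placeholder allowed values (an assumption); in practice these would be
# read from the system vocabulary rather than hardcoded.
ALLOWED_STORAGE_LOCATIONS = frozenset([
    "Unspecified",
    "BC Geographic Warehouse",
    "Ministry or other database",
    "File system",
])


class InvalidStorageLocation(ValueError):
    """Raised when a resource carries a storage location outside the list."""


def validate_storage_location(resource):
    """Return the resource dict unchanged if its storage location is allowed.

    Raises InvalidStorageLocation otherwise, so the api caller gets an
    error instead of silently storing an off-list value.
    """
    value = resource.get("resource_storage_location")
    if value not in ALLOWED_STORAGE_LOCATIONS:
        raise InvalidStorageLocation(
            "resource_storage_location %r is not in the allowed list" % (value,)
        )
    return resource
```

With a check like this in place, an external system posting a legacy value such as "SDE" would receive an error rather than adding a value that is no longer within the list.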
This is what is in CAT:
There are a couple of corrections to make, some things I think should be removed and I have questions:
location | correction |
---|---|
BCGW Data Store | BC Geographic Warehouse |
Converge | What is this? |
EDC Data Store | This is problematic. Some files will be stored in the FileStore, others in the DataStore. The catalogue does not store any data; it only manages metadata. |
External | ? |
GeoDB | How is this a location? |
SDE | How is this a location? Isn't this SPATIAL_DATA_TYPE ? |
SDO | ibid |
X-Y | I don't know what this is. |
pub.data.gov.bc.ca | How about DataBC hosted? External could represent anyone else |
@cnewallbcgov let's clarify the list.
I specified the selection values in an earlier comment. The CAT environment should reflect these values else this is a bug.
ArcGIS Online should be accommodated - a new value "Esri ArcGIS Online" placed on the list ahead of FTP site.
Thank you
Hi Colin,
Jared will look into this issue today and figure out the gaps. Sorry for any confusion on this one.
Jeff
@cnewallbcgov @dkelsey there seems to have been a misstep in the issue process with this ticket: it was in New Issues and not in Review/QA, so it should not have been migrated to TESTING.
@garrettH3S had some concerns regarding the API that need your approval, @cnewallbcgov, because additional work is needed on the API to validate the input of the resource_storage_location field. For example, any user of the API can input any text they wish, even text that does not match the list in the web ui.
Re. Garrett's concerns: is this validation issue introduced by these specific changes to the pick list, or has it always been a risk that exists in the Production environment today?
If it is not a new issue then please make the changes, as we have previously accepted the risk. A separate issue ticket should be created to describe the opportunity to improve data validation via the API. That work would not be in scope for 1.7.0.
Please confirm. Thanks
@cnewallbcgov for this field, no validation currently exists in any environment. I will create a new issue for the validation.
As for the list value, I would just like to confirm with you what should be present as per yours and @dkelsey's comments, in order & as-displayed:
default is Unspecified
I'd prefer Catalogue Data Store be changed to Resource Store. Only CSVs are stored in the DataStore. Users are not going to know the difference between when certain things are in the File Store and others are in both the File Store and the Data Store.
Further, a problem already exists where people think the catalogue stores data. It does not. The catalogue only stores metadata. The Data Store and File Store are things that add the capability to store small files. Adding 'Catalogue' contributes to user cognitive load and confusion.
For BC Geographic Warehouse: for now, editors add the Object Name to the metadata record. The associated resources are not configured directly by the user; a widget is run on their behalf. The widget will add a resource that is a "custom download" and set source_database at that time. Said another way: no one will ever manually set source database to BC Geographic Warehouse. I think it should be removed from the list. It would be set only by the widget that creates the "custom download".
@dkelsey, is this awaiting approval by @cnewallbcgov before we start any work on it?
@jeff-at-h3 yes.
Here is the approved list in order of appearance:
default is Unspecified
Thanks
@cnewallbcgov I need to change 'BC Geographic Warehouse (BCGW)' to 'BC Geographic Warehouse BCGW' because ckan only allows alphanumeric characters and these symbols: - _ .
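The character restriction described here can be checked before a value is added to the vocabulary. The sketch below is an assumption based only on this comment, not ckan's own validator: the function name is hypothetical, and spaces are treated as allowed because the proposed replacement value contains them.

```python
import re

# Alphanumerics, spaces, and the symbols - _ . as described in the comment.
# This pattern is an assumption; it is not copied from ckan.
_ALLOWED_LABEL = re.compile(r"^[A-Za-z0-9 _.\-]+$")


def is_valid_vocab_label(label):
    """Return True if the label uses only the permitted characters."""
    return bool(_ALLOWED_LABEL.match(label))
```

Under this rule the bracketed form 'BC Geographic Warehouse (BCGW)' is rejected because of the parentheses, while 'BC Geographic Warehouse BCGW' and 'BC Geographic Warehouse' both pass.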
In that case let's omit "BCGW". Thanks
Editors can add resources either by uploading them or by specifying a Link.
This example is a bit pedantic; however, do we want to add remote url to the list? @cnewallbcgov
What is the intention of unspecified?
Re. adding "remote URL". How about modifying "FTP Site" to be "Web or FTP Site" ?
Re. "Unspecified" - default value, better than null/blank.
"Web or FTP Site" good enough for me. So when you stated "default is unspecified" you meant "The default value is the string 'unspecified'" and not "the default value is unspecified or blank or the empty string"
I was thinking the value would be "Unspecified" instead of blank.
@cnewallbcgov Just to clarify what needs to be changed,
Is this correct?
Not quite. No need to add "Remote URL".
@dkelsey I see the issue you mentioned about blank values for SOURCE_DATABASE. It looks like the field was using the ignore_missing validator; I'll try applying the not_empty validator instead.
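The difference between the two validators can be illustrated with simplified stand-ins. These are not ckan's implementations (the real validators take key, data, errors and context arguments); the names ending in `_like` and the `MISSING` sentinel are hypothetical and only model the behaviour relevant to blank values.

```python
# Sentinel standing in for "the field was not supplied at all".
MISSING = object()


def ignore_missing_like(value):
    """Accept an absent value: nothing is stored and no error is raised."""
    if value is MISSING or value is None:
        return None
    return value


def not_empty_like(value):
    """Reject an absent or empty value instead of letting it through."""
    if value is MISSING or not value:
        raise ValueError("Missing value")
    return value
```

With the first stand-in a resource submitted without a storage location is accepted with no value, which matches the blank values reported; with the second, the same submission is rejected. Neither one checks that a non-empty value is actually in the pick list.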
Updated the list; the vocab list will need to be updated in cad.
As for the validators: delivered and verified in cad. They prevent resource_storage_location from being empty; however, through the api, a value that is not in the edc_vocab list (the one usually displayed in the webui) can still be stored.
I've run the script that updates the vocab.
I've updated the vocabulary in PROD. BCGW Data Store and EDC Data Store were removed.
Need to be able to identify the source of the data, e.g., BCGW, OSDB
Need to re-add the UI
Needs more technical analysis. Next steps: 1-2 hours to investigate whether the changes are purely programmatic or whether there are other implications.