usap-dc-dev / usap-dc-website

Repository for the usap-dc website. Includes a JavaScript client-side app and a Python/Flask server side.

update dataset ingestion SQL code to add data format to project_dataset table #36

Closed (fnitsche2001 closed this issue 10 months ago)

fnitsche2001 commented 11 months ago

Currently, the dataset SQL in the curator tools has an INSERT command for the project_dataset table, but the INSERT does not include the data format, which has to be added afterwards by manually editing the table. It would be good to update the SQL template to include the format description. (Note that the curator might still need to edit this value manually in the SQL text, but having it in the template would make that easier.) Alternatively, maybe add an optional UPDATE command for the project_dataset table for this.

Also, when updating the dataset SQL template, you can remove or change the line: UPDATE dataset SET review_person='Nitsche' WHERE id='60xxxx'; It automatically defaults to 'Nitsche', which is a relic of an earlier workflow. The best option would be to replace this with the ORCID of the curator who is initiating the SQL script (or we just delete this line, since we also capture the review in the Fairscore table).
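For illustration, here is a minimal sketch of what the requested template change could look like. This is not the repository's actual code: the function names, the project_dataset column names (dataset_id, project_id, data_format), and the example IDs are all assumptions. Only the dataset.review_person and dataset.id columns are taken from the existing template line quoted above.

```python
def sql_literal(value):
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def project_dataset_sql(dataset_id, project_id, data_format="CHANGEME", curator_orcid=None):
    """Build the project_dataset part of the curator SQL text (hypothetical helper).

    The project_dataset column names are assumed for illustration.
    data_format defaults to a placeholder the curator can edit by hand.
    """
    statements = [
        "INSERT INTO project_dataset (dataset_id, project_id, data_format) "
        f"VALUES ({sql_literal(dataset_id)}, {sql_literal(project_id)}, {sql_literal(data_format)});"
    ]
    # Only emit the review_person line when the curator's ORCID is known,
    # instead of hard-coding a default name.
    if curator_orcid:
        statements.append(
            f"UPDATE dataset SET review_person={sql_literal(curator_orcid)} "
            f"WHERE id={sql_literal(dataset_id)};"
        )
    return "\n".join(statements)


# Made-up IDs and ORCID, for demonstration only.
print(project_dataset_sql("60xxxx", "p0000000", "NetCDF", "0000-0000-0000-0000"))
```

Because the output is SQL text that a curator reviews and edits before running, the sketch builds strings rather than executing anything. Code that runs statements directly against the database should use parameterized queries instead.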

astrong-ldeo commented 10 months ago

Since the file names (most notably, their extensions) are stored in the JSON, if there's a list or dictionary of common file formats that our code can access, this shouldn't be very difficult. For now I've added a placeholder value "CHANGEME".

astrong-ldeo commented 10 months ago

Now in production with pull request #43. The code gets the file type from the file name extension, using our internal table that maps extensions to file type names.
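A rough sketch of that kind of lookup is below. It is not the code from the pull request: the FILE_TYPES dictionary is a hypothetical stand-in for the internal table, and both function names are invented here. Unknown extensions fall back to the "CHANGEME" placeholder mentioned earlier in this thread.

```python
import os

# Hypothetical stand-in for the internal extension table; the real
# entries and type names may differ.
FILE_TYPES = {
    ".csv": "Comma-Separated Values",
    ".nc": "NetCDF",
    ".tif": "TIFF",
    ".txt": "Plain Text",
}

PLACEHOLDER = "CHANGEME"  # left in the SQL for the curator to edit by hand


def format_from_filename(filename):
    """Return the file type name for a file, based on its extension."""
    ext = os.path.splitext(filename)[1].lower()
    return FILE_TYPES.get(ext, PLACEHOLDER)


def dataset_formats(filenames):
    """Return the distinct file type names for a dataset, in first-seen order."""
    seen = []
    for name in filenames:
        fmt = format_from_filename(name)
        if fmt not in seen:
            seen.append(fmt)
    return seen


print(dataset_formats(["ctd_profile.nc", "stations.csv", "notes.CSV", "README"]))
```

Lowercasing the extension lets "notes.CSV" and "stations.csv" resolve to the same type. A file without a recognized extension produces the placeholder rather than an error, so the curator still sees that something needs manual attention.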