There are two bulk metadata updates at the project level that we'd like to do.
Reasoning
NRES addition in all open access datasets
After the introduction of managed access datasets in the portal, we would like to add the data_use_restriction field to the metadata of all open access projects, i.e. all projects in the portal that were not covered by the previous bulk update in #1270. This would require bumping the project schema version to 19.0.0 and adding the field "data_use_restriction": "NRES" to the project metadata.
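As a rough illustration of the change per project, here is a minimal sketch in Python. It assumes the project metadata is available as a plain dict and that the schema version is encoded in a describedBy URL ending in /<version>/project. Both the function name add_nres and that URL layout are assumptions for illustration, not something confirmed in this ticket.

```python
import copy
import re

TARGET_VERSION = "19.0.0"  # project schema version named in this ticket


def add_nres(project_content: dict) -> dict:
    """Return a copy of the project metadata with NRES added and the schema version bumped.

    Hypothetical helper: the describedBy layout is an assumption.
    """
    updated = copy.deepcopy(project_content)
    # Keep any value that is already set, so earlier updates are not overwritten.
    updated.setdefault("data_use_restriction", "NRES")
    # Assumed layout: .../type/project/<major.minor.patch>/project
    updated["describedBy"] = re.sub(
        r"/\d+\.\d+\.\d+/project$",
        f"/{TARGET_VERSION}/project",
        updated.get("describedBy", ""),
    )
    return updated
```

The function works on a copy, so the original metadata can still be compared against the result before anything is sent to ingest.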
Bionetwork backfilling
Dave asked us to add the bionetwork information in the schema, since the portal has started showing the biological network on the front page by default. There are a few open questions here.
a. What is the authoritative list of bionetworks? Is it the tracker?
b. What is the authoritative list of atlas names? In the tracker some atlas names are initials (e.g. MSK 1.0 or ORCF 1.0). Do we want to add these names?
c. Projects in the portal with no bionetwork: would we like to show None instead of unspecified?
Plan
Since both metadata fields live at the project level, we would like to update them using @idazucchi's script, which exports only project metadata (we don't have to update the state to graph valid, just return it to exported). The steps would be:
1. Select projects (uuids) that need the NRES update
2. Select projects (uuids) that need a bionetwork update, and the appropriate bionetwork(s)
3. Select projects (uuids) that need an atlas name & version update, and the appropriate atlas name(s) & version(s)
4. Write a script that applies these updates via API calls to ingest
5. Export project metadata via Ida's script
6. Send the bulk import form to Travis
Steps 1, 2 and 3 can be done via the Task tracker spreadsheet
Step 4: the script is almost ready from the previous bulk update in #1270 (see comments for script); a few modifications might be needed
Step 5: if we provide uuids to the script it runs quickly
Step 6: we can also extract the project title in order to populate the import form easily
Estimated time needed ~2 days
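The selection of uuids from the tracker could be sketched as below, assuming the Task tracker spreadsheet is exported to CSV. The column names (project_uuid, project_title, access_type, bionetwork), the ";" separator for multiple bionetworks and the function name select_projects are all placeholders; the real tracker columns may differ.

```python
import csv
import io


def select_projects(tracker_csv: str, already_updated: set):
    """Pick the uuids that need each update from a CSV export of the tracker.

    All column names are hypothetical placeholders.
    """
    nres_uuids = []   # open access projects still missing NRES
    bionetworks = {}  # uuid -> list of bionetwork names to backfill
    titles = {}       # uuid -> project title, for the import form
    for row in csv.DictReader(io.StringIO(tracker_csv)):
        uuid = row["project_uuid"].strip()
        if not uuid:
            continue  # skip tracker rows without a project uuid
        titles[uuid] = row["project_title"].strip()
        is_open = row["access_type"].strip().lower() == "open"
        if is_open and uuid not in already_updated:
            nres_uuids.append(uuid)
        networks = [n.strip() for n in row["bionetwork"].split(";") if n.strip()]
        if networks:
            bionetworks[uuid] = networks
    return nres_uuids, bionetworks, titles
```

Passing the uuids handled in #1270 as already_updated keeps those projects out of the NRES list, and the titles mapping can be reused when filling in the import form.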
Risks
Information in the tracker is not up to date
Mitigation: we will update the project, or re-run this script for bulk updates, in a later release
An old project gets an error in import validation
Mitigation: drop the project from the current release & investigate how we can re-export to avoid errors
Mitigation: ask the import team to re-populate the staging area with the reverse-import script & try again