humlab-sead / sead_change_control

Sane SEAD change control using Sqitch.

aDNA pilot project data import 20241114 #329

Open MattiasSealander opened 1 week ago

MattiasSealander commented 1 week ago

This is the data prepared for import as part of SciLifeLab's pilot project. Lookup tables are colored purple; data tables are not colored. The original data tables that were used as a basis are left in and colored yellow.

The first attempt will be to import using the local system_id for the lookup data. This means that some tables, such as tbl_site_locations, contain references to tbl_locations (for instance) that use either a SEAD lookup ID or a local system_id. I believe there should be no overlap in these cases, i.e. no local system_id that equals a SEAD lookup ID in the same table and field. New lookup data are left in the file, so one way to resolve a reference could be to check for the ID there first and, if it is not found, look for it among SEAD's lookup IDs.
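The resolution order described above could be sketched roughly as follows. This is only an illustration, not the actual import code: the function names, the ID values, and the use of plain Python sets are all assumptions made for the example, and nothing here is read from the Excel file.

```python
def find_overlapping_ids(local_ids, sead_ids):
    """Return IDs present in both sets; any such ID would be ambiguous."""
    return set(local_ids) & set(sead_ids)


def resolve_location_ref(ref_id, local_ids, sead_ids):
    """Resolve a reference: new (local) lookups first, then SEAD lookups."""
    if ref_id in local_ids:
        return ("local", ref_id)
    if ref_id in sead_ids:
        return ("sead", ref_id)
    raise KeyError(f"reference {ref_id!r} not found in local or SEAD lookups")


# Hypothetical example values, not taken from the import file.
local_location_ids = {900001, 900002}  # local system_id values (assumed)
sead_location_ids = {205, 1033}        # existing SEAD lookup IDs (assumed)

# Check the "no overlap" assumption before resolving anything.
overlap = find_overlapping_ids(local_location_ids, sead_location_ids)
if overlap:
    raise ValueError(f"ambiguous IDs in both lookups: {sorted(overlap)}")

resolved = resolve_location_ref(205, local_location_ids, sead_location_ids)
```

Running the overlap check first matters because the local-first order would otherwise silently pick the local lookup for any ID that exists in both places.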

I can prepare a separate lookup file for the aDNA data, as I discovered new lookups that needed to be added on top of the last uploaded version.

Still waiting for a response from SciLifeLab on whether methods metadata and data repository links will differ between samples and libraries. This affects where they are stored in SEAD: if they are the same for samples and libraries, they can possibly be added as sample descriptions rather than analysis values. Otherwise they need to be analysis values, since libraries are distinguished at the dataset level. We don't have any links yet, though, so they are not included in this data import.

Double-check naming of fields in new analysis_values tables to make sure that they are consistent with what has been implemented.

Data SEAD_aDNA_data_20241114.xlsx