Closed fedorov closed 2 years ago
So this would be a structured record in the clinical_meta table with these 3 columns (collection_id, clinical_table_id, description).
> this would be a structured record in the clinical_meta table
@G-White-ISB you probably meant to say clinical_meta_column table, right?
Can we come up with names that better reflect the content of those tables? Maybe "clinical_meta" can be "table_metadata", and "clinical_meta_column" can be "column_metadata"? I am not saying those are great names, but they may be a bit less confusing.
My comment above was made when there was just the clinical_meta table. clinical_meta_table and clinical_meta_column were invented later. But I'm fine with your recommended name changes.
We now have table_metadata and column_metadata tables. I suggest we close this issue.
Current organization of tables has 2 components:
However, in the general case, 1) we will have more than one clinical table per collection (with different schemas); 2) we will at least sometimes need to communicate a description of a specific table (table-level metadata).
Examples are the NLST collection and ACRIN clinical tables.
I suggest we introduce another level for organization that has the following columns (we could call it `clinical_data_inventory` or something like that?):
This follows the approach implemented for ACRIN in https://github.com/fedorov/idc-clinical-cleanup, with the result in https://console.cloud.google.com/bigquery?p=idc-tcia&d=af_clinical_sandbox&page=dataset. There, those table-level metadata attributes are organized in tables per-collection (`<collection_id>_dict`), but we might as well put it all into a single table.
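
To make the single-table idea concrete, here is a minimal sketch, not the actual IDC schema. It assumes the three columns mentioned in the discussion (collection_id, clinical_table_id, description) and uses an in-memory SQLite database purely as a stand-in for BigQuery. The helper names `build_inventory` and `tables_for_collection`, the composite primary key, and all example rows are hypothetical.

```python
import sqlite3

# Hypothetical example rows: table ids and descriptions are placeholders,
# not real IDC table names.
EXAMPLE_ROWS = [
    ("nlst", "nlst_table_a", "Placeholder description of a first NLST table"),
    ("nlst", "nlst_table_b", "Placeholder description of a second NLST table"),
    ("acrin_example", "acrin_example_table", "Placeholder description"),
]


def build_inventory(rows):
    """Create one inventory table holding table-level metadata for all collections."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE clinical_data_inventory (
            collection_id     TEXT NOT NULL,
            clinical_table_id TEXT NOT NULL,
            description       TEXT,
            -- Assumption: a table id is unique within a collection.
            PRIMARY KEY (collection_id, clinical_table_id)
        )
        """
    )
    conn.executemany(
        "INSERT INTO clinical_data_inventory VALUES (?, ?, ?)", rows
    )
    return conn


def tables_for_collection(conn, collection_id):
    """Return (clinical_table_id, description) pairs for one collection."""
    cur = conn.execute(
        "SELECT clinical_table_id, description "
        "FROM clinical_data_inventory "
        "WHERE collection_id = ? "
        "ORDER BY clinical_table_id",
        (collection_id,),
    )
    return cur.fetchall()
```

With this layout, a collection that has several clinical tables with different schemas simply contributes several rows, so no per-collection `<collection_id>_dict` table is needed to look up table descriptions.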