It's becoming quite limiting that we can't easily access the biosample metadata from the sample or assembly tables in Terra using the current data model that our WDLs create. We should pursue one of the following solutions (or something like it):
demux_deplete populates the sample table with columns from biosample_attributes_tsv
demux_deplete populates the sample table with a JSON object containing all the biosample attributes, taken only from the row of the TSV that corresponds to this sample
demux_deplete emits a TSV output file that is a slightly transformed version of the biosample_attributes_tsv. The main difference is that it contains one more column, corresponding to the sample_id of the sample table (i.e., the "sanitized" sample name, with dashes and underscores, and with any slashes or spaces from the real/original sample name removed). Currently the original biosample_attributes_tsv only has the unsanitized, external-facing sample ID. The user can then simply use terra_tsv_to_table to update the sample table themselves. This would require updating terra_tsv_to_table to accept an arbitrary column as the index column, rewriting that column's header with the requisite entity: prefix on the fly.
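To make the third option concrete, here is a minimal Python sketch of the transformation it describes: prepend a sanitized ID column to the biosample attributes TSV and give it the `entity:sample_id` header that Terra table uploads expect. Everything named here is an assumption for illustration. The function names are hypothetical, the source column is assumed to be `sample_name`, and the sanitization rule (replace each run of characters other than letters, digits, dashes, and underscores with a single underscore) is only a stand-in for whatever demux_deplete actually does. It is not the existing behavior of demux_deplete or terra_tsv_to_table.

```python
import csv
import io
import re


def sanitize_sample_name(name: str) -> str:
    """Hypothetical sanitizer; the real demux_deplete rule may differ.

    Assumption: each run of characters other than letters, digits,
    dashes, and underscores is replaced with a single underscore.
    """
    return re.sub(r"[^A-Za-z0-9_-]+", "_", name.strip())


def add_entity_id_column(tsv_text: str,
                         name_column: str = "sample_name",
                         table: str = "sample") -> str:
    """Return a copy of the TSV with a leading 'entity:<table>_id' column.

    name_column is an assumption about which column of the biosample
    attributes TSV holds the original, external-facing sample name.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fieldnames = list(reader.fieldnames or [])
    if name_column not in fieldnames:
        raise ValueError(f"column {name_column!r} not found in TSV header")

    id_header = f"entity:{table}_id"
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=[id_header] + fieldnames,
        delimiter="\t",
        lineterminator="\n",
        extrasaction="ignore",  # drop stray cells beyond the header width
    )
    writer.writeheader()
    for row in reader:
        # The new first column carries the sanitized ID; all original
        # columns, including the unsanitized name, are kept unchanged.
        row[id_header] = sanitize_sample_name(row[name_column])
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    example = "sample_name\torganism\nA/B 1\tSARS-CoV-2\n"
    print(add_entity_id_column(example), end="")
```

Keeping the original name column next to the sanitized ID means the resulting table can still be joined back to the external-facing identifiers.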