Closed hehouts closed 2 years ago
This is described in chunk s1 in the metadata summary.
Here I am frantically looking, with ctrl-f, for the column that is matching 76C9Y. I filtered to external_id HSM6XRR7 and reduced the columns to character columns only:
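library(dplyr)  # filter(), select(), where(), %>%
library(purrr)  # map_lgl()
# Keep the sample of interest and only its character columns
df <- ibdmdb_mvx_only %>%
  filter(external_id == "HSM6XRR7") %>%
  select(where(is.character))
# Drop columns that are entirely NA
df2 <- df[!map_lgl(df, ~ all(is.na(.)))]
df2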
df2 only has 131 columns, so I just clicked through them.
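Clicking through the columns worked, but the same search could probably be done in code. Below is a rough sketch in base R, assuming the metadata is an ordinary data frame. The helper name find_matching_columns, the toy table meta, and the EXAMPLE_ID and n_reads values are made up for illustration; only HSM6XRR7, SM-76C9Y and the column names external_id and tube_a_viromics come from this issue.

```r
# Hypothetical helper: names of the character columns in which at least one
# value contains `pattern` as a literal substring.
find_matching_columns <- function(df, pattern) {
  is_chr <- vapply(df, is.character, logical(1))
  hits <- vapply(
    df[is_chr],
    function(col) any(grepl(pattern, col, fixed = TRUE)),  # grepl() gives FALSE for NA
    logical(1)
  )
  names(hits)[hits]
}

# Toy stand-in for ibdmdb_mvx_only (layout assumed, second row is a placeholder)
meta <- data.frame(
  external_id     = c("HSM6XRR7", "EXAMPLE_ID"),
  tube_a_viromics = c("SM-76C9Y", NA),
  n_reads         = c(10, 20),
  stringsAsFactors = FALSE
)

find_matching_columns(meta, "76C9Y")
# expected for this toy table: "tube_a_viromics"
```

On the real table the call would presumably be find_matching_columns(ibdmdb_mvx_only, "76C9Y"), which should list every character column containing that suffix.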
The column is tube_a_viromics!!!! Its value matches the exact virome filename: SM-76C9Y
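If tube_a_viromics really holds the virome filename minus the .tar extension, the join could look roughly like the sketch below. This is base R with toy data: the virome_files and meta tables are assumptions for illustration, and only the HSM6XRR7 / SM-76C9Y pairing is taken from this issue, so the other two files are deliberately left unmatched.

```r
# Virome filenames as listed in this issue
virome_files <- data.frame(
  file = c("SM-76C9Y.tar", "SM-9SIJC.tar", "SM-7M8RR.tar"),
  stringsAsFactors = FALSE
)

# Strip the ".tar" extension to build the join key
virome_files$tube_a_viromics <- sub("\\.tar$", "", virome_files$file)

# Toy stand-in for the metadata table (a single known row, layout assumed)
meta <- data.frame(
  external_id     = "HSM6XRR7",
  tube_a_viromics = "SM-76C9Y",
  stringsAsFactors = FALSE
)

# Left join: keep every virome file, attach metadata where the key matches
joined <- merge(virome_files, meta, by = "tube_a_viromics", all.x = TRUE)
joined
```

Files whose key has no match in the metadata end up with NA in external_id, which makes it easy to spot anything that still does not join.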
Looking at these "tube" columns might be important later.
library(tidyr)  # drop_na()
# Character "tube" columns, keeping only rows with no missing values
ibdmdb %>%
  select(contains("tube")) %>%
  select(where(is.character)) %>%
  drop_na()
These are the columns whose names contain the string "tube":
"serum_tube_number_1_received_at_csmc"
"serum_tubes_number_2_4_received_at_mgh"
"tube_a_dna_rna"
"tube_a_metabolomics"
"tube_a_storage"
"tube_a_viromics"
"tube_b_fecal_calprotectin"
"tube_b_proteomics"
"stool_sample_id_tube_a_et_oh"
"sample_id_tube_b_no_preservative"
"tube_a_and_b_received_at_broad"
This is described in chunk s1 in the metadata summary Rmd.
Part of issue #2
Virome files look like: SM-76C9Y.tar, SM-9SIJC.tar, SM-7M8RR.tar
and I can't join them with the IBDMDB metadata table using project id number, external_id, site_sub_coll id, or anything else obvious.
If I search the suffix of a virome file name, e.g. 76C9Y from SM-76C9Y.tar, on the ibdmdb_mvx_only metadata (the ibdmdb metadata, filtered for viromes), it does return a match. What is the column that it is finding a match for??