Closed dhimmel closed 6 years ago
Did you run 4.covariates.ipynb in #41?
Nope, I only ran scripts 0, 1, 2.
I reran the pipeline and hit failure in 3.explore-mutations.ipynb. It looks like the disease column has been removed from samples.tsv. I expect this to cause many downstream issues. Why was this removed?
That is great to know - the updated clinical data stores this information (sort of) in the histological_type variable.
However, I think this can be added relatively easily. This can be done with similar logic in cell 7:
# Extract sample-type with the code dictionary
clinmat_df = clinmat_df.assign(sample_type = clinmat_df.sample_id.str[-2:])
clinmat_df.sample_type = clinmat_df.sample_type.replace(sampletype_codes_dict)
Except it would map acronym to disease instead of sample code to sample type.
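For illustration, a minimal sketch of what that could look like. The toy clinmat_df, the acronym column name, and the acronym_to_disease dictionary are all assumptions made up for this example, not the actual notebook contents:

```python
import pandas as pd

# Hypothetical stand-in for the clinical matrix; the real clinmat_df has many more columns.
clinmat_df = pd.DataFrame({
    "sample_id": ["TCGA-02-0001-01", "TCGA-A1-A0SB-01", "TCGA-05-4244-11"],
    "acronym": ["GBM", "BRCA", "LUAD"],  # assumed column name
})

# Assumed lookup from cohort acronym to disease name (illustrative subset only).
acronym_to_disease = {
    "GBM": "glioblastoma multiforme",
    "BRCA": "breast invasive carcinoma",
    "LUAD": "lung adenocarcinoma",
}

# Same pattern as the sample_type step: create the column, then swap codes via the dictionary.
clinmat_df = clinmat_df.assign(disease=clinmat_df.acronym)
clinmat_df.disease = clinmat_df.disease.replace(acronym_to_disease)
```

Note that replace leaves any acronym missing from the dictionary unchanged, so unmapped cohorts would show up as raw acronyms in the disease column rather than as NaN.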
> the updated clinical data stores this information (sort of) in the histological_type variable.
There is more granular detail in this variable now - it looks like it includes some subtype and treatment status characterization.
looks great @dhimmel - thanks for updating fully!
Follows up on #41.