Closed jharenza closed 2 years ago
Similar question to other PRs - are the changes we're seeing in the following files expected? Are they due to changes in the base file?
analyses/molecular-subtyping-EPN/results/EPN_all_data.tsv
analyses/molecular-subtyping-EPN/results/EPN_all_data_withsubgroup.tsv
No updates here upon rerun with new base, so column order does not matter.
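One way to confirm that only the column order differs between two versions of a results table is to compare rows by column name rather than by position. This is a minimal sketch using only the standard library; the function names and the toy values are assumptions made for illustration, not code or data from this module.

```python
import csv
import io


def load_rows(tsv_text):
    """Parse TSV text into a list of plain {column: value} dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t")]


def same_content_ignoring_column_order(old_text, new_text):
    """Return True when both tables hold the same rows in the same row order.

    Rows are dicts keyed by column name, so column position is ignored.
    """
    return load_rows(old_text) == load_rows(new_text)


# Toy tables (hypothetical values, not taken from EPN_all_data.tsv).
old = "sample_id\tsubgroup\nsample_A\tEPN, PF A\nsample_B\tNA\n"
new = "subgroup\tsample_id\nEPN, PF A\tsample_A\nNA\tsample_B\n"
changed = "subgroup\tsample_id\nEPN, ST YAP1\tsample_A\nNA\tsample_B\n"

print(same_content_ignoring_column_order(old, new))      # True
print(same_content_ignoring_column_order(old, changed))  # False
```

Note that this sketch still treats a change in row order as a difference.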
OK, so most of the changes come from the new focal CN file, which has the NAs. 7316-384 went from EPN, PFA to EPN, YAP1, which we also saw in OpenPedCan. However, this should not be the case: we shouldn't assign a fusion-positive subtype without the fusion. The same should hold for RELA fusions. I submitted an issue to update this: #1438
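The rule described here (no fusion-positive subtype without the fusion) can be checked mechanically. The following is a rough sketch under stated assumptions: the row layout, the `fusions` field, and the sample IDs are hypothetical and do not reflect the actual columns of the results table.

```python
# Subtypes defined by a fusion, mapped to the gene that must be fused.
FUSION_DEFINED = {"EPN, ST RELA": "RELA", "EPN, ST YAP1": "YAP1"}


def has_fusion(fusion_names, gene):
    """True if any fusion call written as 'GENE1--GENE2' involves `gene`."""
    return any(gene in name.split("--") for name in fusion_names)


def inconsistent_calls(rows):
    """Return sample IDs given a fusion-defined subtype with no supporting fusion."""
    flagged = []
    for row in rows:
        gene = FUSION_DEFINED.get(row["subgroup"])
        if gene is not None and not has_fusion(row["fusions"], gene):
            flagged.append(row["sample_id"])
    return flagged


# Hypothetical rows; the "fusions" field is an assumed layout.
rows = [
    {"sample_id": "sample_A", "subgroup": "EPN, ST YAP1", "fusions": []},
    {"sample_id": "sample_B", "subgroup": "EPN, ST YAP1", "fusions": ["YAP1--MAMLD1"]},
    {"sample_id": "sample_C", "subgroup": "EPN, ST RELA", "fusions": ["C11orf95--RELA"]},
    {"sample_id": "sample_D", "subgroup": "EPN, PF A", "fusions": []},
]
print(inconsistent_calls(rows))  # ['sample_A']
```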
This is now ready for review with the changes in notebook 03 made by @ewafula.
I cannot figure out what the relevant change is to the 03 notebook because Jupyter notebooks sure do not play nicely with git, so I'll have to take your word for it (and you don't need me to explicitly approve this since it's not going into master).
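For what it's worth, one workaround for noisy notebook diffs is to compare only the code cells. The sketch below assumes the nbformat 4 JSON layout (a top-level `cells` list whose entries have `cell_type` and `source`); the helper names and the toy notebooks are made up for illustration.

```python
import difflib
import json


def code_cells(notebook_json):
    """Return the source of each code cell in an nbformat-4 notebook as a string."""
    notebook = json.loads(notebook_json)
    cells = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat allows source to be a single string or a list of lines.
        cells.append(source if isinstance(source, str) else "".join(source))
    return cells


def diff_code(old_json, new_json):
    """Unified diff of the code cells only, ignoring outputs and metadata."""
    old_lines = "\n".join(code_cells(old_json)).splitlines()
    new_lines = "\n".join(code_cells(new_json)).splitlines()
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile="old", tofile="new", lineterm=""))


# Toy notebooks (hypothetical content).
old_nb = json.dumps({"cells": [
    {"cell_type": "code", "source": ["x = 1\n", "y = 2"]},
    {"cell_type": "markdown", "source": ["# note"]},
]})
new_nb = json.dumps({"cells": [{"cell_type": "code", "source": ["x = 1\n"]}]})

print("\n".join(diff_code(old_nb, new_nb)))
```

The two JSON strings could come from any two versions of the same notebook file, for example as printed by `git show <ref>:<path>`.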
Don't we want these changes to go into master, on the off chance we would ever have to rerun any of this?
You’re merging into v22-cranio here; you don’t need my approval to do that, that branch isn’t protected. You will need my approval once we get to the “bottom” of the stack and everything goes into master.
That's what I meant- that all of these PRs would make it to master. Was worried that we would lose the EPN update, but thanks for clarifying!
To that end, if you can post the line number(s) or a permalink to the line(s) where the substantive change was made, that'd be super helpful!
@ewafula can you take care of this please?
@jaclyn-taroni, the substantial change to the notebook was mainly the exclusion of code that considers other molecular markers and calls a tumor YAP1-positive without the YAP1 fusion, as described in ticket #1438 by @jharenza and also discussed here. To that end:
1). I excluded the following function, and the calls to it, which assigns samples with overexpression of CXorf67 or TKTL1 along with 1q gain to the PT_EPN_A subgroup, and samples with overexpression of GPBP1 or IFT46 along with 6p or 6q loss to the PT_EPN_B subgroup:
def prioritizing_PT_EPN(row, sample_list):
    if( row["CXorf67_expr_zscore"]>3 or
       (row["CXorf67_expr_zscore"]>3 and row["1q_gain"]>0) or
       (row["TKTL1_expr_zscore"]>3 and row["1q_gain"]>0)):
        sample_list.append(row["sample_id"])
        return("EPN, PF A")
    elif((row["GPBP1_expr_zscore"]>3 and row["6q_loss"]>0) or
         (row["GPBP1_expr_zscore"]>3 and row["6p_loss"]>0) or
         (row["IFT46_expr_zscore"]>3 and row["6q_loss"]>0) or
         (row["IFT46_expr_zscore"]>3 and row["6p_loss"]>0)):
        sample_list.append(row["sample_id"])
        return("EPN, PF B")
    else:
        return(row["subgroup"])
2). I also excluded the following code, which assigns the EPN, ST RELA and EPN, ST YAP1 subgroups from a combination of markers other than the RELA and YAP1 fusions when their values exceed the thresholds set in the tuples:
For assigning the EPN, ST RELA subgroup:
st_epn_rela_tests = [("PTEN--TAS2R1", 0),
                     ("9p_loss", 0),
                     ("9q_loss", 0),
                     ("RELA_expr_zscore", 3),
                     ("L1CAM_expr_zscore", 3)]
# Calling function subgroup_func to set the values for last column "subgroup"
EPN_final["subgroup"] = EPN_final.apply(subgroup_func,
                                        axis=1,
                                        subgroupname="EPN, ST RELA",
                                        column_values=st_epn_rela_tests,
                                        sample_list=samples_assigned)
For assigning the EPN, ST YAP1 subgroup:
st_epn_yap1_tests = [("C11orf95--MAML2", 0),
                     ("11q_loss", 0),
                     ("11q_gain", 0),
                     ("ARL4D_expr_zscore", 3),
                     ("CLDN1_expr_zscore", 3)]
EPN_final["subgroup"] = EPN_final.apply(subgroup_func,
                                        axis=1,
                                        subgroupname="EPN, ST YAP1",
                                        column_values=st_epn_yap1_tests,
                                        sample_list=samples_assigned)
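To illustrate why marker-only assignment like this can label a sample without the fusion, here is a rough sketch of how a list of (column, threshold) tuples might be evaluated. The body of subgroup_func is not shown in this thread, so `assign_by_markers`, `count_passing`, and the `min_hits` parameter are hypothetical stand-ins, and the row values are invented.

```python
def count_passing(row, column_values):
    """Count (column, threshold) tests where the row's value exceeds the threshold."""
    return sum(1 for column, threshold in column_values if row[column] > threshold)


def assign_by_markers(row, subgroupname, column_values, min_hits):
    """Hypothetical stand-in for subgroup_func; min_hits is an assumed parameter."""
    if count_passing(row, column_values) >= min_hits:
        return subgroupname
    return row["subgroup"]


# Same tuples as the excluded RELA tests.
st_epn_rela_tests = [("PTEN--TAS2R1", 0),
                     ("9p_loss", 0),
                     ("9q_loss", 0),
                     ("RELA_expr_zscore", 3),
                     ("L1CAM_expr_zscore", 3)]

# Invented row: three markers exceed their thresholds.
row = {"sample_id": "sample_A", "subgroup": "unassigned",
       "PTEN--TAS2R1": 0, "9p_loss": 1, "9q_loss": 1,
       "RELA_expr_zscore": 3.5, "L1CAM_expr_zscore": 1.0}

print(assign_by_markers(row, "EPN, ST RELA", st_epn_rela_tests, min_hits=3))
# EPN, ST RELA
```

None of the tested columns requires a RELA fusion call, so under these assumptions a sample can receive the label from copy number and expression markers alone, which is the behavior ticket #1438 asks to remove.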
The final results table, including subgroups, aligns with the subtyping table produced by @komalsrathi for the same module in the OpenPedCan repo: https://github.com/PediatricOpenTargets/OpenPedCan-analysis/blob/mol-subtype-update/analyses/molecular-subtyping-EPN/results/EPN_all_data_withsubgroup.tsv
Purpose/implementation Section
What scientific question is your analysis addressing?
Run EPN subtyping
What was your approach?
What GitHub issue does your pull request address?
1207
Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.
Which areas should receive a particularly close look?
NA
Is there anything that you want to discuss further?
Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?
Results
What types of results are included (e.g., table, figure)?
What is your summary of the results?
Reproducibility Checklist
Documentation Checklist
README and it is up to date.
analyses/README.md and the entry is up to date.