Closed jaclyn-taroni closed 2 years ago
I think this is about as good as we're going to get for separating out the colors!
I've addressed #1403 here now as well.
Note, as of 294b408, this passes CI since #1452 is stacked on this and passes.
So it looks like maybe we missed these LGG samples previously (maybe because of N when forming the original cancer group hex codes) and they should now go into Other low-grade gliomas so we have a complete group for displays. Likewise, the `Embryonal tumor with multilayer rosettes` above would go into `CNS Embryonal tumor` cancer group display. I think you can use `broad_histology_display` to work in this logic. Can you update `cancer_group_display` when `broad_histology_display` == HGG, LGG, or embryonal to move those samples into the general display groups?
We can do this, but functionally I think this means that `cancer_group_display` will be somewhere between `broad_histology` and `cancer_group` in terms of specificity, instead of a representation of `cancer_group` where any cancer group that doesn't meet the sample size threshold is set to `Other` (as originally intended).
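The collapsing discussed here could be sketched roughly as follows. This is a minimal Python illustration, not the code in this PR: the function name, the `displayed_groups` argument, and the keys of `FALLBACK_DISPLAY` (the exact spellings of the `broad_histology_display` values) are assumptions made for the example.

```python
# Hypothetical fallback labels keyed by broad_histology_display.
# The key spellings are placeholders; the real column values may differ.
FALLBACK_DISPLAY = {
    "HGG": "Other high-grade gliomas",
    "LGG": "Other low-grade gliomas",
    "Embryonal": "CNS Embryonal tumor",
}


def assign_cancer_group_display(sample, displayed_groups):
    """Return a cancer_group_display label for one sample record.

    Cancer groups that are displayed individually keep their own name.
    Everything else falls back to the general display group for its
    broad_histology_display, or to "Other" when no fallback is defined.
    """
    cancer_group = sample["cancer_group"]
    if cancer_group in displayed_groups:
        return cancer_group
    return FALLBACK_DISPLAY.get(sample["broad_histology_display"], "Other")


# Illustrative records only; these are not rows from the histology file.
displayed_groups = {"Pilocytic astrocytoma", "Ganglioglioma"}
lgg_sample = {"cancer_group": "Oligodendroglioma", "broad_histology_display": "LGG"}
hgg_sample = {"cancer_group": "Oligodendroglioma", "broad_histology_display": "HGG"}
kept_sample = {"cancer_group": "Pilocytic astrocytoma", "broad_histology_display": "LGG"}

lgg_label = assign_cancer_group_display(lgg_sample, displayed_groups)
hgg_label = assign_cancer_group_display(hgg_sample, displayed_groups)
kept_label = assign_cancer_group_display(kept_sample, displayed_groups)
```

Under this logic the display label depends on both columns, which is why it ends up sitting between `broad_histology` and `cancer_group` in specificity.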
We're probably going to need a table in `figures/README` that explains the mapping:

| cancer_group_display | Included cancer groups |
|---|---|
I'm working on this now, using `broad_histology` values to guide the collapsing into `Other` categories. `Oligodendroglioma` might complicate our plans a bit, as there are cases where samples with `Oligodendroglioma` in `cancer_group` would end up split into `Other low-grade gliomas` and `Other high-grade gliomas`.
I'll push what I have shortly so people can see for themselves.
We're going to need a different approach for the oncoprint palettes as well. I'm pushing what I have with that commented out.
> `Oligodendroglioma` might complicate our plans a bit, as there are cases where samples with `Oligodendroglioma` in `cancer_group` would end up split into `Other low-grade gliomas` and `Other high-grade gliomas`.
This seems to me like it might be the correct result, at least for the purposes of many of the display figures where we are distinguishing HGG and LGG. The limitation is that we will need to update any joining between the histology file and palette file to use both broad histology and cancer group. But we may need to do that anyway if we are going to make the "Other LGG" and "Other HGG" labels follow expectations in the case where we do not separately label all possible cancer groups within LGG/HGG.
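A join on both columns might look something like the sketch below. It is a plain-Python illustration under stated assumptions: the row layout, the `join_palette` helper, the broad histology strings, and the hex codes are all hypothetical placeholders, not values from the actual palette file.

```python
# Hypothetical palette rows: one row per (broad_histology, cancer_group) pair.
# The hex codes are placeholders, not the project's colors.
palette_rows = [
    {"broad_histology": "Low-grade astrocytic tumor",
     "cancer_group": "Oligodendroglioma",
     "cancer_group_display": "Other low-grade gliomas",
     "hex_code": "#111111"},
    {"broad_histology": "Diffuse astrocytic and oligodendroglial tumor",
     "cancer_group": "Oligodendroglioma",
     "cancer_group_display": "Other high-grade gliomas",
     "hex_code": "#222222"},
]


def join_palette(histology_rows, palette_rows):
    """Left-join palette columns onto samples using both key columns."""
    lookup = {
        (row["broad_histology"], row["cancer_group"]): row
        for row in palette_rows
    }
    joined = []
    for sample in histology_rows:
        key = (sample["broad_histology"], sample["cancer_group"])
        match = lookup.get(key, {})
        joined.append({
            **sample,
            # None marks samples with no matching palette row.
            "cancer_group_display": match.get("cancer_group_display"),
            "hex_code": match.get("hex_code"),
        })
    return joined


# Two illustrative samples that share a cancer_group but not a broad_histology.
histology_rows = [
    {"sample_id": "S1",
     "broad_histology": "Low-grade astrocytic tumor",
     "cancer_group": "Oligodendroglioma"},
    {"sample_id": "S2",
     "broad_histology": "Diffuse astrocytic and oligodendroglial tumor",
     "cancer_group": "Oligodendroglioma"},
]
joined_rows = join_palette(histology_rows, palette_rows)
```

Joining on `cancer_group` alone would be ambiguous here, because the same cancer group maps to two different display labels.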
Agree with above, this is expected and we have been joining on both in many of the module updates.
@jaclyn-taroni what do you mean by updates for oncoprint? I was thinking we'd only be adding distinguishing colors for the LGAT tumor plot, but maybe you had something else in mind?
> Agree with above, this is expected and we have been joining on both in many of the module updates.
If that is the case, that assuages my concern that this is a bigger bite than I was hoping for. But I think it's still a bigger bite than I was hoping for because...
> what do you mean by updates for oncoprint? I was thinking we'd only be adding distinguishing colors for the LGAT tumor plot, but maybe you had something else in mind?
There will need to be a rewrite of how we handle cancer group colors for oncoprints. Now we're collapsing multiple cancer groups into, e.g., "Other low-grade gliomas." Previously, each cancer group under the "Other low-grade gliomas" umbrella would have gotten a random color from a greys palette in the oncoprint. Should we still be showing the individual cancer groups in the oncoprint? My assumption is yes = we need to rewrite how the oncoprint display palette is generated. And then what do we do about the "Low-grade glioma astrocytoma" --> "Other low-grade gliomas" in the oncoprint context? Maybe that's still fine to do.
> Should we still be showing the individual cancer groups in the oncoprint? My assumption is yes = we need to rewrite how the oncoprint display palette is generated. And then what do we do about the "Low-grade glioma astrocytoma" --> "Other low-grade gliomas" in the oncoprint context? Maybe that's still fine to do.
I think that we might do something like: instead of the old HGG label, it'll now be "other HGG" and for any groups not colored in the LGG plot (non SEGA, pilocytic, pxa) and which were previously in the general LGG label, that would become "other LGGs". That seems easier than greys for every group, esp if we have a very small N in those groups?
> I think that we might do something like: instead of the old HGG label, it'll now be "other HGG" and for any groups not colored in the LGG plot (non SEGA, pilocytic, pxa) and which were previously in the general LGG label, that would become "other LGGs". That seems easier than greys for every group, esp if we have a very small N in those groups?
Okay, to clarify – for the LGAT oncoprint, we'd expect the following groupings:

| oncoprint_display (same as cancer_group_display currently) | cancer_group |
|---|---|
| Other low-grade gliomas | Low-grade glioma astrocytoma, Gliomatosis cerebri, Diffuse fibrillary astrocytoma, Oligodendroglioma |
| Subependymal Giant Cell Astrocytoma | Subependymal Giant Cell Astrocytoma |
| Pilocytic astrocytoma | Pilocytic astrocytoma |
| Ganglioglioma | Ganglioglioma |
| Pleomorphic xanthoastrocytoma | Pleomorphic xanthoastrocytoma |

Is that correct?
Looks right!
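For reference, the grouping confirmed above can be written as a lookup. This is only a sketch: the variable names are hypothetical, and it assumes the "Other low-grade gliomas" row lists four separate cancer groups.

```python
# LGAT oncoprint groupings transcribed from the table in this thread.
ONCOPRINT_DISPLAY_GROUPS = {
    "Other low-grade gliomas": [
        "Low-grade glioma astrocytoma",
        "Gliomatosis cerebri",
        "Diffuse fibrillary astrocytoma",
        "Oligodendroglioma",
    ],
    "Subependymal Giant Cell Astrocytoma": ["Subependymal Giant Cell Astrocytoma"],
    "Pilocytic astrocytoma": ["Pilocytic astrocytoma"],
    "Ganglioglioma": ["Ganglioglioma"],
    "Pleomorphic xanthoastrocytoma": ["Pleomorphic xanthoastrocytoma"],
}

# Invert to a cancer_group -> oncoprint_display lookup.
CANCER_GROUP_TO_DISPLAY = {
    cancer_group: display
    for display, cancer_groups in ONCOPRINT_DISPLAY_GROUPS.items()
    for cancer_group in cancer_groups
}
```

Because the display labels match `cancer_group_display` here, a single mapping like this can serve both the oncoprint and the other figures.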
Okay, functionally, I believe we no longer need an oncoprint-specific palette at this point. I will take it out in the interest of moving things along, but I imagine we may need to revisit when we revise the oncoprints to reflect the v22 release.
Purpose/implementation Section
We need to accommodate additional cancer groups PAST, PXA, and SEGA (#1368). Here's what I came up with:
⚠️ needs to get updated to use the v22 release