Closed allyhawkins closed 9 months ago
Below I have taken the table we use to assign ontology IDs and assigned each diagnosis to a group or category that we can use for plotting. I was a bit more specific about the names used for the groups, so I used `Leukemia` instead of `Blood` and `Brain and CNS` instead of `Brain`.
Should we use the `human_readable_value` associated with the ontology IDs in the figures, or the `submitted_diagnosis`? Originally I thought the ontology human-readable value, but we don't display that on the front end of the Portal, so if someone went to look for a specific sample, I think that would get confusing.
If we only have four groups here, are we okay with picking four colors from the Okabe-Ito color palette? Does anyone have any favorite color palettes that we could choose from? I do not think we should be coloring every single diagnosis.
Tagging @jaclyn-taroni @jashapiro @sjspielman for any thoughts.
ontology_id | human_readable_value | submitted_diagnosis | group |
---|---|---|---|
MONDO:0016684 | Anaplastic astrocytoma | Anaplastic astrocytoma | Brain and CNS |
MONDO:0016700 | Anaplastic ependymoma | Anaplastic ependymoma | Brain and CNS |
MONDO:0016734 | Anaplastic ganglioglioma | Anaplastic ganglioglioma | Brain and CNS |
MONDO:0021640 | Grade III glioma | Anaplastic glioma | Brain and CNS |
MONDO:0020560 | Atypical teratoid rhabdoid tumor | Atypical teratoid rhabdoid tumor | Brain and CNS |
MONDO:0022965 | Desmoplastic infantile ganglioglioma | Desmoplastic ganglioglioma | Brain and CNS |
MONDO:0005499 | Brain glioma | Diffuse midline glioma | Brain and CNS |
MONDO:0005505 | Dysembryoplastic neuroepithelial tumor | Dysembryoplastic neuroepithelial tumor | Brain and CNS |
MONDO:0019002 | Lhermitte-Duclos disease | Dysplastic gangliocytoma | Brain and CNS |
MONDO:0016698 | Ependymoma | Ependymoma | Brain and CNS |
MONDO:0019009 | Isolated focal cortical dysplasia | Focal cortical dysplasia | Brain and CNS |
MONDO:0016733 | Ganglioglioma | Ganglioglioma | Brain and CNS |
MONDO:0021193 | Neuroepithelial neoplasm | Ganglioglioma/ATRT | Brain and CNS |
MONDO:0016729 | Mixed neuronal-glial tumor | Glial-neuronal tumor | Brain and CNS |
MONDO:0018177 | Glioblastoma | Glioblastoma | Brain and CNS |
MONDO:0100342 | Malignant glioma | High-grade glioma | Brain and CNS |
MONDO:0858940 | Infant-type hemispheric glioma | Infant-type hemispheric glioma | Brain and CNS |
MONDO:0021637 | Low grade glioma | Low-grade glioma | Brain and CNS |
MONDO:0007959 | Medulloblastoma | Medulloblastoma | Brain and CNS |
MONDO:0016699 | Myxopapillary ependymoma | Myxopapillary ependymoma | Brain and CNS |
MONDO:0016691 | Pilocytic astrocytoma | Pilocytic astrocytoma | Brain and CNS |
MONDO:0016692 | Pilomyxoid astrocytoma | Pilomyxoid astrocytoma | Brain and CNS |
MONDO:0016690 | Pleomorphic xanthoastrocytoma | Pleomorphic xanthoastrocytoma | Brain and CNS |
MONDO:0000640 | Central nervous system primitive neuroectodermal neoplasm | Primitive neuroectodermal tumor | Brain and CNS |
MONDO:0002546 | Schwannoma | Schwannoma | Brain and CNS |
MONDO:0007667 | Subependymoma | Subependymoma | Brain and CNS |
MONDO:0018874 | Acute myeloid leukemia | Acute myeloid leukemia | Leukemia |
MONDO:0004947 | B-cell acute lymphoblastic leukemia | B-cell acute lymphoblastic leukemia | Leukemia |
MONDO:0100291 | Early T-cell progenitor acute lymphoblastic leukemia | Early T-cell precursor T-cell acute lymphoblastic leukemia | Leukemia |
MONDO:0020743 | Mixed phenotype acute leukemia | Mixed phenotype acute leukemia | Leukemia |
MONDO:0004963 | T-cell acute lymphoblastic leukemia | Non-early T-cell precursor T-cell acute lymphoblastic leukemia | Leukemia |
MONDO:0004963 | T-cell acute lymphoblastic leukemia | T-cell acute lymphoblastic leukemia | Leukemia |
MONDO:0020743 | Mixed phenotype acute leukemia | T-myeloid mixed phenotype acute leukemia | Leukemia |
MONDO:0006639 | Adrenal cortex carcinoma | Adrenocortical carcinoma | Other solid tumors |
MONDO:0011719 | Gastrointestinal stromal tumor | Gastrointestinal stromal tumor | Other solid tumors |
MONDO:0005040 | Germ cell tumor | Germ cell tumor | Other solid tumors |
MONDO:0008380 | Retinoblastoma | Retinoblastoma | Other solid tumors |
MONDO:0002728 | Rhabdoid tumor | Rhabdoid tumor | Other solid tumors |
MONDO:0006058 | Wilms tumor | Wilms tumor | Other solid tumors |
MONDO:0005072 | Neuroblastoma | Neuroblastoma | Other solid tumors |
MONDO:0002926 | Clear cell sarcoma | Clear cell sarcoma | Sarcoma |
MONDO:0005006 | Clear cell sarcoma of the kidney | Clear cell sarcoma of the kidney | Sarcoma |
MONDO:0019373 | Desmoplastic small round cell tumor | Desmoplastic small round cell tumor | Sarcoma |
MONDO:0015795 | Undifferentiated embryonal sarcoma of the liver | Embryonal sarcoma | Sarcoma |
MONDO:0017387 | Epithelioid sarcoma | Epithelioid sarcoma | Sarcoma |
MONDO:0012817 | Ewing sarcoma | Ewing sarcoma | Sarcoma |
MONDO:0004557 | Congenital fibrosarcoma | Infantile fibrosarcoma | Sarcoma |
MONDO:0009807 | Osteosarcoma | Osteosarcoma | Sarcoma |
MONDO:0005212 | Rhabdomyosarcoma | Rhabdomyosarcoma | Sarcoma |
MONDO:0002927 | Spindle cell sarcoma | Spindle sarcoma | Sarcoma |
MONDO:0010434 | Synovial sarcoma | Synovial sarcoma | Sarcoma |
MONDO:0020661 | Undifferentiated round cell sarcoma | Undifferentiated round cell sarcoma | Sarcoma |
PATO:0000461 | Normal | Non-cancerous | |
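As a minimal sketch of how a table like this could be used for plotting, the lookup can be joined onto per-sample metadata so that every sample carries its `group` for faceting and coloring. Only a few rows of the table are reproduced here, and the `samples` data frame with its `sample_id` values is entirely hypothetical.

```r
# A few rows of the diagnosis-to-group lookup, copied from the table above
diagnosis_groups <- data.frame(
  submitted_diagnosis = c("Ependymoma", "Acute myeloid leukemia",
                          "Neuroblastoma", "Osteosarcoma"),
  group = c("Brain and CNS", "Leukemia", "Other solid tumors", "Sarcoma")
)

# Hypothetical sample metadata; the sample IDs are made up for illustration
samples <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4", "S5"),
  submitted_diagnosis = c("Ependymoma", "Osteosarcoma", "Ependymoma",
                          "Neuroblastoma", "Acute myeloid leukemia")
)

# Left join so every sample keeps its row and gains a group column
annotated <- merge(samples, diagnosis_groups,
                   by = "submitted_diagnosis", all.x = TRUE)

# Number of samples per group, e.g. as input to a faceted bar plot
group_counts <- table(annotated$group)
print(group_counts)
```

Keeping `all.x = TRUE` means a diagnosis missing from the lookup shows up as `NA` in `group` rather than silently dropping the sample.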
> maybe as Josh suggested we show the bar plot of every single diagnosis in the supplementals and in that plot we could include the normal.

This makes sense to me.
> Originally I thought the ontology human readable value, but then we don't display that on the front end of the portal so if someone went to go look for a specific sample, I think that would get confusing.

How about we match the portal in the main text, but toss an ontology human-readable version of the same figure into the supp? Edit: maybe this is the same figure that the normal/non-cancerous samples go into?
> If we only have four groups here then are we okay with picking four colors from the Okabe-Ito color palette? Does anyone have any favorite color palettes that we could choose from?

I don't know about using Okabe-Ito, specifically/only because we might want to use Okabe-Ito for the UMAPs. If we don't want to reserve this palette for UMAPs then it definitely seems fine to use here! In terms of other palettes, I generally like `Dark2` or a discrete version of the regular `viridis`, but I really don't feel strongly either way.
> I don't know about using Okabe-Ito, specifically/only because we might want to use Okabe-Ito for the UMAPs. If we don't want to reserve this palette for UMAPs then it definitely seems fine to use here! In terms of other palettes, I generally like Dark2 or a discrete version of the regular viridis, but I really don't feel strongly either way.

I think this is a really good point! If we want to avoid overlapping palettes with any future figures, then we probably want to pick from a palette we aren't likely to use again. I feel like `Dark2` could fall into that category too... What about the first 4 colors I circled from `Set1`?
Another thought, we could pick 4 colors from the cancer group colors that were used in OpenPBTA...
To comment on color palette only:
I don't believe `Dark2` or `Set1` are particularly accessible, and the yellow in `viridis` ends up low contrast with a white background, in my opinion.
What about `hcl.colors(4, "cividis")`: `#00214E` `#545C71` `#A89F76` `#FFE93F`
If we're worried about the contrast with the brightest yellow, we could do `hcl.colors(5, "cividis")[1:4]`: `#00214E` `#3E4C6E` `#7C7C7C` `#BFB170`
Or `hcl.colors(5, "batlow")[1:4]`: `#201158` `#005E5E` `#578B21` `#E89E6B`
Or `palette.colors(n = 5, palette = "R4")[-1]`: `#DF536B` `#61D04F` `#2297E6` `#28E2E5`
Regardless of the palette we choose, we should try to encode this information with facets or panels as well!
Edited to add: I would include a grey color for normals in this palette. The worst thing that happens is you don't use it.
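To put rough numbers on the white-background contrast concern, here is a minimal sketch that computes the WCAG 2.x contrast ratio of each candidate color against white. The helper names `relative_luminance()` and `contrast_with_white()` are made up for illustration, and using WCAG contrast as the yardstick is an assumption rather than something decided in this thread. The hex codes are the four `cividis` colors listed above.

```r
# WCAG 2.x relative luminance of a hex color such as "#00214E"
relative_luminance <- function(hex) {
  # col2rgb() returns a 3 x 1 matrix of 0-255 values; rescale to 0-1
  srgb <- as.vector(grDevices::col2rgb(hex)) / 255
  # Undo the sRGB gamma encoding channel by channel
  linear <- ifelse(srgb <= 0.04045,
                   srgb / 12.92,
                   ((srgb + 0.055) / 1.055)^2.4)
  # Weighted sum of linear R, G, B
  sum(c(0.2126, 0.7152, 0.0722) * linear)
}

# Contrast ratio against white (luminance 1); ranges from 1 to 21
contrast_with_white <- function(hex) {
  (1 + 0.05) / (relative_luminance(hex) + 0.05)
}

# Hex codes copied from the hcl.colors(4, "cividis") suggestion
cividis_4 <- c("#00214E", "#545C71", "#A89F76", "#FFE93F")
contrast_ratios <- vapply(cividis_4, contrast_with_white, numeric(1))
print(round(contrast_ratios, 2))
```

Higher ratios mean better separation from a white background. WCAG's 3:1 minimum for graphical objects is one possible threshold, and the same function could be applied to the other candidate palettes or to a grey chosen for normals.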
> I don't believe `Dark2` or `Set1` are particularly accessible,

Just noting that of the brewer palettes, only Dark2, Set2, and Paired are CVD-friendly. We should try to stick with something that is generally accessible, but I agree that Dark2 can sometimes be hard to see!
I'm going to respond to @allyhawkins's 3 points all in one place.
I would use `submitted_diagnosis` for the reason you give – we want people to be able to reference data from the Portal.
Add a non-cancerous group. Pick a grey color for non-cancerous samples. We'll want to check it for contrast with whatever the other 4 groups' colors end up being. If we don't end up needing it, there's very little harm in having it!
I compiled my thoughts on palettes here https://github.com/AlexsLemonade/scpca-paper-figures/issues/10#issuecomment-1863456338. And to record something discussed on Slack:
> Just noting that of the brewer palettes, only Dark2, Set2, and Paired are CVD-friendly. We should try to stick with something that is generally accessible, but I agree that Dark2 can sometimes be hard to see!

Once you are at n = 4, only Paired meets this criterion.
We should prioritize a palette with darker colors because, if we plan to use it throughout the paper, we'll want the flexibility to color points on a white background.
Closing this; any additional discussion can happen in #20.
For figure 1, we will want to create different groups for all the diagnoses so that we can facet and color by group. This issue should be used for a discussion to figure out the following two things: