AlexsLemonade / scpca-paper-figures

Figures for https://github.com/AlexsLemonade/ScPCA-manuscript/
BSD 3-Clause "New" or "Revised" License

Decide on groupings for all diagnoses #10

Closed: allyhawkins closed this issue 9 months ago

allyhawkins commented 10 months ago

For figure 1, we will want to create different groups for all the diagnoses so that we can facet and color by group. This issue should be used for a discussion to figure out the following two things:

  1. How to divide the diagnoses. I will post a table with diagnoses in one column and group in the other column for everyone to check, so we can decide exactly how we want to label each group. One open question: should we use the ontology IDs for diagnosis instead of the submitter-provided diagnoses?
  2. Which color palette to use, and which color to use for each diagnosis.

allyhawkins commented 10 months ago

Below, I have taken the table we use to assign ontology IDs and assigned each diagnosis to a group or category that we can use for plotting. I was a bit more specific about the group names, so I used Leukemia instead of Blood, and Brain and CNS instead of Brain.

  1. Do we want to display the human_readable_value associated with the ontology IDs in the figures, or the submitted_diagnosis? Originally I thought the ontology human readable value, but then we don't display that on the front end of the portal so if someone went to go look for a specific sample, I think that would get confusing.
  2. What do we do about the Normal/non-cancerous samples? I don't think they really belong in a category, so I wonder if we want to not include them in the grouped plot, but maybe as Josh suggested we show the bar plot of every single diagnosis in the supplementals and in that plot we could include the normal.
  3. If we only have four groups here then are we okay with picking four colors from the Okabe-Ito color palette? Does anyone have any favorite color palettes that we could choose from? I do not think we should be coloring every single diagnosis.

Tagging @jaclyn-taroni @jashapiro @sjspielman for any thoughts.

| ontology_id | human_readable_value | submitted_diagnosis | group |
|---|---|---|---|
| MONDO:0016684 | Anaplastic astrocytoma | Anaplastic astrocytoma | Brain and CNS |
| MONDO:0016700 | Anaplastic ependymoma | Anaplastic ependymoma | Brain and CNS |
| MONDO:0016734 | Anaplastic ganglioglioma | Anaplastic ganglioglioma | Brain and CNS |
| MONDO:0021640 | Grade III glioma | Anaplastic glioma | Brain and CNS |
| MONDO:0020560 | Atypical teratoid rhabdoid tumor | Atypical teratoid rhabdoid tumor | Brain and CNS |
| MONDO:0022965 | Desmoplastic infantile ganglioglioma | Desmoplastic ganglioglioma | Brain and CNS |
| MONDO:0005499 | Brain glioma | Diffuse midline glioma | Brain and CNS |
| MONDO:0005505 | Dysembryoplastic neuroepithelial tumor | Dysembryoplastic neuroepithelial tumor | Brain and CNS |
| MONDO:0019002 | Lhermitte-Duclos disease | Dysplastic gangliocytoma | Brain and CNS |
| MONDO:0016698 | Ependymoma | Ependymoma | Brain and CNS |
| MONDO:0019009 | Isolated focal cortical dysplasia | Focal cortical dysplasia | Brain and CNS |
| MONDO:0016733 | Ganglioglioma | Ganglioglioma | Brain and CNS |
| MONDO:0021193 | Neuroepithelial neoplasm | Ganglioglioma/ATRT | Brain and CNS |
| MONDO:0016729 | Mixed neuronal-glial tumor | Glial-neuronal tumor | Brain and CNS |
| MONDO:0018177 | Glioblastoma | Glioblastoma | Brain and CNS |
| MONDO:0100342 | Malignant glioma | High-grade glioma | Brain and CNS |
| MONDO:0858940 | Infant-type hemispheric glioma | Infant-type hemispheric glioma | Brain and CNS |
| MONDO:0021637 | Low grade glioma | Low-grade glioma | Brain and CNS |
| MONDO:0007959 | Medulloblastoma | Medulloblastoma | Brain and CNS |
| MONDO:0016699 | Myxopapillary ependymoma | Myxopapillary ependymoma | Brain and CNS |
| MONDO:0016691 | Pilocytic astrocytoma | Pilocytic astrocytoma | Brain and CNS |
| MONDO:0016692 | Pilomyxoid astrocytoma | Pilomyxoid astrocytoma | Brain and CNS |
| MONDO:0016690 | Pleomorphic xanthoastrocytoma | Pleomorphic xanthoastrocytoma | Brain and CNS |
| MONDO:0000640 | Central nervous system primitive neuroectodermal neoplasm | Primitive neuroectodermal tumor | Brain and CNS |
| MONDO:0002546 | Schwannoma | Schwannoma | Brain and CNS |
| MONDO:0007667 | Subependymoma | Subependymoma | Brain and CNS |
| MONDO:0018874 | Acute myeloid leukemia | Acute myeloid leukemia | Leukemia |
| MONDO:0004947 | B-cell acute lymphoblastic leukemia | B-cell acute lymphoblastic leukemia | Leukemia |
| MONDO:0100291 | Early T-cell progenitor acute lymphoblastic leukemia | Early T-cell precursor T-cell acute lymphoblastic leukemia | Leukemia |
| MONDO:0020743 | Mixed phenotype acute leukemia | Mixed phenotype acute leukemia | Leukemia |
| MONDO:0004963 | T-cell acute lymphoblastic leukemia | Non-early T-cell precursor T-cell acute lymphoblastic leukemia | Leukemia |
| MONDO:0004963 | T-cell acute lymphoblastic leukemia | T-cell acute lymphoblastic leukemia | Leukemia |
| MONDO:0020743 | Mixed phenotype acute leukemia | T-myeloid mixed phenotype acute leukemia | Leukemia |
| MONDO:0006639 | Adrenal cortex carcinoma | Adrenocortical carcinoma | Other solid tumors |
| MONDO:0011719 | Gastrointestinal stromal tumor | Gastrointestinal stromal tumor | Other solid tumors |
| MONDO:0005040 | Germ cell tumor | Germ cell tumor | Other solid tumors |
| MONDO:0008380 | Retinoblastoma | Retinoblastoma | Other solid tumors |
| MONDO:0002728 | Rhabdoid tumor | Rhabdoid tumor | Other solid tumors |
| MONDO:0006058 | Wilms tumor | Wilms tumor | Other solid tumors |
| MONDO:0005072 | Neuroblastoma | Neuroblastoma | Other solid tumors |
| MONDO:0002926 | Clear cell sarcoma | Clear cell sarcoma | Sarcoma |
| MONDO:0005006 | Clear cell sarcoma of the kidney | Clear cell sarcoma of the kidney | Sarcoma |
| MONDO:0019373 | Desmoplastic small round cell tumor | Desmoplastic small round cell tumor | Sarcoma |
| MONDO:0015795 | Undifferentiated embryonal sarcoma of the liver | Embryonal sarcoma | Sarcoma |
| MONDO:0017387 | Epithelioid sarcoma | Epithelioid sarcoma | Sarcoma |
| MONDO:0012817 | Ewing sarcoma | Ewing sarcoma | Sarcoma |
| MONDO:0004557 | Congenital fibrosarcoma | Infantile fibrosarcoma | Sarcoma |
| MONDO:0009807 | Osteosarcoma | Osteosarcoma | Sarcoma |
| MONDO:0005212 | Rhabdomyosarcoma | Rhabdomyosarcoma | Sarcoma |
| MONDO:0002927 | Spindle cell sarcoma | Spindle sarcoma | Sarcoma |
| MONDO:0010434 | Synovial sarcoma | Synovial sarcoma | Sarcoma |
| MONDO:0020661 | Undifferentiated round cell sarcoma | Undifferentiated round cell sarcoma | Sarcoma |
| PATO:0000461 | Normal | Non-cancerous | |

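
To make the lookup concrete, here is a minimal Python sketch of how the table above could drive a grouped plot. It is illustrative only: the rows are a small subset copied from the table, and the function names and the "Ungrouped" placeholder label are assumptions, not anything defined in this repository. It keys on submitted_diagnosis because the ontology IDs are not unique per row (MONDO:0004963 appears with two different submitted diagnoses).

```python
from collections import Counter

# A few rows copied from the table above: (ontology_id, submitted_diagnosis, group).
# Only a subset is shown; the full table would normally be read from a file.
ROWS = [
    ("MONDO:0007959", "Medulloblastoma", "Brain and CNS"),
    ("MONDO:0018177", "Glioblastoma", "Brain and CNS"),
    ("MONDO:0004963", "T-cell acute lymphoblastic leukemia", "Leukemia"),
    ("MONDO:0004963",
     "Non-early T-cell precursor T-cell acute lymphoblastic leukemia",
     "Leukemia"),
    ("MONDO:0005072", "Neuroblastoma", "Other solid tumors"),
    ("MONDO:0012817", "Ewing sarcoma", "Sarcoma"),
    ("PATO:0000461", "Non-cancerous", None),  # no group assigned in the table
]

# Key on submitted_diagnosis, since one ontology ID can cover several rows.
GROUP_BY_DIAGNOSIS = {diagnosis: group for _, diagnosis, group in ROWS}


def diagnosis_group(submitted_diagnosis, default="Ungrouped"):
    """Return the plotting group for a diagnosis.

    `default` is a hypothetical placeholder label used when the table
    leaves the group empty or the diagnosis is not listed.
    """
    group = GROUP_BY_DIAGNOSIS.get(submitted_diagnosis)
    return group if group is not None else default


def count_by_group(diagnoses):
    """Count samples per group, e.g. to size the bars of a grouped bar plot."""
    return Counter(diagnosis_group(d) for d in diagnoses)


if __name__ == "__main__":
    samples = ["Medulloblastoma", "Glioblastoma", "Ewing sarcoma", "Non-cancerous"]
    for group, n in count_by_group(samples).items():
        print(f"{group}: {n}")
```

How the non-cancerous samples are labeled is still an open question at this point in the thread, so the default label here is only a stand-in.
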
sjspielman commented 10 months ago

> maybe as Josh suggested we show the bar plot of every single diagnosis in the supplementals and in that plot we could include the normal.

This makes sense to me.

> Originally I thought the ontology human readable value, but then we don't display that on the front end of the portal so if someone went to go look for a specific sample, I think that would get confusing.

How about we match the portal in the main text, but put an ontology human-readable version of the same figure in the supplement? Edit: maybe this is the same figure that the normal/non-cancerous samples go into?

> If we only have four groups here then are we okay with picking four colors from the Okabe-Ito color palette? Does anyone have any favorite color palettes that we could choose from?

I don't know about using Okabe-Ito, specifically/only because we might want to use Okabe-Ito for the UMAPs. If we don't want to reserve this palette for UMAPs then it definitely seems fine to use here! In terms of other palettes, I generally like Dark2 or a discrete version of the regular viridis, but I really don't feel strongly either way.

allyhawkins commented 10 months ago

> I don't know about using Okabe-Ito, specifically/only because we might want to use Okabe-Ito for the UMAPs. If we don't want to reserve this palette for UMAPs then it definitely seems fine to use here! In terms of other palettes, I generally like Dark2 or a discrete version of the regular viridis, but I really don't feel strongly either way.

I think this is a really good point! If we want to avoid overlapping palettes with any future figures, then we probably want to pick from a palette we aren't likely to use again. I feel like Dark2 could fall into that category too... What about the first 4 colors I circled from Set1?

[Image: screenshot of the Set1 palette with four colors circled]

Another thought, we could pick 4 colors from the cancer group colors that were used in OpenPBTA...

jaclyn-taroni commented 10 months ago

To comment on color palette only:

I don't believe Dark2 or Set1 are particularly accessible, and the yellow in viridis ends up low contrast with a white background, in my opinion.

What about hcl.colors(4, "cividis"): #00214E #545C71 #A89F76 #FFE93F

If we're worried about the contrast with the brightest yellow, we could do hcl.colors(5, "cividis")[1:4]: #00214E #3E4C6E #7C7C7C #BFB170

Or hcl.colors(5, "batlow")[1:4]: #201158 #005E5E #578B21 #E89E6B

Or palette.colors(n = 5, palette = "R4")[-1]: #DF536B #61D04F #2297E6 #28E2E5

Regardless of the palette we choose, we should try to encode this information with facets or panels as well!

Edited to add: I would include a grey color for normals in this palette. The worst thing that happens is you don't use it.
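
One way to put a number on the "contrast with a white background" concern is the WCAG 2.x contrast ratio. The Python sketch below applies it to the hex codes listed above. Note that this metric is an assumption swapped in for illustration; nobody in this thread says it was used, and it does not measure CVD accessibility. The function names are hypothetical.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a '#RRGGBB' color, from 0 (black) to 1 (white)."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(color_a, color_b="#FFFFFF"):
    """WCAG contrast ratio between two colors; ranges from 1 to 21."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def rank_against_white(colors):
    """Sort colors from lowest to highest contrast with a white background."""
    return sorted(colors, key=contrast_ratio)


# Hex codes copied from the comment above, keyed by the R call that produced them.
CANDIDATES = {
    'hcl.colors(4, "cividis")': ["#00214E", "#545C71", "#A89F76", "#FFE93F"],
    'hcl.colors(5, "cividis")[1:4]': ["#00214E", "#3E4C6E", "#7C7C7C", "#BFB170"],
    'hcl.colors(5, "batlow")[1:4]': ["#201158", "#005E5E", "#578B21", "#E89E6B"],
    'palette.colors(n = 5, palette = "R4")[-1]': ["#DF536B", "#61D04F", "#2297E6", "#28E2E5"],
}

if __name__ == "__main__":
    for label, colors in CANDIDATES.items():
        weakest = rank_against_white(colors)[0]
        print(f"{label}: weakest on white is {weakest} "
              f"(ratio {contrast_ratio(weakest):.2f})")
```

The same function could be used to check a candidate grey for the normal samples against the other group colors by passing the second color explicitly.
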

sjspielman commented 10 months ago

> I don't believe Dark2 or Set1 are particularly accessible,

Just noting that of the brewer palettes, only Dark2, Set2, and Paired are CVD-friendly. We should try to stick with something that is generally accessible, but I agree that Dark2 can sometimes be hard to see!

jaclyn-taroni commented 10 months ago

I'm going to respond to @allyhawkins's 3 points all in one place.

  1. I would use submitted_diagnosis for the reason you give – we want people to be able to reference data from the Portal.

  2. Add a non-cancerous group. Pick a grey color for non-cancerous samples. We'll want to check it for contrast with whatever the other 4 groups' colors end up being. If we don't end up needing it, there's very little harm in having it!

  3. I compiled my thoughts on palettes here https://github.com/AlexsLemonade/scpca-paper-figures/issues/10#issuecomment-1863456338. And to record something discussed on Slack:

    > Just noting that of the brewer palettes, only Dark2, Set2, and Paired are CVD-friendly. We should try to stick with something that is generally accessible, but I agree that Dark2 can sometimes be hard to see!

    Once you are at n = 4, only Paired meets this criterion.

    We should prioritize a palette with darker colors because, if we plan to use it throughout the paper, we'll want the flexibility to color points on a white background.

allyhawkins commented 9 months ago

Closing this; any additional discussion can happen in #20.