Right now, figures/mapping-histology-labels.Rmd creates display_group and cancer_group palettes by randomly assigning colors from large sets of hex codes. display_group, which is based on broad_histology, is broader than cancer_group, and all tumors from a cancer_group will map to a single broad_histology (https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/1174#issuecomment-915372913).
Because it will be difficult to create palettes to accommodate all the cancer_group labels, we have decided to:
Create a broad_histology palette with an individual hex code for all broad_histology values where at least one cancer_group under that broad_histology has n >= 10.
Create a cancer_group palette with an individual hex code for every cancer_group with n >= 10, where the colors will be related to the broad_histology color but vary in brightness/saturation/hue such that they can be distinguished from one another (to the best of our ability).
All other labels in broad_histology or cancer_group will map to Other and be colored grey.
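The selection rule above can be sketched in a few lines. This is a minimal illustration in Python (standard library only), not the project's R code. The function names `palette_labels` and `display_label` and the `histology_*` / `group_*` labels are hypothetical placeholders, and the input is assumed to be one (broad_histology, cancer_group) pair per tumor.

```python
from collections import Counter

MIN_N = 10  # threshold from the text: n >= 10


def palette_labels(samples, min_n=MIN_N):
    """Return the labels that get their own hex code.

    samples: iterable of (broad_histology, cancer_group) pairs, one per tumor.
    Missing cancer_group values (None) are never counted.
    """
    samples = list(samples)
    group_counts = Counter(cg for _, cg in samples if cg is not None)
    # A cancer_group keeps its own color when it has at least min_n tumors.
    kept_groups = {cg for cg, n in group_counts.items() if n >= min_n}
    # A broad_histology keeps its own color when at least one of its
    # cancer_groups was kept.
    kept_histologies = {bh for bh, cg in samples if cg in kept_groups}
    return kept_histologies, kept_groups


def display_label(value, kept):
    """Map any label without its own color to "Other"."""
    return value if value in kept else "Other"


# Hypothetical counts, for illustration only.
samples = (
    [("histology_A", "group_A1")] * 12
    + [("histology_A", "group_A2")] * 3
    + [("histology_B", "group_B1")] * 9
)
kept_histologies, kept_groups = palette_labels(samples)
```

With these made-up counts, only `histology_A` and `group_A1` keep individual colors; `group_A2`, `group_B1` and `histology_B` would all be displayed as Other.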
Next steps
Here are the steps that I am aware of right now:
Update figures/mapping-histology-labels.Rmd. We no longer need to randomly select colors, and the output (currently figures/palettes/histology_label_color_table.tsv) does not need a row for every specimen at this stage in the project. Instead, I think we can create a much more minimal table with the following columns:
broad_histology
cancer_group
broad_histology_hex
cancer_group_hex
Beyond the values that get individual hex codes, which I list in the Palettes section below, we will also list Other and NA hex codes here. My idea is to have logic in all of our viz code that checks for matching values of broad_histology and/or cancer_group and, where no match is found, recodes the value to Other. (Depending on what I find in the next steps, this plan might change.)
Update figures/README.md with new instructions for using this palette.
Replace all the instances of how we currently use the histology_label_color_table with this new method (I very much expect breaking changes).
This perhaps should get broken out in some way depending on the scope of the changes.
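To make the proposed lookup concrete, here is a rough sketch of the recode-to-Other logic against a minimal four-column table. It is written in Python with the standard library only, rather than in the project's R code. Every label and hex code in the table is a placeholder (including the grey used for Other and the color used for NA), and the helper names `load_palette`, `recode` and `colors_for` are hypothetical.

```python
import csv
import io

# Placeholder palette table with the four proposed columns.
# Labels and hex codes are made up for illustration.
PALETTE_TSV = """broad_histology\tcancer_group\tbroad_histology_hex\tcancer_group_hex
histology_A\tgroup_A1\t#aa0000\t#cc3333
histology_A\tgroup_A2\t#aa0000\t#ee6666
Other\tOther\t#808080\t#808080
NA\tNA\t#f1f1f1\t#f1f1f1
"""


def load_palette(text):
    """Build label -> hex lookups for both label columns."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    bh_hex = {r["broad_histology"]: r["broad_histology_hex"] for r in rows}
    cg_hex = {r["cancer_group"]: r["cancer_group_hex"] for r in rows}
    return bh_hex, cg_hex


def recode(value, known):
    """Keep known labels, send missing values to NA and the rest to Other."""
    if value is None:
        return "NA"
    return value if value in known else "Other"


def colors_for(broad_histology, cancer_group, bh_hex, cg_hex):
    """Return (broad_histology_hex, cancer_group_hex) for one specimen."""
    return (
        bh_hex[recode(broad_histology, bh_hex)],
        cg_hex[recode(cancer_group, cg_hex)],
    )


bh_hex, cg_hex = load_palette(PALETTE_TSV)
matched = colors_for("histology_A", "group_A2", bh_hex, cg_hex)
unmatched = colors_for("histology_A", "group_unseen", bh_hex, cg_hex)
```

Because the table is keyed by label rather than by specimen, plotting code would only need the two lookups; any label missing from the table falls through to the Other row.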
Context
Related to #1174
I am planning to complete these steps myself.
Palettes
From https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/1174#issuecomment-916282123
broad_histology
#590024
#ff80e5
#220040
#2200ff
#0074d9
#8f8fbf
#2db398
#7fbf00
#685815
#ffaa00
#b2502d
broad_histology
cancer_group
#592d3e
#ffccf5
#ff40d9
#bf0099
#4d0d85
#7739ad
#9426fb
#2200ff
#058aff
#8c8cff
#000080
#2db398
#9fbf60
#614e01
#e6ac39
#ab7200
#b33000
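The hex codes above come from the linked comment; they were not generated by code. Purely to illustrate what "related to the broad_histology but vary in brightness/saturation" could mean, the sketch below derives several shades from one base color by holding hue and saturation fixed and spacing lightness evenly. It is Python, standard library only. The function name `shades` and the lightness range 0.35 to 0.8 are assumptions, not values from this issue.

```python
import colorsys


def hex_to_rgb(hex_code):
    """Convert '#rrggbb' to an (r, g, b) tuple of floats in [0, 1]."""
    h = hex_code.lstrip("#")
    return tuple(int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of floats in [0, 1] to '#rrggbb'."""
    return "#" + "".join(f"{round(c * 255):02x}" for c in rgb)


def shades(base_hex, n, lo=0.35, hi=0.8):
    """Return n colors sharing the base hue and saturation.

    Lightness is spaced evenly between lo and hi (assumed defaults).
    For n == 1 the base color is returned unchanged.
    """
    if n == 1:
        return [base_hex]
    hue, _, sat = colorsys.rgb_to_hls(*hex_to_rgb(base_hex))
    return [
        rgb_to_hex(colorsys.hls_to_rgb(hue, lo + (hi - lo) * i / (n - 1), sat))
        for i in range(n)
    ]


# Four shades derived from one of the broad_histology colors listed above.
blue_shades = shades("#2200ff", 4)
```

Evenly spaced lightness is only one option; hand-picked values, as used for the palette above, allow finer control over how distinguishable neighboring groups are.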