d3b-center / hope-cohort-analysis

Analysis for HOPE cohort

October 2023 figures updates #77

Closed mkoptyra closed 9 months ago

mkoptyra commented 11 months ago

per Nicole: ''' please help me out by following these guidelines

Use Arial font. Otherwise, you must figure out how to embed your fonts. To embed fonts, you might follow this procedure: (1) Open the file you want to embed fonts in. (2) On the application (PowerPoint or Word) menu, select Preferences. (3) In the dialog box, under Output and Sharing, select Save. (4) Under Font Embedding, select Embed fonts in the file.

Prefer vectorized file formats (e.g., PDF or SVG) whenever possible. Vectorized formats maintain individual object distinctions allowing for independent modification. Avoid settings such as use_raster = TRUE in preparing heatmaps and avoid using ggrastr unless necessary. Especially avoid rasterizing text.

When saving figures in a vectorized format, avoid using Dingbats. For example, in R, you can save your files like this: pdf(file = "myfile.pdf", useDingbats = FALSE)

If used, color should be encoded as RGB, and to accommodate all viewers, red and green should not be used together '''

mkoptyra commented 11 months ago

HOPE_color_scheme.xlsx

mkoptyra commented 11 months ago

compare_HOPE_v2plot annotation.txt

mkoptyra commented 11 months ago

@Komal - could you rerun these according to the updated cohort annotation?

cascade plots : https://github.com/d3b-center/hope-cohort-analysis/tree/master/analyses/oncoplots/results/cascade_plots_add_tumor_only

gene correlation analysis: https://github.com/d3b-center/hope-cohort-analysis/tree/master/analyses/oncoplots/results/correlation_analysis

ALT status: https://github.com/d3b-center/hope-cohort-analysis/tree/master/analyses/alt-analysis/results

Circos plot with grade annotation on histology https://github.com/d3b-center/hope-cohort-analysis/blob/master/analyses/data-availability/results/hope_cohort_data_availability_clinical_v2.pdf

komalsrathi commented 11 months ago

Yes, will do. When do you need this by?

mkoptyra commented 11 months ago

Could you do this by Friday morning?

komalsrathi commented 11 months ago

Ok, I have added the annotation file in the code where applicable to filter out samples. Instead of the master branch, can you check the results on the rerun-analyses branch: https://github.com/d3b-center/hope-cohort-analysis/tree/rerun-analyses? That's where I have been making all recent updates.

Here are corresponding links for easy access:

cascade plots: https://github.com/d3b-center/hope-cohort-analysis/tree/rerun-analyses/analyses/oncoplots/results/cascade_plots_add_tumor_only

gene correlation analysis: https://github.com/d3b-center/hope-cohort-analysis/tree/rerun-analyses/analyses/oncoplots/results/correlation_analysis

ALT status: https://github.com/d3b-center/hope-cohort-analysis/tree/rerun-analyses/analyses/alt-analysis/results

Circos plot with grade annotation on histology: https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/data-availability/results/hope_clinical_data_availability_age_continuous.pdf

mkoptyra commented 10 months ago

@komalsrathi does this analysis remove subtypes with <3 cases? https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/survival-analysis/plots/t_n_telomere_content_vs_survival_multivariate.pdf

And same question for any HR_ evaluations in this folder https://github.com/d3b-center/hope-cohort-analysis/tree/rerun-analyses/analyses/survival-analysis/plots

komalsrathi commented 10 months ago

Yes, subtypes with < 3 cases have been removed. The survival analysis was given an OK by @jharenza.
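For readers following along, the "< 3 cases" filter described above can be sketched as follows. This is a minimal illustration in Python, not the repository's actual code: the record layout, the `subtype` key, and the sample values are all hypothetical.

```python
from collections import Counter

MIN_CASES = 3  # threshold quoted in the thread


def drop_rare_subtypes(samples, min_cases=MIN_CASES):
    """Keep only samples whose subtype has at least `min_cases` members."""
    counts = Counter(s["subtype"] for s in samples)
    return [s for s in samples if counts[s["subtype"]] >= min_cases]


# Hypothetical sample records, for illustration only.
samples = (
    [{"id": f"A{i}", "subtype": "subtype_A"} for i in range(4)]
    + [{"id": f"B{i}", "subtype": "subtype_B"} for i in range(2)]
    + [{"id": f"C{i}", "subtype": "subtype_C"} for i in range(3)]
)

kept = drop_rare_subtypes(samples)
kept_subtypes = sorted({s["subtype"] for s in kept})
print(kept_subtypes)
```

Counting before filtering matters here: the threshold applies to the subtype's size in the cohort being analyzed, so the counts should be recomputed whenever the cohort annotation changes.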

mkoptyra commented 10 months ago

@komalsrathi Could I ask for a tumor-only cascade plot by age with two age groups (there is already a plot with three age groups)? https://github.com/d3b-center/hope-cohort-analysis/tree/rerun-analyses/analyses/oncoplots/results/cascade_plots_add_tumor_only

komalsrathi commented 10 months ago

This is super short notice but I will try to do it in the morning. Thanks.

komalsrathi commented 10 months ago

@mkoptyra I have added the additional plots in the same folder with the suffix _two_age_groups.pdf

mkoptyra commented 10 months ago

Thank you @komalsrathi - is the cut-off for the somatic alterations a 6% cohort frequency?

mkoptyra commented 10 months ago

@komalsrathi - thank you so much for the ALT and age/sex correlation with the adjusted cohort. It seems that there is an age correlation with two groups. Do you recall if we had that significant before (not super important, but curious since I don't recall)?

jharenza commented 10 months ago

> @komalsrathi - thank you so much for the ALT and age/sex correlation with the adjusted cohort.
>
> It seems that there is an age correlation with two groups.
>
> Do you recall if we had that significant before (not super important, but curious since I don't recall)?

Please also remember that subtypes are enriched in certain age groups so this by default makes the conclusions age-related, but I'm inclined to say these are tumor intrinsic rather than age intrinsic factors. We would need additional analyses - within age group analyses, I believe, to tease out specific questions related to phenotypes. So while we can say "ALT status is different between age groups"- we are not comparing apples to apples here. It's not all HGG, H3 wildtype here.

We can review this next week before we present anything in the hope meetings - please let's review all before any discussions in hope meetings

mkoptyra commented 10 months ago

It is telomere content (not the ALT status), and the main question of the first figure concerns age-related features. I see that as a transparently indicated tendency, not necessarily a big statement.

mkoptyra commented 10 months ago

@komalsrathi For the oncoplot and the somatic calls presented on the figure, can I ask to change the genes to the top 20? (I think the current cut-off is >6%.)

And could you add annotation with the genomic subtypes?

komalsrathi commented 10 months ago

I have added top 20 alterations to the oncoplots + added molecular subtypes to the annotation.
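The two gene-selection rules discussed here (a cohort-frequency cutoff such as >6% versus a fixed top-N list) can behave differently on the same data. The sketch below contrasts them in Python; the gene names, sample IDs, cohort size, and function names are hypothetical, and it is not the code used to build the oncoplots.

```python
def gene_frequencies(alterations, n_samples):
    """Fraction of samples carrying at least one alteration in each gene."""
    return {
        gene: len(set(sample_ids)) / n_samples
        for gene, sample_ids in alterations.items()
    }


def select_by_cutoff(freqs, cutoff):
    """Genes altered in strictly more than `cutoff` of the cohort."""
    return sorted(g for g, f in freqs.items() if f > cutoff)


def select_top_n(freqs, n):
    """The n most frequently altered genes; ties broken alphabetically."""
    ranked = sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    return [g for g, _ in ranked[:n]]


# Hypothetical gene -> altered sample IDs mapping in a cohort of 20 samples.
alterations = {
    "GENE_A": ["s1", "s2", "s3"],
    "GENE_B": ["s1"],
    "GENE_C": ["s2", "s4"],
}
freqs = gene_frequencies(alterations, n_samples=20)

by_cutoff = select_by_cutoff(freqs, cutoff=0.06)
top_one = select_top_n(freqs, n=1)
print(by_cutoff, top_one)
```

A frequency cutoff yields a gene list whose length changes with the cohort, while a top-N rule fixes the figure height, which is one practical reason to prefer it for a main-text panel.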

mkoptyra commented 10 months ago

Thank you so much. This looks great - I added some flags :) cascade_orderby_age_two_age_groups=FLAGs copy

mkoptyra commented 10 months ago

@komalsrathi few more questions, comments:

  1. We will need an additional plot similar to the oncoplot you created (the one that is the base for the above picture): https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/oncoplots/results/cascade_plots_add_tumor_only/cascade_orderby_age_two_age_groups.pdf with all the oncogenes on the list (not only top 20). That long version of figure will go to the supplementary data for the Hope manuscript.

  2. For the following graph: https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/alt-analysis/results/telomere_content_vs_age_two_groups.pdf Could you embed also tumor location on this graph? I am thinking of adding tumor_location colors to this graph

  3. Can I confirm the ALT status and age correlation was done with the refreshed cohort (same as used for recent circos and oncoprint plots)? https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/alt-analysis/results/alt_status_chisq_output.tsv

  4. Just a heads up - Jo Lynne is working on a modified list of tumor locations to adopt from OpenPedCan for the Hope cohort. That may require creating additional versions (not replacements) of a few graphs (circos plot, oncoprint, and ALT vs age with tumor location as in point 2).
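As background for point 3, a chi-square test of independence between ALT status and age group compares observed counts with the counts expected if the two were unrelated. The Python sketch below computes the Pearson statistic for a hypothetical 2x2 table; the counts are invented for illustration and have no connection to alt_status_chisq_output.tsv. It omits the Yates continuity correction, which R's chisq.test applies by default to 2x2 tables, so results from R may differ.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts: rows = age groups, columns = ALT positive / negative.
table = [
    [10, 20],
    [20, 10],
]
stat = chi_square_statistic(table)
p = p_value_df1(stat)
print(round(stat, 3), round(p, 4))
```

The closed-form p-value above is only valid for one degree of freedom (a 2x2 table); larger tables need the general chi-square distribution.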

komalsrathi commented 10 months ago

Sure - will work on this.

komalsrathi commented 10 months ago
> with all the oncogenes on the list (not only top 20). That long version of figure will go to the supplementary data for the Hope manuscript.

Oncogenes from what source? Would the reference gene list provided in annoFuse be ok: https://github.com/d3b-center/annoFuseData/blob/master/inst/extdata/genelistreference.txt? I would filter by any rows that have Oncogene under the type column.

Clarifying so that there are no conflicts later on.
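The proposed filter (keep rows that have Oncogene under the type column) could look like the Python sketch below. The inline excerpt is hypothetical: the `Gene_Symbol` column name, the gene names, and the assumption that one row can list several comma-separated types are illustrative guesses, not the verified layout of genelistreference.txt.

```python
import csv
import io

# Hypothetical tab-delimited excerpt standing in for the reference gene list.
# Column names and values are assumptions made for this example.
reference_text = (
    "Gene_Symbol\ttype\n"
    "GENE_A\tOncogene\n"
    "GENE_B\tTumorSuppressorGene\n"
    "GENE_C\tKinase, Oncogene\n"
)


def oncogenes(handle, symbol_column="Gene_Symbol", type_column="type"):
    """Return gene symbols from rows whose type field mentions 'Oncogene'."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row[symbol_column]
        for row in reader
        if "Oncogene" in row[type_column]
    ]


genes = oncogenes(io.StringIO(reference_text))
print(genes)
```

A substring match keeps rows that carry several annotations at once; an exact match on the type field would drop them, so the choice is worth agreeing on before the supplementary figure is generated.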

> Could you embed also tumor location on this graph? I am thinking of adding tumor_location colors to this graph

Confused how you are envisioning this. Could you draw or explain as text and send it over?

> Can I confirm the ALT status and age correlation was done with the refreshed cohort (same as used for recent circos and oncoprint plots)?

Yes everything has been updated.

cc: @jharenza if you have any inputs.

komalsrathi commented 10 months ago

> Could you embed also tumor location on this graph?

Did you mean something like this? telomere_content_vs_age_two_groups_by_tumor_loc.pdf

komalsrathi commented 10 months ago

I was able to get this done with the help of @zzgeng. A few points to note here:

1) Version 1: The boxplot colors are not retained (shades of green in the original boxplot) and they are now black

telomere_content_vs_age_two_groups_by_tumor_loc.pdf

2) Version 2: The boxplot colors are retained and they are used for the points border as well

telomere_content_vs_age_two_groups_by_tumor_loc.pdf

jharenza commented 10 months ago

> I was able to get this done with the help of @zzgeng. Few points to note here:
>
> 1. Version 1: The boxplot colors are not retained (shades of green in the original boxplot) and they are now black
>
> telomere_content_vs_age_two_groups_by_tumor_loc.pdf
>
> 2. Version 2: The boxplot colors are retained and they are used for the points border as well
>
> telomere_content_vs_age_two_groups_by_tumor_loc.pdf

@mkoptyra I thought what we wanted to do here was color the subtype, since the caveat to this is that all of the DHG will be in the older age group and are also ALT+, and we want to call that out. @komalsrathi for that, you can use the cancer_group_short. For the oncoprint subtype field, please use that (in v2) since it will be fewer groups.

komalsrathi commented 10 months ago

Notes from our meeting (11/29/23):

Oncoplots: 1) Liftover genes from PNOC003 to Gencode v39. 2) Generate oncoplots with full gene list (i.e. OpenPedCan + MMR genes + PNOC003) + top 20 genes only. 3) Add cancer_group_short and CNS_region to oncoplot top annotations

Data availability circos plots: 1) Create an additional version of circos plot (with continuous age) with CNS_region instead of HOPE_Tumor.Location.condensed.

cc: @jharenza @mkoptyra

Please add/edit where applicable.

mkoptyra commented 10 months ago

Adding: Create an additional version of the ALT telomere content plot by two age groups with CNS_region instead of HOPE_Tumor.Location.condensed: https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/alt-analysis/results/telomere_content_vs_age_two_groups_by_tumor_loc.pdf

komalsrathi commented 10 months ago

@mkoptyra please check and let me know if I missed anything:

1) Updated gene lists to hg38

2) Updated Cascade plots with 2 age groups + tumor only samples + all genes + CNS_region and Cancer_Group (i.e. cancer_group_short) annotations: https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/oncoplots/results/cascade_plots_add_tumor_only/cascade_two_age_groups_allgenes.pdf

3) Updated Cascade plots with 2 age groups + tumor only samples + top 20 genes + CNS_region and Cancer_Group (i.e. cancer_group_short) annotations: https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/oncoplots/results/cascade_plots_add_tumor_only/cascade_two_age_groups_top20genes.pdf

4) Telomere content vs two age groups colored by CNS_region: https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/alt-analysis/results/telomere_content_vs_age_two_groups_by_cns_region.pdf

5) Data availability circos plot with CNS_region instead of Tumor location condensed: https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/data-availability/results/hope_clinical_data_availability_age_continuous_cns_region.pdf

mkoptyra commented 9 months ago

Hi Komal; For the data availability circos plot with Tumor location: https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/data-availability/results/hope_clinical_data_availability_age_continuous.pdf

  1. Is it possible to integrate the annotation between the rings? Below is a modification I made in Photoshop with the annotation on the rings, but annotations above or between the rings would be ideal: hope_clinical_data_availability_age_continuous-MOIDF

  2. The color of the outer age ring: is it possible to use one color changing into another color? We got feedback that changes in the blue gradient in the peds section of the ring may be hard to notice. Wondering if you have any thoughts about this.

mkoptyra commented 9 months ago

Regarding the Updated Cascade plots with 2 age groups + tumor only samples + top 20 genes + CNS_region and Cancer_Group (i.e. cancer_group_short) annotations: https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/oncoplots/results/cascade_plots_add_tumor_only/cascade_two_age_groups_orderby_age_top20genes.pdf

I made the following modifications in Photoshop: cascade_two_age_groups_orderby_age_top20genes (1) copy

These changes included the following:

Additional changes needed:

mkoptyra commented 9 months ago

For the Updated Cascade plots with 2 age groups + tumor only samples + all genes + CNS_region and Cancer_Group (i.e. cancer_group_short) annotations: https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/oncoplots/results/cascade_plots_add_tumor_only/cascade_two_age_groups_allgenes.pdf

I made the following modifications in Photoshop: cascade_two_age_groups_allgene-MODIF

These changes included the following:

Additional changes needed:

mkoptyra commented 9 months ago

For the Telomere content vs two age groups colored by Tumor_location: https://github.com/d3b-center/hope-cohort-analysis/blob/rerun-analyses/analyses/alt-analysis/results/telomere_content_vs_age_two_groups_by_tumor_loc.pdf

Is the p-value adjusted for tumor location? I ask because midline tumors are mostly in the 0-15 age group, so location may bias the difference in ALT telomere content.
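One simple way to look at the confounding concern raised here is to compare age groups within each tumor location rather than across the pooled cohort. The Python sketch below tabulates the median telomere content per (location, age group) stratum. All values, the ">15" label, and the "other" location are hypothetical, and this descriptive summary does not answer whether the reported p-value was adjusted.

```python
from collections import defaultdict
from statistics import median


def median_by_stratum(records):
    """Median telomere content for each (location, age group) pair."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["location"], rec["age_group"])
        groups[key].append(rec["telomere_content"])
    return {key: median(values) for key, values in groups.items()}


# Hypothetical records, for illustration only.
records = [
    {"location": "midline", "age_group": "0-15", "telomere_content": 1.0},
    {"location": "midline", "age_group": "0-15", "telomere_content": 1.2},
    {"location": "midline", "age_group": ">15", "telomere_content": 1.4},
    {"location": "other", "age_group": "0-15", "telomere_content": 0.8},
    {"location": "other", "age_group": ">15", "telomere_content": 1.6},
    {"location": "other", "age_group": ">15", "telomere_content": 2.0},
]

medians = median_by_stratum(records)
for (location, age_group), value in sorted(medians.items()):
    print(location, age_group, value)
```

If the age-group difference persists within each location, location alone is less likely to explain it; if a stratum has very few samples, as midline tumors in the older group may, the within-stratum comparison will be weak and should be reported as such.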