AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project

UMAPs highlighting sequencing center #1617

Closed sjspielman closed 1 year ago

sjspielman commented 1 year ago

Closes #1616 Related to https://github.com/AlexsLemonade/OpenPBTA-manuscript/issues/366

This PR begins the process of exploring batch effects that may arise from samples being processed at different sequencing centers. To this end, I created a script (that for now makes panels for figure S6, but this is not necessarily set in stone!) to make three UMAPs, and we can discuss whether all/some/none of these are helpful.

  1. Colored by sequencing center and shaped by broad histology, and considers ALL histologies
  2. Colored by sequencing center and shaped by broad histology, but considers ONLY histologies processed at >1 center
  3. Colored by sequencing center with no indications for histology, and considers ALL histologies
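For anyone who wants to see the filtering behind the second UMAP spelled out, here is a minimal sketch of "keep ONLY histologies processed at >1 center". It is not the code in this PR (the analysis itself is in R). The field names `broad_histology` and `seq_center`, the helper names, and the toy records are all assumptions made for illustration.

```python
from collections import defaultdict


def histologies_at_multiple_centers(samples):
    """Return the set of histologies whose samples span more than one center."""
    centers_by_histology = defaultdict(set)
    for sample in samples:
        # Collect the distinct sequencing centers seen for each histology.
        centers_by_histology[sample["broad_histology"]].add(sample["seq_center"])
    return {
        histology
        for histology, centers in centers_by_histology.items()
        if len(centers) > 1
    }


def filter_samples(samples):
    """Keep only samples whose histology was processed at more than one center."""
    keep = histologies_at_multiple_centers(samples)
    return [s for s in samples if s["broad_histology"] in keep]


# Hypothetical toy records; real metadata would come from the histologies file.
samples = [
    {"sample_id": "S1", "broad_histology": "Histology X", "seq_center": "Center A"},
    {"sample_id": "S2", "broad_histology": "Histology X", "seq_center": "Center B"},
    {"sample_id": "S3", "broad_histology": "Histology Y", "seq_center": "Center A"},
    {"sample_id": "S4", "broad_histology": "Histology Y", "seq_center": "Center A"},
]

filtered = filter_samples(samples)
print([s["sample_id"] for s in filtered])
```

In this toy input only "Histology X" spans two centers, so only its samples are kept. The filtered set is what would then be passed to the dimension reduction and plotting step.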

The resulting UMAPs do not show very strong clustering by sequencing center, but it's a bit tricky to see some of the triangle and square points. If we want to use these plots, it might be helpful to code them out rather than using the plot_dimension_reduction() wrapper function, or we can add some arrows pointing to triangles/squares in Illustrator (or ggplot2 if feeling extra fancy).

The main questions for reviewers are as follows:

sjspielman commented 1 year ago

One initial thought I had is that this might make more sense to start off as an analysis notebook, rather than as a "final" figure script, until we decide what the results we want to present really are.

:100: good call. I think an exploratory notebook in transcriptomic-dimension-reduction is appropriate/good enough for all of this.

sjspielman commented 1 year ago

@jashapiro Thanks for the feedback, this has been updated as a notebook! I also modified the plotting approach overall to one I think is easier to see. HTML preview is here: https://htmlpreview.github.io/?https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/034a40a4e90af380e6e950897154948abadf9d48/analyses/transcriptomic-dimension-reduction/04-explore-sequencing-center-effects.nb.html

I also ended up including polyA data in the tables, but I'm not sure that I needed to. I figured it can't hurt for an exploratory notebook anyway.

sjspielman commented 1 year ago

Figures have been updated with properly rendered legends - 04-explore-sequencing-center-effects.nb.html.zip

jaclyn-taroni commented 1 year ago

@jashapiro giving this a bump

sjspielman commented 1 year ago

I've updated this with a conclusion and a UMAP with arrows pointing to potential "areas of caveat." It was also re-run with V23 and there were no changes, as expected!

04-explore-sequencing-center-effects.nb.html.zip