AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project
Other
100 stars 67 forks source link

Updated analysis: Make default method for immune-deconv NOT CIBERSORT #931

Closed cansavvy closed 3 years ago

cansavvy commented 3 years ago

What analysis module should be updated and why?

CIBERSORT method is the default for immune-deconv but it doesn't actually run because we don't have the script for it. We haven't updated the results on Github in almost a year. See the immune-deconv README for more details on why the default method CIBERSORT can't be run.

I think if we want to encourage this to be ran more often and to be ran more easily, CIBERSORT should not be the default method. Instead the mcp-counter method should be made as default (That's what's actually tested in CI as well).

What changes need to be made? Please provide enough detail for another participant to make the update.

run-immune-deconv.sh's logic needs to be udpated as well as the README.

Perhaps 01-immune-deconv.R needs reordering as well or at the very least documentation updates.

What input data should be used? Which data were used in the version being updated?

v18 data

When do you expect the revised analysis will be completed?

TBD

Who will complete the updated analysis?

TBD

jharenza commented 3 years ago

@komalsrathi - thoughts? Would you be able to update this?

komalsrathi commented 3 years ago

@jharenza yes I'll address this as soon as I can.

komalsrathi commented 3 years ago

@cansavvy

We actually do have the CIBERSORT.R and LM22.txt files required to run Cibersort - I registered and downloaded those and that's how I generated the outputs corresponding to CIBERSORT. Those files are added to the .gitignore because we cannot share them with anybody publicly. Because of the aforementioned reason, we are using MCP-Counter for the CI tests. This issue has been fully discussed here: https://github.com/AlexsLemonade/OpenPBTA-analysis/pull/292 (you might have to go through this entire thread)

Although, it still needs to be updated with v18 so I can do that.

Please let me know your thoughts.

cc: @jaclyn-taroni @jharenza

jaclyn-taroni commented 3 years ago

@komalsrathi my read of this issue is changing the default method to be a method that everyone, regardless of whether or not they have registered and downloaded those files (and put them in the right place!), can run this module. I believe what gave rise to this issue was difficulty rerunning this using the v18 data. It seems reasonable to have a default that doesn't cause failure.

Although, it still needs to be updated with v18 so I can do that.

Agree that we should update with v18.

komalsrathi commented 3 years ago

How does this sound:

@cansavvy please assign it to me so this is on my radar. Thanks!

cansavvy commented 3 years ago

@komalsrathi, It seems like the code changes suggested above, would make it that CIBERSORT.R is no longer to be able to run at all. Is that what we want?

As @jaclyn-taroni mentioned above, my original issue was just about moving the default to not CIBERSORT, but I do not know enough about CIBERSORT.R whether we like to drop it as a method completely.

So, this plan suggested is fine if we really don't think we have a need to ever run CIBERSORT.R again (I just don't know enough to make this call).

komalsrathi commented 3 years ago

@cansavvy sorry my bad - I misunderstood.

I guess I can just remove these lines and explicitly use any methods (mcp_counter, cibersort or xcell) to be used in the parameter here so that the user is not forced to use cibersort if they don't have those files. I'll make those changes and file a PR. Thanks!

komalsrathi commented 3 years ago

@jaclyn-taroni @jharenza @cansavvy I am going to tackle this now.

I was going through the module code/figures and was wondering if we should question the approach of using multiple methods unless the correlation plots and cell types obtained from methods other than xCell add value to the manuscript.

At the time, I did not know much about various methods but now it seems to me that xCell is the most comprehensive deconvolution method (with the largest number of cell types) and pretty widely used in literature vs other deconvolution methods. From a manuscript perspective, would it make sense to stick to a single heatmap and interpretation based on xCell output only? Let me know what you think.

jharenza commented 3 years ago

xCell is the most comprehensive deconvolution method (with the largest number of cell types) and pretty widely used in literature vs other deconvolution methods. From a manuscript perspective, would it make sense to stick to a single heatmap and interpretation based on xCell output only? Let me know what you think.

I think this sounds like a justifiable reason that we can add to the methods. Since you had done the analyses with the others and computed correlations, perhaps we just perform any further plotting with xCell data?

jaclyn-taroni commented 3 years ago

Since you had done the analyses with the others and computed correlations, perhaps we just perform any further plotting with xCell data?

I agree, that's what we've done for figure generation so far:

I haven't dug back into this module's code, but my assumption is that it would be less work to switch to mcp-counter as the default method than ripping a bunch of things out. (This might be a bad or incorrect assumption.)

komalsrathi commented 3 years ago

I see, its good to know that you are only using xCell in the transcriptomic overview, I was suggesting the same i.e. get rid of other methods completely as they are not helpful in the interpretation.

I will change the script such that it will only:

  1. run xCell
  2. generate xCell based heatmap

So there won't be any data/plot associated with any other methods. I'll file a PR soon.

komalsrathi commented 3 years ago

Can be closed.