Open jashapiro opened 3 years ago
Hey @hbeale, we are happy to rerun this but we wanted to ask you if you had an idea of what computational resources (e.g., RAM, cores) were required to do so. Thank you!
I'm going to try to get this running on AWS today. I'll try out 128 GB and see how that goes.
We're going to wait until after v19 (#867) because of #862 - removing a sample from the dataset might (slightly) change the results!
Thanks @cansavvy! If you need to ping us about the performance again, Ellen Kephardt is the most knowledgeable about what resources are needed. I meant to ask her, and then forgot :)
Sounds good, thanks!
Noting v18 was re-run here https://github.com/AlexsLemonade/OpenPBTA-analysis/pull/892, but this has not been re-run with v19 as suggested. At the time of this comment, the current version is v21.
What analysis module should be updated and why?
The addition of samples in v18 will require a rerun of the scripts in comparative-RNASeq-analysis to generate new data.
What changes need to be made? Please provide enough detail for another participant to make the update.
No changes in code should be required (beyond changes already made in #892), but results/rsem-tpm-stranded-gene_expression_outliers.tsv.gz will end up needing an update.
This will require running on a machine with >16GB memory. I am not sure of the exact requirements, but my local machine was not sufficient.
The full analysis can be performed with the following command from within the OpenPBTA docker image:
bash analyses/comparative-RNASeq-analysis/run-comparative-RNAseq.sh
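Since the exact memory requirement is not known, a pre-flight check can save a failed run. The following is a minimal sketch, not part of the repository: the function names and the 16 GiB threshold are assumptions for illustration, and it only works on Linux, where /proc/meminfo is available. Inside a container, /proc/meminfo reports what the kernel sees, which may differ from any memory limit placed on the container.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight memory check; names and threshold are illustrative,
# not part of OpenPBTA-analysis.

# Print total memory in whole GiB, read from a meminfo-style file (Linux only).
total_mem_gib() {
  local meminfo="${1:-/proc/meminfo}"
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "$meminfo"
}

# Succeed only if total memory is strictly greater than the required GiB.
has_enough_memory() {
  local required_gib="$1"
  local meminfo="${2:-/proc/meminfo}"
  local total
  total="$(total_mem_gib "$meminfo")"
  [ -n "$total" ] && [ "$total" -gt "$required_gib" ]
}

# Start the analysis only when the check passes (run from the repository root).
run_if_enough_memory() {
  if has_enough_memory "${1:-16}"; then
    bash analyses/comparative-RNASeq-analysis/run-comparative-RNAseq.sh
  else
    echo "Not enough memory detected; not starting the analysis." >&2
    return 1
  fi
}
```

After sourcing these functions, calling run_if_enough_memory 16 would start the analysis only on a machine reporting more than 16 GiB. Passing this check does not guarantee the run will finish, because the true requirement may be higher.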
What input data should be used? Which data were used in the version being updated?
v18 expression matrices
When do you expect the revised analysis will be completed?
After v18 release (or perhaps wait until after v19?)
Who will complete the updated analysis?
A CCDL member, most likely.