AlexsLemonade / OpenScPCA-analysis

An open, collaborative project to analyze data from the Single-cell Pediatric Cancer Atlas (ScPCA) Portal

Initiation of SCPCP000001 #778

Open georginaalbadri opened 6 days ago

georginaalbadri commented 6 days ago

Please link to the GitHub Discussion for this proposed analysis.

https://github.com/AlexsLemonade/OpenScPCA-analysis/discussions/722

Describe the goals of this analysis module.

To use CellTypist to transfer GBM labels from the Core GBMap (https://www.biorxiv.org/content/10.1101/2022.08.27.505439v1) to the 16 paediatric GBM samples, with the transferred labels validated by a combination of cell clustering methods and differential gene expression analysis.

What software will you require?

Python 3.9 with packages including scanpy, celltypist, gseapy, scipy, matplotlib, seaborn, numpy, pandas

What will your first pull request contain?

The analysis module skeleton created by running create-analysis-module.py and initial documentation in the README.md file

What computational resources will you require?

Only a simplified version of CellTypist, or pre-made models, can be used on a laptop, so this analysis will use AWS to generate a CellTypist model based on GBMap and do the label transfer, in order to use the most accurate form of CellTypist.
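
As a rough illustration of the train-then-transfer workflow described above, here is a minimal sketch using CellTypist's `train` and `annotate` functions. It is not the module's actual code: the function name `transfer_labels`, the `label_key` default, and the `n_jobs` value are hypothetical, and it assumes both AnnData objects hold log1p-normalized expression scaled to 10,000 counts per cell, which is the input CellTypist expects.

```python
def transfer_labels(reference, query, label_key="cell_type", n_jobs=4):
    """Train a CellTypist model on `reference` and annotate `query`.

    `reference` and `query` are AnnData objects (assumed to be
    log1p-normalized to 10,000 counts per cell). `label_key` is a
    hypothetical column in `reference.obs` holding the reference labels.
    """
    # Imported lazily so this file can be loaded where CellTypist is absent.
    import celltypist

    # Fit a custom model on the labelled reference atlas.
    model = celltypist.train(
        reference,
        labels=label_key,
        n_jobs=n_jobs,
        feature_selection=True,
    )

    # Predict labels for the query cells. Majority voting refines the
    # per-cell predictions within over-clustered groups of cells.
    predictions = celltypist.annotate(query, model=model, majority_voting=True)

    # Return the query AnnData with the prediction columns added to .obs.
    return predictions.to_adata()
```

Training on a large reference is the memory- and CPU-heavy step, which is why it is a candidate for AWS; the annotation step on 16 samples is comparatively light.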

If known, when do you expect to file the first pull request?

24/09 or 25/09

jashapiro commented 6 days ago

Hi @georginaalbadri! Thank you for your pending contribution. I had one question based on your comments above:

> Python 3.9 with packages including scanpy, celltypist, gseapy, scipy, matplotlib, seaborn, numpy, pandas

Is there a reason that you need Python 3.9? In general in this project we are trying to start with Python 3.11 as the standard version, partly because of some speed benefits that it brings, but also because it will be supported for longer than 3.9. While conda makes it easy to support multiple versions of Python for different modules, if it is possible to use 3.11 for this module as well, that might be preferred.

georginaalbadri commented 5 days ago

Hi @jashapiro, I was using CellTypist on 3.9, but it looks like it supports 3.11 too, so it should be no problem to use 3.11 instead.

georginaalbadri commented 3 days ago

Hi, I'm getting an error when trying to commit the analysis module to my forked version of the OpenSC repository. It looks like pre-commit has flagged a conflict between the x86_64 version of conda installed in it, and the fact that my laptop is Apple Silicon so requires arm64. Can I just make the module again without the --use-conda flag and then install conda manually? Or can I get pre-commit to ignore this conflict, if we want the module to have the x86_64 conda? Thanks!

jashapiro commented 3 days ago

> Hi, I'm getting an error when trying to commit the analysis module to my forked version of the OpenSC repository. It looks like pre-commit has flagged a conflict between the x86_64 version of conda installed in it, and the fact that my laptop is Apple Silicon so requires arm64. Can I just make the module again without the --use-conda flag and then install conda manually? Or can I get pre-commit to ignore this conflict, if we want the module to have the x86_64 conda? Thanks!

Can you post the exact error you are getting? I don't think that pre-commit is likely to be flagging an error like that, but it may be thinking that some of the version hashes are "secrets". Did you perhaps create the conda file with conda env export? If so, you might try again with the --no-builds flag and see if that helps.
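
To illustrate the suggestion above: by default, `conda env export` pins each dependency as `name=version=build`, and the build string often contains a hash-like fragment that a secrets scanner could plausibly mistake for a credential. The `--no-builds` flag omits that third field. The sketch below mimics the effect on individual dependency strings; the helper name `strip_build` and the build strings shown are made up for illustration, not taken from a real environment.

```python
def strip_build(dep: str) -> str:
    """Reduce a conda 'name=version=build' spec to 'name=version'."""
    parts = dep.split("=")
    # Keep only the name and, if present, the version.
    return "=".join(parts[:2])


# Hypothetical entries as they might appear in an exported environment file.
deps = [
    "python=3.11.9=h0000000_0_example",
    "numpy=1.26.4=py311h0000000_0",
]

for dep in deps:
    print(strip_build(dep))
```

In practice, re-exporting with `conda env export --no-builds` achieves the same result without any post-processing.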

georginaalbadri commented 3 days ago

I see, on further inspection I think it is identifying a problem with conda on the laptop itself so I'll try reinstalling it. Hopefully won't take too long!