Katsevich-Lab / sceptre

An R package for single-cell CRISPR screen data analysis emphasizing statistical rigor, massive scalability, and ease of use.
https://katsevich-lab.github.io/sceptre/
GNU General Public License v3.0

Is faster version high-moi sceptre out yet? #36

Closed redbybeing closed 1 year ago

redbybeing commented 1 year ago

Hello!

Just wondering when the faster version high-moi sceptre will be out.

I think high-moi sceptre is working well for my negative and positive control pairs, and I am excited to run my candidate 1-2 million pairs.

Thanks, Jiseok

ekatsevi commented 1 year ago

> I think high-moi sceptre is working well for my negative and positive control pairs, and I am excited to run my candidate 1-2 million pairs.

That's great to hear!

> Just wondering when the faster version high-moi sceptre will be out.

We plan to release the improved high-MOI functionality by mid-July. For context, preliminary benchmarking on 40,000 of the Gasperini cells suggests that the speed will increase from roughly 2-3 perturbation-gene pairs per second to 45-60 pairs per second. Assuming your data also have roughly 40,000 cells, analyzing 1-2 million pairs would take on the order of a week with the existing software. With the new software, your computation could complete within half a day or so. So your two options at this stage are the following:

  1. Launch the computation right away using the existing software, and let it run for a week or so.
  2. Wait a few weeks for us to come out with the faster software, and then run it for half a day or so.
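As a rough check on these estimates, the arithmetic can be sketched as below. This assumes the quoted throughputs (2-3 and 45-60 pairs per second) hold for the whole run on a single process; the function name `runtime_range_days` is made up for illustration and is not part of sceptre.

```python
SECONDS_PER_DAY = 86_400

def runtime_range_days(n_pairs_low, n_pairs_high, rate_low, rate_high):
    """Return (best-case, worst-case) runtime in days.

    Best case: fewest pairs at the fastest rate.
    Worst case: most pairs at the slowest rate.
    Rates are in pairs per second.
    """
    best = n_pairs_low / rate_high / SECONDS_PER_DAY
    worst = n_pairs_high / rate_low / SECONDS_PER_DAY
    return best, worst

# Throughputs quoted in the thread for ~40,000 cells.
existing = runtime_range_days(1_000_000, 2_000_000, 2, 3)
improved = runtime_range_days(1_000_000, 2_000_000, 45, 60)

print(f"existing: {existing[0]:.1f} to {existing[1]:.1f} days")
print(f"improved: {improved[0] * 24:.1f} to {improved[1] * 24:.1f} hours")
# existing: 3.9 to 11.6 days
# improved: 4.6 to 12.3 hours
```

Both ranges are consistent with "on the order of a week" and "half a day or so".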

redbybeing commented 1 year ago

Thanks! Now I have ~70,000 cells, so I guess it can take around two weeks to run with current high-moi sceptre. I'll see if our computing system can do that. If it crashes then I will wait for the faster version.

redbybeing commented 1 year ago

Hi Eugene,

I have a follow-up question.

Our computing core allows only up to 11 days of consecutive running, unless I submit a special request, and they asked me if there's a way of reducing run time first, like parallelizing the job.

So for example if I have 2 million candidate pairs to run, can I split the job into ten and run 200k pairs per job, in parallel? Will that give the same results as running the 2 million pairs in one job? I'm not sure if SCEPTRE is designed to work that way.

Thanks, Jiseok

ekatsevi commented 1 year ago

> So for example if I have 2 million candidate pairs to run, can I split the job into ten and run 200k pairs per job, in parallel? Will that give the same results as running the 2 million pairs in one job? I'm not sure if SCEPTRE is designed to work that way.

Parallelizing across pairs is a great idea! SCEPTRE will give you the same results no matter how you split the pairs.

redbybeing commented 1 year ago

Oh that's great! Then for each split job, should I include the same negative and positive control pairs? Or no need for that? (Run negative and positive control pairs in just one job and use that as reference for all other jobs).

ekatsevi commented 1 year ago

Please see this answer regarding the relationship between the control pairs and the candidate pairs. Ideally you would check the calibration of SCEPTRE based on your negative control pairs before moving on to analyzing your candidate pairs. If you want, you can also run your control pairs and your candidate pairs in parallel to each other. There is no need to include your control pairs together with your candidate pairs during parallelization.

I recommend you parallelize your SCEPTRE jobs by splitting the genes. So if you have 30,000 genes and you want to have 10 parallel jobs, then each job would contain all pairs involving 3,000 genes and all of your gRNA groups. Choosing this grouping will accelerate SCEPTRE because of the way it recycles computation across genes.
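The gene-wise split described above can be sketched as follows. This is plain Python for preparing the per-job pair lists, not sceptre code: the function `split_pairs_by_gene`, the `(grna_group, gene)` tuple layout, and the toy identifiers are all assumptions for illustration. Each resulting list would still need to be written out and passed to a separate SCEPTRE job.

```python
from collections import defaultdict

def split_pairs_by_gene(pairs, n_jobs):
    """Partition (grna_group, gene) pairs into n_jobs lists so that
    every pair involving a given gene lands in the same job."""
    genes = sorted({gene for _, gene in pairs})
    # Assign genes to jobs in contiguous, near-equal blocks.
    job_of_gene = {gene: i * n_jobs // len(genes) for i, gene in enumerate(genes)}
    jobs = defaultdict(list)
    for grna_group, gene in pairs:
        jobs[job_of_gene[gene]].append((grna_group, gene))
    return [jobs[j] for j in range(n_jobs)]

# Toy example: 3 gRNA groups x 6 genes, split across 3 jobs.
toy_pairs = [(f"grna_{g}", f"gene_{k}") for g in range(3) for k in range(6)]
toy_jobs = split_pairs_by_gene(toy_pairs, n_jobs=3)

for j, job in enumerate(toy_jobs):
    print(j, sorted({gene for _, gene in job}), len(job))
```

In the toy example each job receives two genes and all three gRNA groups for those genes. Note that this balances the number of genes per job, not the number of pairs; if some genes are paired with many more gRNA groups than others, job sizes will differ.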

redbybeing commented 1 year ago

Thank you so much for your help Eugene!

ekatsevi commented 1 year ago

Hi Jiseok, how is your analysis going? I wanted to let you know that an early version of the improved high-MOI functionality is now available. Whether for your current application or any future ones, please keep it in mind.

redbybeing commented 1 year ago

Wow! Thanks for letting me know! Parallelization with the previous high-moi sceptre ran without a problem. But I will definitely try the updated version to test more conditions and samples soon! Thank you!

ekatsevi commented 1 year ago

> Parallelization with the previous high-moi sceptre ran without a problem

Excellent!

> But I will definitely try the updated version to test more conditions and samples soon! Thank you!

Sounds great!

ekatsevi commented 1 year ago

Closing this issue.