raphael-group / chisel

CHISEL -- Copy-number Haplotype Inference in Single-cell by Evolutionary Links
BSD 3-Clause "New" or "Revised" License

chisel_prep #31

Open pulimeng opened 10 months ago

pulimeng commented 10 months ago

Hi,

Thanks for the great work! I am working on a single-cell copy-number analysis problem. My data is not barcoded: it consists of .bam files for hundreds of individual cells, so I am using chisel_prep to generate a barcoded input for chisel_nonormal. However, the entire process is taking an exceptionally long time. It has been running for more than a week and only 10 cells/files have been processed. Is there something I did wrong? Or are there any options to speed up the process? This is the command I'm using:

chisel_prep pathtofoldre/bams/*.bam -r ./refs/hg38.fa -o ./outputs/ -j 32

Thanks, LP

simozacca commented 10 months ago

Thank you for your email. In order to evaluate your issue, can you please provide the log of the command, as well as some details about the BAMs that you are working with (especially size, etc.)? Also, can you please confirm that you are running on a server on which you definitely have 32 cores fully available to you?
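
The details requested above (BAM sizes and available cores) could be collected with something like the following sketch. `report_inputs` is a hypothetical helper name, not part of CHISEL, and it assumes that `nproc` (GNU coreutils) or `getconf` is available and that all BAMs sit in one directory.

```shell
# Hypothetical helper (not part of CHISEL): print the number of CPU cores
# visible to this shell and the on-disk size of every BAM in a directory.
report_inputs() {
    bam_dir="$1"
    # nproc comes with GNU coreutils; getconf is used as a fallback.
    echo "cores: $(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)"
    for bam in "$bam_dir"/*.bam; do
        # Skip the unexpanded pattern when the directory has no BAM files.
        [ -e "$bam" ] || continue
        # du -h prints a human-readable size followed by the file path.
        du -h "$bam"
    done
}

# Example call (the directory is a placeholder; substitute your own):
# report_inputs /path/to/bams
```

Note that on a shared cluster the core count reported here may differ from what the scheduler actually grants to a job, so it is worth running the check inside the job itself.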

pulimeng commented 10 months ago

I do have all cores available for use. I don't know where to find the log: the run never finished, so I just terminated the script. My data is single-cell BAM files with 10x coverage, so each file is around 20 GB.

simozacca commented 10 months ago

CHISEL always writes its entire log to standard output; see an example in the CHISEL capsule. So if you are running within a computing cluster, you should look in the file where stdout is saved, or if you are running on a single server you should save the log with either chisel [...] &> chisel.log or chisel [...] |& tee chisel.log. I am afraid the log is needed to be able to help.
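
As a minimal sketch of the log capture described above, the wrapper below sends both stdout and stderr of any command to a file using only POSIX redirection, so it also works in shells that lack `&>` and `|&`. `run_with_log` is a hypothetical helper name, not part of CHISEL.

```shell
# Hypothetical helper (not part of CHISEL): run a command and write both its
# stdout and stderr to a log file, keeping the command's own exit status.
run_with_log() {
    log_file="$1"
    shift
    # "> file 2>&1" is the portable equivalent of bash's "&> file".
    "$@" > "$log_file" 2>&1
}

# Example call, reusing the command from this thread (paths as posted):
# run_with_log chisel_prep.log chisel_prep pathtofoldre/bams/*.bam -r ./refs/hg38.fa -o ./outputs/ -j 32
```

While the job is running, `tail -f chisel_prep.log` in another terminal shows the output as it is written.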

pulimeng commented 9 months ago

Hmmm. I cannot locate the log files. Maybe I deleted them accidentally. Anyway, do you have an expected time for how long chisel_prep should run for, say, 100 cells?

simozacca commented 9 months ago

For more than 2500 cells it should not take more than 2 days on a standard 20-CPU server. Do you see the progress bar advancing? In any case, I would recommend re-running and saving the log so that it can be debugged.

pulimeng commented 9 months ago

Thanks for the info. I will rerun them. My previous experience was quite different: according to the progress bar, it took one day to process 10 cells. Is there anything wrong with my command? Each single-cell BAM has around 10x coverage.