KChen-lab / Monopogen

SNV calling from single cell sequencing
GNU General Public License v3.0

v1.5.0 Monopogen fails on example data in "Somatic SNV calling from scRNA-seq" #56

Closed briansha closed 4 months ago

briansha commented 5 months ago

Using v1.5.0, the step that extracts single-cell read information (featureInfo) fails on the GitHub example data from the documentation.

python ${path}/src/Monopogen.py somatic \
    -a ${path}/apps -r region.lst -t 1 \
    -i bm -l CB_7K.maester_scRNA.csv -s featureInfo \
    -g GRCh38.chr20.fa

It is unclear why. The log only reports that the step failed:

[2024-05-07 22:24:23,250] INFO     Monopogen.py Get feature information from sequencing data...
bash /opt/tools/bm/Script/bamExtract_chr20.sh
[2024-05-07 22:24:49,879] ERROR    Monopogen.py In LDrefinement step chr20 failed!
[2024-05-07 22:24:49,879] ERROR    Monopogen.py Failed! See instructions above.
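
Since the log only reports that the step failed, one way to surface the underlying error might be to rerun the generated script named in the log with shell tracing turned on. The sketch below assumes that script still exists at the path printed above; `run_traced` and the trace file name are hypothetical helpers for illustration, not part of Monopogen.

```shell
# Hypothetical helper (not part of Monopogen): rerun a generated step script
# with tracing so the failing command and its stderr end up in one file.
run_traced() {
    script="$1"   # path to the generated script, e.g. the one printed in the log
    log="$2"      # where to save the trace (file name is an assumption)
    if [ ! -f "$script" ]; then
        echo "script not found: $script" >&2
        return 2
    fi
    # bash -x prints each command before running it; stderr is captured too.
    bash -x "$script" >"$log" 2>&1
    status=$?
    echo "exit status: $status (trace in $log)"
    return "$status"
}

# Example, using the script path printed in the log above:
# run_traced /opt/tools/bm/Script/bamExtract_chr20.sh bamExtract_chr20.trace.log
```

The last traced lines in the saved file would show which command inside the script stopped, which the Monopogen log itself does not report.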

This example data runs through just fine on v1.0.0, which is slow as molasses, but v1.5.0 and v1.6.0, which I'd rather run, are proving problematic.

jinzhuangdou commented 5 months ago

There is a big change to somatic calling in v1.5.0.
2/26/2024: Version 1.5.0 released. In the cell-scan step, we implemented a motif-based search on wild/mutated alleles for all cells directly from the BAM file. The single-cell-level BAM file splitting and joint calling modules were removed. This new version achieves an over 10-fold speedup compared with the old version by avoiding the BAM splitting. It can take less than 60 minutes to collect the wild/mutated allele profiles of 10K cells over 20K loci.

jinzhuangdou commented 5 months ago

Have you run the cellScan step in version 1.5.0 for somatic calling? It seems this step is missing from your log file.

briansha commented 5 months ago

I am following the exact steps in the documentation for "Somatic SNV calling from scRNA-seq", reached by clicking the link in the Table of Contents.

v1.0 has always completed, no problem. v1.5.0 and v1.6.0 do not.

Do those versions need different steps, and is the documentation out of date for them?

jinzhuangdou commented 5 months ago

Have you run this step? I did not see any record of it in the log files:

python ${path}/src/Monopogen.py somatic \
    -a ${path}/apps -r region.lst -t 1 \
    -i bm -l CB_7K.maester_scRNA.csv -s cellScan \
    -g GRCh38.chr20.fa
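
The featureInfo and cellScan commands in this thread differ only in the value passed to `-s`, so a small wrapper can keep the other arguments fixed between runs. This is a minimal sketch that reuses only the flags and file names quoted above; `run_step` and `DRY_RUN` are hypothetical names, not part of Monopogen, and `${path}` is assumed to point at the Monopogen checkout as in the documentation.

```shell
# Hypothetical wrapper (not part of Monopogen): run one somatic step with the
# example inputs from this thread. Set DRY_RUN=1 to print the command only.
run_step() {
    step="$1"    # value passed to -s, e.g. featureInfo or cellScan
    # Rebuild the positional parameters as the full command line.
    set -- python "${path}/src/Monopogen.py" somatic \
        -a "${path}/apps" -r region.lst -t 1 \
        -i bm -l CB_7K.maester_scRNA.csv -s "$step" \
        -g GRCh38.chr20.fa
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
        return 0
    fi
    # Run the command and report which step stopped if it exits non-zero.
    "$@" || { echo "step $step failed" >&2; return 1; }
}

# Example: preview, then run, the step asked about above.
# DRY_RUN=1 run_step cellScan
# run_step cellScan
```

Printing the command first makes it easy to confirm that the step name and input paths are the ones intended before a long run starts.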