cortes-ciriano-lab / SComatic

A tool for detecting somatic variants in single cell data

SitesPerCell.py issue #11

Closed kane9530 closed 1 year ago

kane9530 commented 1 year ago

Hi there,

I am trying to run the SitesPerCell.py script following the documentation in SComaticExample.md, and I am facing the following difficulties:

  1. It seems that the documentation for this script is incomplete: it does not list all the parameters the script accepts.
  2. When I run the script as such: python3 scripts/SitesPerCell/SitesPerCell.py --bam results/Sample.Epithelial_cells.bam --ref example_data/chr10.fa --infile basecallstep1.calling.step1.tsv

It appears to complete almost immediately and delete all the files in the current directory. Is this behaviour reproducible on your end?

Best, Kane

Francesc-Muyas commented 1 year ago

Dear user, Thanks for using SComatic. Could you briefly answer a few questions to diagnose the problem?

  1. Could you let me know if you managed to run the script with the example data?
  2. Could you tell me why you're using our small reference fasta file instead of the one you used for aligning your sample? The provided reference genome is only a small fraction of the human genome: chr10 (Hg38 with "chr" prefix).
  3. Did you check that both reference genomes (the one used for your alignment and our reference genome) are compatible and represent exactly the same reference?
  4. The documentation provided in SComaticExample.md shows the example commands used to reproduce the results explained in the manuscript. These (recommended) commands assume some default parameters not shown there. You will find all parameters for this script here.
  5. Please be aware that, by default, this script only checks positions in basecallstep1.calling.step1.tsv that had enough expression (or coverage) in at least two cell types. If this is not your case, please change the --min_ct1 and --min_ct2 parameters. Importantly, the values of these two parameters should match the value used in Step 4.1 (BaseCellCalling.step1.py): --min_ct1 ~ --min_cell_types and --min_ct2 ~ --min_cell_types. By default (and strongly recommended), this value is set to 2. However, if you don't have many cell types with good coverage, you can decrease this cutoff at the expense of much lower performance.

Thanks again for your interest, Fran
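
For anyone working through points 2 and 3 of the reply above, here is a minimal sketch of one way to check that a BAM and a reference FASTA describe the same contigs. It is not part of SComatic. It assumes you have already saved the text output of `samtools view -H your.bam` and the `.fai` index written by `samtools faidx your_ref.fa`; the function names are made up for illustration.

```python
def parse_sam_header(header_text):
    """Return {contig_name: length} from the @SQ lines of a SAM/BAM text header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value, e.g. SN:chr10
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def parse_fai(fai_text):
    """Return {contig_name: length} from a .fai index (first two tab-separated columns)."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            contigs[name] = int(length)
    return contigs


def compare_references(bam_contigs, ref_contigs):
    """List contigs from the BAM header that are missing from, or differ in length in, the reference."""
    problems = []
    for name, length in bam_contigs.items():
        if name not in ref_contigs:
            problems.append(f"{name}: in BAM header but missing from reference")
        elif ref_contigs[name] != length:
            problems.append(
                f"{name}: length {length} in BAM vs {ref_contigs[name]} in reference"
            )
    return problems
```

An empty list from `compare_references` means every contig in the BAM header has a same-named, same-length entry in the reference. A BAM aligned to a full genome will report many missing contigs when compared against a single-chromosome FASTA, and a missing "chr" prefix shows up the same way.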

kane9530 commented 1 year ago

Dear Fran,

Thank you for your reply!

I was running the SitesPerCell.py script on the example data, and that is where the reported error arose. I realise that it happened because I didn't set the temp directory properly (within the output_folder, as shown in SComaticExample.md). Once I specified the temp dir correctly, the problem was resolved.

Best wishes, Kane
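
For readers who land here with the same symptom, here is a rough sketch of the fix described above: create a dedicated, empty scratch directory inside the output folder and pass it to the script explicitly, so the working directory is never used as the temp dir. The helper names are hypothetical. The `--out_folder` and `--tmp_dir` flag names are an assumption here, so please confirm them with `python3 SitesPerCell.py --help` before relying on this.

```python
from pathlib import Path


def make_scratch_dir(out_folder, label="temp"):
    """Create an empty scratch directory inside out_folder and return its path."""
    scratch = Path(out_folder).resolve() / label
    # Never hand the current working directory to a tool that cleans its temp dir.
    if scratch == Path.cwd().resolve():
        raise ValueError("scratch directory must not be the current working directory")
    scratch.mkdir(parents=True, exist_ok=True)
    # Refuse to reuse a directory that already holds files.
    if any(scratch.iterdir()):
        raise ValueError(f"{scratch} is not empty; choose another label")
    return scratch


def build_command(script, bam, ref, infile, out_folder, scratch):
    """Assemble the argument list; --out_folder and --tmp_dir are assumed flag names."""
    return [
        "python3", str(script),
        "--bam", str(bam),
        "--ref", str(ref),
        "--infile", str(infile),
        "--out_folder", str(out_folder),  # assumption: check --help
        "--tmp_dir", str(scratch),        # assumption: check --help
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Keeping the scratch directory separate and empty means that any clean-up the script performs can only touch files it created itself.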