raphael-group / chisel

CHISEL -- Copy-number Haplotype Inference in Single-cell by Evolutionary Links
BSD 3-Clause "New" or "Revised" License

Empty baf.tsv #13

Closed marialitovchenko closed 4 years ago

marialitovchenko commented 4 years ago

Hello Chisel developers,

Thank you for developing this tool! I'm very excited to try it out on my data. As described in the detailed tutorial, I created 4 input files:

Yet CHISEL did not complete its procedure. The error message in the log is exactly the one reported in the other raised issue, namely AssertionError: There is a bin with a BAF shift > 0.5, likely BAF was not mirrored between 0 and 0.5, but I believe the cause is different. Following your recommendations (from this issue), I checked that the barcodes contain only A, T, G, C and that the phased reference calls do not contain chrX or chrY. I was able to trace the issue down to an empty baf.tsv file, even though rdr.tsv and total.tsv are not empty:

(heads only)

==> rdr/rdr.tsv <==
chr1 0 5000000 AAAGCA 33436 1067 0.940321827993
chr1 0 5000000 AACCTT 33436 1000 1.04244536404
chr1 0 5000000 AACTCT 33436 603 0.9298229247
chr1 0 5000000 AATAGG 33436 525 0.993707584142
chr1 0 5000000 ACAAAC 33436 616 0.979847129724
chr1 0 5000000 ACATCT 33436 647 0.978664550356
chr1 0 5000000 ACCATT 33436 933 0.930353882567
chr1 0 5000000 ACTACT 33436 76 0.398769743582
chr1 0 5000000 ACTTAC 33436 1135 0.981891004791
chr1 0 5000000 ACTTGA 33436 1499 0.943007173708

==> rdr/total.tsv <==
normal 24703207
CAATCT 911059
AGATAG 444040
TTGCTA 724013
CTTAGT 801381
GCAGAT 1052557
GTTCTA 722827
CCTATA 335797
GTTGTT 779656
TCTGTA 475231
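For anyone repeating the two sanity checks mentioned above (barcodes made only of A, T, G, C, and no chrX/chrY among the phased calls), a minimal Python sketch might look like the following. It is not part of CHISEL; the helper names are hypothetical, and it assumes that the chromosome name is the first whitespace-separated column of the phases file.

```python
import re

# Barcodes are expected to consist only of the four nucleotide letters.
BARCODE_RE = re.compile(r"[ACGT]+")

# Chromosome names treated as sex chromosomes (with and without "chr" prefix).
SEX_CHROMS = {"chrX", "chrY", "X", "Y"}


def invalid_barcodes(barcodes):
    """Return the barcodes that contain anything other than A, C, G, T."""
    return [b for b in barcodes if not BARCODE_RE.fullmatch(b)]


def sex_chromosome_lines(lines):
    """Return the lines whose first column names a sex chromosome.

    Assumption: the chromosome is the first whitespace-separated field;
    lines starting with '#' are treated as comments and skipped.
    """
    hits = []
    for line in lines:
        fields = line.split()
        if fields and not line.startswith("#") and fields[0] in SEX_CHROMS:
            hits.append(line.rstrip("\n"))
    return hits
```

Both helpers return the offending entries, so an empty list means the corresponding check passed. The barcodes could be read, for example, from the first column of rdr/total.tsv, skipping the "normal" row.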

Please find linked log file: chisel_log.txt and baf log: baf_log.txt

Since I thought that there was something wrong with my data, I decided to try CHISEL on the provided test data and, surprisingly, ran into exactly the same error. Here are the log file and the baf log: chisel_test.txt, baf_log_chisel_test.txt. Because I perform the calculations on a cluster, I have CHISEL available via a singularity container, which can be downloaded with:

singularity pull library://marialitovchenko/default/chisel:v.0.0.4

The example can then be run like this:

singularity exec chisel_v.0.0.4.sif chisel -t cells.bam -n normal.bam -r hg19.fa -l phases.tsv -j 4 1>chisel_test.out 2>chisel_test.err

Could you please tell me whether this could be an installation issue, for example some missing libraries?

simozacca commented 4 years ago

Thank you for your interest in CHISEL! I would be happy to help you with this issue.

There may be an error in the paths or directories that were used, or some other problem in the system. Could you please try the following with the test data:

  1. First, could you please confirm whether the baf.tsv file exists but is empty, or whether it does not exist at all within the baf/ directory?

  2. Could you please cd into the running directory before executing CHISEL?

  3. Could you please try to replicate exactly the 4 lines in the quick start, without singularity? You should be able to place these lines in a bash script and simply run it on your cluster without any further requirement. The quick start must work fine, and then we can check the differences between the two ways of running it.
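The first check can be scripted. The sketch below is only an illustration (the helper name file_status is hypothetical and not part of CHISEL); it distinguishes a missing file from one that exists but has zero bytes.

```python
import os


def file_status(path):
    """Classify a file as 'missing', 'empty' (zero bytes) or 'non-empty'."""
    if not os.path.exists(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    return "non-empty"


# Example, run from the directory that contains the baf/ output folder:
# print(file_status("baf/baf.tsv"))
```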

Thanks

marialitovchenko commented 4 years ago

Thank you for your reply!

Yes, during all the runs baf.tsv was created, but it was empty.

Then I ran cd demos/complete, created a folder mydata there and copied my files into it: CHISEL completed successfully.

Then I ran cd demos/complete and used full paths to my data stored elsewhere: CHISEL also completed its computations.

However, if I run CHISEL from its parent directory, the AssertionError persists even though I use full paths to my input files.

It's okay that one needs to cd to the chisel directory to run the tool; it just should be noted in the manual.
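For batch jobs on a cluster, the workaround found here (start the tool from inside the run directory and pass absolute input paths) could be wrapped as in the sketch below. The helper run_in_dir is hypothetical; the chisel flags in the commented example are the ones quoted earlier in this thread, and demos/complete is simply the directory from which the run succeeded here. Whether other directories behave the same way is not established by this thread.

```python
import os
import subprocess


def run_in_dir(cmd, run_dir):
    """Run cmd (a list of arguments) with run_dir as its working directory.

    Only the child process changes directory; the caller's working
    directory is left untouched. Returns subprocess.CompletedProcess.
    """
    return subprocess.run(cmd, cwd=run_dir, capture_output=True, text=True)


# Example (not executed here). Input paths are made absolute BEFORE the
# child process starts, so they stay valid inside run_dir:
#
# inputs = {flag: os.path.abspath(path) for flag, path in [
#     ("-t", "cells.bam"), ("-n", "normal.bam"),
#     ("-r", "hg19.fa"), ("-l", "phases.tsv")]}
# cmd = ["chisel"] + [x for kv in inputs.items() for x in kv] + ["-j", "4"]
# result = run_in_dir(cmd, "demos/complete")
# print(result.returncode, result.stderr[-500:])
```

Passing cwd to subprocess.run is equivalent to running cd in a shell script just before the command, but it keeps the change local to that one call.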