chung-lab / SCAFE

Single Cell Analysis of Five-prime Ends
MIT License

scafe.workflow.sc.solo Quitting: Auto detection of TS oligo failed. Average softclip length is ambiguous. #23

Open marsa04 opened 1 year ago

marsa04 commented 1 year ago

Dear SCAFE team,

I ran into a very unusual error. For one sample, the workflow fails with the following error:

[2023-05-23 07:11] TSO error_rate=0.14030

Quitting: Auto detection of TS oligo failed. Average softclip length is ambiguous.

========================================

Results of bam_to_ctss not found. Quitting.

========================================

Died at /SCAFE/scripts/scafe.workflow.sc.solo line 595.

Could you suggest how to solve this problem? cellranger runs correctly on the FASTQ files of this sample, and downstream analysis of its gene count matrix causes no problems. I ran SCAFE on 18 samples that were sequenced under the same conditions, and only one of them gives this error.

Many Thanks, Marat

ktpolanski commented 1 year ago

I just ran into the same thing, some number of samples into a batch:

[2023-08-07 23:15] Checking all SCAFE executables
[2023-08-07 23:15] Checking: tabix version: 1.9
[2023-08-07 23:15] Checking: bgzip version: 1.9
[2023-08-07 23:15] Checking: bedtools version: 2.30.0
[2023-08-07 23:15] Checking: samtools version: Version: 1.3.1
[2023-08-07 23:15] Checking: paraclu found.
[2023-08-07 23:15] Checking: paraclu-cut found.
[2023-08-07 23:15] Checking: bedGraphToBigWig version: 2.8
[2023-08-07 23:15] Checking: bigWigAverageOverBed version: 2
[2023-08-07 23:15] Checking TSO presence in bam
[2023-08-07 23:15] Determining subsampe fraction
[2023-08-07 23:15] total_num_read=676164564. targeting 100000 read.
[2023-08-07 23:15] frac_subsample is set as 0.01
[2023-08-07 23:15] 10000 read sampled
[2023-08-07 23:16] 20000 read sampled
[2023-08-07 23:16] 30000 read sampled
[2023-08-07 23:16] 40000 read sampled
[2023-08-07 23:16] 50000 read sampled
[2023-08-07 23:16] 60000 read sampled
[2023-08-07 23:16] 70000 read sampled
[2023-08-07 23:16] 80000 read sampled
[2023-08-07 23:16] 90000 read sampled
[2023-08-07 23:17] 100000 read sampled
[2023-08-07 23:17] average CB length=16.00000
[2023-08-07 23:17] average UMI length=10.00000
[2023-08-07 23:17] average softclip length=9.08676
[2023-08-07 23:17] 1st ranked softclip length=14 [16.80800%]
[2023-08-07 23:17] 2nd ranked softclip length=13 [16.38400%]
[2023-08-07 23:17] 3rd ranked softclip length=9 [10.82200%]
[2023-08-07 23:17] calculating error rate with offset_check_start_pos = 8
[2023-08-07 23:17] TSO error_rate=0.05905
Quitting: Auto detection of TS oligo failed. Average softclip length is ambiguous.

Setting --detect_TS_oligo manually allows the analysis to progress. I went with match because the TSO error rate was low and match is what was auto-inferred for the other samples, but I'm not sure this is the correct thing to do. Some assistance please, @chung-lab? Thanks, and sorry for the trouble.
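To compare a failing sample against the ones where auto detection succeeded, it can help to pull the TSO metrics out of each log. The following is a minimal sketch, not part of SCAFE: the function name `parse_tso_log` is hypothetical, and the patterns assume the log wording quoted in this thread, which other SCAFE versions may not share.

```python
import re

# Patterns are written against the log lines quoted in this thread (assumption:
# other SCAFE versions may word these lines differently).
_FLOAT = r"([0-9]*\.?[0-9]+)"
AVG_SOFTCLIP = re.compile(r"average softclip length=" + _FLOAT)
RANKED = re.compile(
    r"(\d+)(?:st|nd|rd|th) ranked softclip length=(\d+) \[" + _FLOAT + r"%\]"
)
ERROR_RATE = re.compile(r"TSO error_rate=" + _FLOAT)
FAILURE = "Auto detection of TS oligo failed"


def parse_tso_log(text):
    """Extract the TSO detection metrics from one log (hypothetical helper)."""
    avg = AVG_SOFTCLIP.search(text)
    err = ERROR_RATE.search(text)
    # Each entry is (rank, softclip length, percentage of sampled reads).
    ranked = [
        (int(rank), int(length), float(pct))
        for rank, length, pct in RANKED.findall(text)
    ]
    return {
        "avg_softclip": float(avg.group(1)) if avg else None,
        "ranked_softclip": ranked,
        "tso_error_rate": float(err.group(1)) if err else None,
        "auto_detect_failed": FAILURE in text,
    }
```

Applied to the log above, this would report an average softclip length of 9.08676, the three ranked lengths (14, 13 and 9), and an error rate of 0.05905, which makes it easier to tabulate the same values across all samples in a batch.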

marsa04 commented 7 months ago

Setting --detect_TS_oligo to match solves this problem for me too!
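Since only some samples in a batch hit this error, one way to find the ones that need a rerun with an explicit --detect_TS_oligo setting is to scan the saved logs for the failure message. This is a rough sketch under assumptions: it supposes each sample's output was captured to its own file named `<sample>.log` in one directory, which is not something SCAFE does by itself, and `find_failed_samples` is a hypothetical helper.

```python
from pathlib import Path

# Message text as quoted in this thread.
FAILURE_MESSAGE = "Auto detection of TS oligo failed"


def find_failed_samples(log_dir, pattern="*.log"):
    """Return the names of samples whose log contains the failure message.

    Assumes one log file per sample, named <sample>.log (hypothetical layout).
    """
    failed = []
    for log_path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" avoids crashing on stray non-UTF-8 bytes in a log.
        text = log_path.read_text(errors="replace")
        if FAILURE_MESSAGE in text:
            failed.append(log_path.stem)
    return failed
```

The returned names can then be used to rerun just those samples with the manual setting, rather than repeating the whole batch.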