dan-landau / IronThrone-GoT

processes GoT amplicon data and generates a table of metrics

Empty summTable.txt #15

Open GBeattie opened 2 years ago

GBeattie commented 2 years ago

Hey,

Unfortunately, we're having trouble running the GoT pipeline. The script runs with a series of uninitialized-value warnings and produces the output files, but the output summTable.txt contains only one entry. I've attached the output in case it is useful for diagnosing the issue.

CZ390.zip

A note on our .fastq files: they're from a MiSeq 300 run, so the reads are paired 151 bp. Because of the way the run was indexed, the reads went into the "Undetermined" outputs. These were filtered for 151 bp reads (since the undetermined reads include non-full-length reads) and then repaired to restore the pairing. The issue may stem from this, although in principle the files should be OK after those fixes.
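
One way to confirm that the filtering and repairing left the two files in sync is to walk both FASTQ files in parallel and compare read IDs and lengths. The following is a minimal sketch, not part of IronThrone-GoT; the function names `read_fastq` and `check_pairs` are made up for illustration, and it assumes four-line FASTQ records whose headers share the same ID before the first space (as in current Illumina output).

```python
import gzip
from itertools import zip_longest


def read_fastq(path):
    """Yield (read_id, sequence) for each 4-line FASTQ record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                break
            seq = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line
            handle.readline()  # quality line
            # Keep only the ID before the first space, without the leading '@'
            yield header[1:].strip().split(" ")[0], seq


def check_pairs(r1_path, r2_path, expected_len=151):
    """Return (n_pairs, n_id_mismatch, n_wrong_length) for two FASTQ files."""
    n_pairs = n_id_mismatch = n_wrong_length = 0
    for rec1, rec2 in zip_longest(read_fastq(r1_path), read_fastq(r2_path)):
        n_pairs += 1
        # A missing record (unequal file lengths) counts as a mismatch
        if rec1 is None or rec2 is None or rec1[0] != rec2[0]:
            n_id_mismatch += 1
            continue
        if len(rec1[1]) != expected_len or len(rec2[1]) != expected_len:
            n_wrong_length += 1
    return n_pairs, n_id_mismatch, n_wrong_length
```

Calling `check_pairs` on the R1 and R2 files should give zero for the last two counts if the pairing and the 151 bp filter both held.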

I've pasted the command, config, and command-line output below. Any insights into what might be going wrong would be greatly appreciated!

This is the command:

./IronThrone-GoT --run circ --config got_dnmt3a_v2.config --fastqR1=./fastqs/Undetermined_filtered_rep_S0_L001_R1_001.fastq --fastqR2=./fastqs/Undetermined_filtered_rep_S0_L001_R2_001.fastq --outdir CZ390 --sample CZ390 --umilen 12 --whitelist 3M-february-2018.txt.gz

And this is the config file:

TCCGCAGCGTCACACAGAAG    1   20
CATATCCAGGAGTGG 21  35
GG  36  37
CG  36  37
ATCGTCAACCCTGCTCGCAA    84  103
ATCGTCAACCCTGCTCGCAA    84  103
CAAAACGTACTGGTGCTTTCCTGT    124 147
CAAAACGTACTGGTGCTTTCCTGT    124 147
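
The coordinates in a config like this can be checked for internal consistency. The sketch below is illustrative only: it assumes each line holds a sequence followed by a 1-based start and an inclusive end position, which is an interpretation of the pasted file rather than something taken from the IronThrone-GoT documentation. The name `check_config` and the `read_len` parameter are hypothetical.

```python
def check_config(text, read_len=151):
    """Return a list of problems found in a 'sequence start end' config."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        fields = line.split()
        if len(fields) != 3:
            problems.append(f"line {line_no}: expected 3 fields, got {len(fields)}")
            continue
        seq, start, end = fields[0], int(fields[1]), int(fields[2])
        # Assumed convention: 1-based start, inclusive end
        if end - start + 1 != len(seq):
            problems.append(
                f"line {line_no}: span {start}-{end} covers {end - start + 1} bases, "
                f"but the sequence has {len(seq)}"
            )
        if end > read_len:
            problems.append(f"line {line_no}: end {end} is past read length {read_len}")
    return problems


# The config as pasted in this issue
CONFIG = """\
TCCGCAGCGTCACACAGAAG    1   20
CATATCCAGGAGTGG 21  35
GG  36  37
CG  36  37
ATCGTCAACCCTGCTCGCAA    84  103
ATCGTCAACCCTGCTCGCAA    84  103
CAAAACGTACTGGTGCTTTCCTGT    124 147
CAAAACGTACTGGTGCTTTCCTGT    124 147
"""

print(check_config(CONFIG))
```

Under those assumptions the pasted config passes: every span matches its sequence length and the last position (147) fits within a 151 bp read, so this check alone does not explain the empty table.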

Finally, here is the console output:

[Fri Aug  5 16:10:10 2022] [CZ390/CZ390.SEQ_Qnum] created
[Fri Aug  5 16:10:19 2022] File read: CZ390/CZ390.BARCODE (size: 2477885,   lines: 145758)
[Fri Aug  5 16:10:19 2022] File read: CZ390/CZ390.BARCODE_Qnum (size: 7142142,   lines: 145758)
[Fri Aug  5 16:10:20 2022] File read: CZ390/CZ390.UMI (size: 1894853,   lines: 145758)
[Fri Aug  5 16:10:20 2022] File read: CZ390/CZ390.SEQ (size: 22155215,   lines: 145758)
[Fri Aug  5 16:10:20 2022] File read: CZ390/CZ390.SEQ_Qnum (size: 176221422,   lines: 145758)
[Fri Aug  5 16:10:21 2022] File read: CZ390/CZ390.SEQ_Q.P_errors_SUM (size: 2113508,   lines: 145758)
[Fri Aug  5 16:10:22 2022] Matrix file saved: CZ390/CZ390.SEQUENCE (145758 rows x 6 cols)
[Fri Aug  5 16:10:22 2022] File read: CZ390/CZ390.SEQUENCE (size: 212005028,   lines: 145758)
[Fri Aug  5 16:10:53 2022] File read: 3M-february-2018.txt.gz (size: 18350152,   lines: 89334)
[Fri Aug  5 16:10:53 2022] Thread 1: data len(41)  from(0)  to(40)
[Fri Aug  5 16:10:53 2022] Thread 2: data len(41)  from(41)  to(81)
[Fri Aug  5 16:10:53 2022] Thread 3: data len(41)  from(82)  to(122)
[Fri Aug  5 16:10:53 2022] Thread 4: data len(32)  from(123)  to(154)
Use of uninitialized value in concatenation (.) or string at ./IronThrone-GoT line 1667.
Use of uninitialized value in numeric le (<=) at ./IronThrone-GoT line 1680.
Use of uninitialized value in numeric le (<=) at ./IronThrone-GoT line 1688.
Use of uninitialized value in numeric le (<=) at ./IronThrone-GoT line 1810.
Use of uninitialized value in numeric le (<=) at ./IronThrone-GoT line 1816.
Use of uninitialized value in addition (+) at ./IronThrone-GoT line 1396.
Use of uninitialized value in string eq at ./IronThrone-GoT line 1168.
Use of uninitialized value $find_value in string eq at ./IronThrone-GoT line 1168.
Use of uninitialized value in numeric le (<=) at ./IronThrone-GoT line 1703.
Use of uninitialized value in numeric le (<=) at ./IronThrone-GoT line 1711.
Use of uninitialized value in numeric le (<=) at ./IronThrone-GoT line 1726.
Use of uninitialized value in numeric le (<=) at ./IronThrone-GoT line 1734.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
Use of uninitialized value in join or string at ./IronThrone-GoT line 609.
[Fri Aug  5 16:10:54 2022] Matrix file saved: CZ390/CZ390.summTable.txt (1 rows x 23 cols)
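
The log reports `lines: 89334` for `3M-february-2018.txt.gz`. A quick cross-check is to count the lines in that file after decompression and compare the two numbers. This is a small sketch with a hypothetical helper name, `count_lines`; whether IronThrone-GoT decompresses the whitelist itself is not established in this thread, so the comparison is only a diagnostic hint.

```python
import gzip


def count_lines(path):
    """Count lines in a text file, decompressing first if it is gzipped."""
    with open(path, "rb") as raw:
        is_gzip = raw.read(2) == b"\x1f\x8b"  # gzip magic bytes
    opener = gzip.open if is_gzip else open
    with opener(path, "rb") as handle:
        return sum(1 for _ in handle)
```

If `count_lines("3M-february-2018.txt.gz")` differs substantially from the figure in the log, passing an uncompressed copy of the whitelist would be a reasonable thing to try.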

Zala-M commented 7 months ago

Hello @GBeattie,

I have the same issue as you. Did you find a solution? It's strange: I've tried it on two different fastq files, and the output is correct for one but empty for the other (although my sequences are there; I've checked manually).

Thanks a lot!