Open hafizmtalha opened 1 year ago
Can you post your meta.csv and vars.py files? They would help pinpoint where the error comes from. Also, check the content of the fastqc_data.txt file to see whether an error message was written to it.
Enrico
On Wed, 10 Aug 2022, 20:50 Hafiz Muhammad Talha @.***> wrote:
echo "No reads in /dell/muscle/sample9_2.fastq.gz" > samples/sample9/read_statistics/fastqc_stats/sample9_2_fastqc.html && echo "No reads in /dell/muscle/sample9_2.fastq.gz" > samples/sample9/read_statistics/fastqc_stats/sample9_2_fastqc/fastqc_data.txt && fastqc /dell/muscle/sample9_2.fastq.gz -o samples/sample9/read_statistics/fastqc_stats --extract > samples/sample9/read_statistics/fastqc_stats/sample9_2.fastq_fastqc.log 2> samples/sample9/read_statistics/fastqc_stats/sample9_2.fastq_fastqc.err
get_stringtie_rawcounts.R -g samples/sample9/processings/stringtie/sample9_transcripts.gtf -f /dell/circompara2/test_circompara/analysis/samples/sample9/read_statistics/fastqc_stats/sample9_1_fastqc/fastqc_data.txt,/dell/circompara2/test_circompara/analysis/samples/sample9/read_statistics/fastqc_stats/sample9_2_fastqc/fastqc_data.txt -o samples/sample9/processings/stringtie/sample9
Error in strsplit(grep("Sequence length", x = fastqc_data.txt, value = T), : subscript out of bounds
Calls: mean -> sapply -> lapply -> FUN -> mean -> strsplit
Execution halted
scons: *** [samples/sample9/processings/stringtie/sample9_gene_expression_rawcounts.csv] Error 1
scons: building terminated because of errors.
How to solve this ?
The fastqc_data.txt file contains only this line: No reads in /dell/muscle/sample9_2.fastq.gz
META = 'meta.csv'
GENOME_FASTA = '/dell/new3TB/mousegenome/Mus_musculus.GRCm39.dna.primary_assembly.fa'
ANNOTATION = '/dell/new3TB/mousegenome/Mus_musculus.GRCm39.107.gtf'
CPUS = '20'
SEGEMEHL_INDEX = "/dell/circompara2/test_circompara/analysis/dbs/indexes/indexes/segemehl/Mus_musculus.GRCm39.dna.primary_assembly.idx"
BWA_INDEX = "/dell/circompara2/test_circompara/analysis/dbs/indexes/indexes/bwa/Mus_musculus.GRCm39.dna.primary_assembly"
BOWTIE2_INDEX = "/dell/circompara2/test_circompara/analysis/dbs/indexes/indexes/bowtie2/Mus_musculus.GRCm39.dna.primary_assembly"
STAR_INDEX = "/dell/circompara2/test_circompara/analysis/dbs/indexes/indexes/star/Mus_musculus.GRCm39.dna.primary_assembly/"
GENEPRED = "/dell/circompara2/test_circompara/analysis/dbs/indexes/Mus_musculus.GRCm39.107.genePred.wgn"
The rest of the settings were left commented out, as in the default file.
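The R error quoted at the top of the thread comes from a grep for "Sequence length" that finds nothing, which is what one would expect if fastqc_data.txt still holds only the "No reads in ..." placeholder. A minimal Python sketch for checking this before running the pipeline again; the function name is hypothetical and it assumes the usual tab-separated "Sequence length" line that FastQC writes in its Basic Statistics module:

```python
def sequence_length_field(path):
    """Return the value of the 'Sequence length' line, or None if it is absent.

    Assumption: a real FastQC report has a tab-separated line such as
    'Sequence length<TAB>101' (or a range like '35-101').
    """
    with open(path) as handle:
        for line in handle:
            if line.startswith("Sequence length"):
                # Everything after the first tab is the reported value.
                _, _, value = line.rstrip("\n").partition("\t")
                return value.strip()
    return None


# Example (path taken from the log above):
# sequence_length_field(
#     "samples/sample9/read_statistics/fastqc_stats/sample9_2_fastqc/fastqc_data.txt")
```

A return value of None means FastQC never replaced the placeholder, so the matching *_fastqc.err file is the place to look next.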
How about the meta.csv?
Did you check that the FASTQ file is not empty?
Also, check whether any errors were reported in the read_statistics/fastqc_stats/*_fastqc.log and read_statistics/fastqc_stats/*_fastqc.err files.
Which version of circompara2 are you using, and is it a custom installation or the Docker container? Did it work with the test data?
meta.csv
file,sample,condition
/dell/muscle/sample6_1.fastq.gz,S6,WT
/dell/muscle/sample6_2.fastq.gz,S6,WT
/dell/muscle/sample7_1.fastq.gz,S7,WT
/dell/muscle/sample7_2.fastq.gz,S7,WT
/dell/muscle/sample8_1.fastq.gz,S8,WT
/dell/muscle/sample8_2.fastq.gz,S8,WT
/dell/muscle/sample9_1.fastq.gz,S9,WT
/dell/muscle/sample9_2.fastq.gz,S9,WT
/dell/muscle/sample10_1.fastq.gz,S10,WT
/dell/muscle/sample10_2.fastq.gz,S10,WT
The file is not empty, and the mappers worked fine on it.
read_statistics/fastqc_stats/*_fastqc.log is empty.
read_statistics/fastqc_stats/*_fastqc.err contains this error:
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: Midline '@GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGF' didn't start with '+'
at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:172)
at uk.ac.babraham.FastQC.Sequence.FastQFile.next(FastQFile.java:125)
at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:77)
samples/SRR9302709/read_statistics/fastqc_stats/SRR9302709_1.fastq_fastqc.err
I cloned this repository and then installed it. The test run was successful.
The error says your FASTQ file is not well formatted, which is why FastQC fails. Check the consistency of your input files; they must be properly formatted as FASTQ.
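A malformed record like the one in the stack trace can sit anywhere in the file, so it helps to scan every record rather than only the first and last lines. A minimal Python sketch, assuming plain four-line FASTQ records (no wrapped sequences) in a gzip-compressed file; the function name is hypothetical and this is not part of circompara2 or FastQC:

```python
import gzip


def find_first_bad_record(path):
    """Return (record_number, reason) for the first malformed record, else None.

    Assumption: each record is exactly four lines (header, sequence,
    '+' midline, qualities), as FastQC also expects.
    """
    with gzip.open(path, "rt") as handle:
        record = 0
        while True:
            header = handle.readline()
            if not header:
                return None  # reached end of file without finding a problem
            record += 1
            seq = handle.readline().rstrip("\r\n")
            plus = handle.readline().rstrip("\r\n")
            qual = handle.readline().rstrip("\r\n")
            if not header.startswith("@"):
                return record, "header does not start with '@'"
            if not plus.startswith("+"):
                return record, "midline does not start with '+'"
            if len(seq) != len(qual):
                return record, "sequence and quality lengths differ"


# Example (path taken from the thread):
# print(find_first_bad_record("/dell/muscle/sample9_2.fastq.gz"))
```

The record number multiplied by four gives the approximate line to inspect. A truncated gzip archive would instead raise an exception while reading, which is itself a sign that the file is incomplete.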
That's quite strange, because I checked the head and tail of both files for this sample and they look fine. Is there any way I could skip the preprocessing steps for all the other samples and run only this sample up to this same point?
You can remove that sample from the meta.csv and run circompara2 in the same project directory. Circompara2 will simply skip that sample, without repeating the tasks already completed in your previous run. I suggest you make a new project directory containing only the "corrupt" file and run your tests there. Then, once everything is OK, either merge the two projects' results "by hand", or add the fixed input file back to the meta.csv and run circompara2 again to let it update the final result files.
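Removing the sample can be done by hand in a text editor, or with a small script such as the Python sketch below. It assumes the three-column layout (file,sample,condition) of the meta.csv posted earlier in the thread; the function name and the output file name are hypothetical.

```python
import csv


def drop_sample(meta_in, meta_out, sample):
    """Copy meta_in to meta_out without the rows of one sample.

    Returns the number of rows kept. Assumes a header row that
    includes a 'sample' column, as in the meta.csv shown above.
    """
    with open(meta_in, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames
        rows = [row for row in reader if row["sample"] != sample]
    with open(meta_out, "w", newline="") as dst:
        # Plain "\n" line endings, matching the original file.
        writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Example: write a copy of the metadata without sample S9.
# drop_sample("meta.csv", "meta_without_S9.csv", "S9")
```

Writing to a separate file keeps the original meta.csv intact, so the fixed sample can be added back later as described above.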
Thanks for the help, I will try that.
Hi, have you solved this problem?