Open Kingatsu opened 2 years ago
The "WARNING: File samples/sample_A/processings/circRNAs/star_out/Chimeric.out.junction is empty!" suggests something went wrong with the STAR alignment. Please check that the STAR version you run is the one required by circompara2.
Thanks for your help. I found a 'STAR-2.6.1e' folder in circompara/tools, but STAR -h in my conda env reports 'STAR version=2.7.9a'. Should I downgrade STAR to 2.6.1e in my conda env?
I've checked: circompara2/bin/STAR --version reports 2.6.1e, which is the same version listed in 'install_tools.py'. The test run also reports the same error in another python2.7 conda env in which STAR was not installed through conda.
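The version check discussed above can be scripted. The following is a minimal Python 3 sketch, not part of circompara2: the helper names are hypothetical, and it assumes that `STAR --version` prints a bare version string, optionally with a `STAR_` prefix.

```python
import subprocess


def normalize_star_version(raw):
    """Strip whitespace and an optional 'STAR_' prefix (assumed output format)."""
    version = raw.strip()
    if version.startswith("STAR_"):
        version = version[len("STAR_"):]
    return version


def star_version(executable):
    """Run '<executable> --version' and return the normalized version string."""
    result = subprocess.run(
        [executable, "--version"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return normalize_star_version(result.stdout)


def same_star(path_a, path_b):
    """Return True when both executables report the same version."""
    return star_version(path_a) == star_version(path_b)
```

For example, `same_star(shutil.which("STAR"), "circompara2/bin/STAR")` (after `import shutil`) would compare the STAR found on PATH with the bundled one quoted in this thread.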
Hello @egaffo, I solved the STAR problem by running it in / (the root directory), but now the test run has a new problem. Here is the error:
gene_annotation.R -c circular_expression/circrna_collection/combined_circrnas.gtf.gz -o circular_expression/circrna_collection/circrna_gene_annotation
Error in get(genname, envir = envir) : object 'testthat_print' not found
stringtie -p 24 -o samples/sample_A/processings/stringtie/sample_A_transcripts.gtf -A samples/sample_A/processings/stringtie/sample_A_gene_abund.tab -l sample_A -G /BIGDATA2/sysu_cyq_1/zhushunxin/CircRNA/circompara2/test_circompara/annotation/CFLAR_HIPK3.gtf -C samples/sample_A/processings/stringtie/sample_A_cov_refs.gtf -b samples/sample_A/processings/stringtie/ballgown_ctabs -e samples/sample_A/processings/hisat2_out/sample_A_hisat2.bam
stringtie -p 24 -o samples/sample_B/processings/stringtie/sample_B_transcripts.gtf -A samples/sample_B/processings/stringtie/sample_B_gene_abund.tab -l sample_B -G /BIGDATA2/sysu_cyq_1/zhushunxin/CircRNA/circompara2/test_circompara/annotation/CFLAR_HIPK3.gtf -C samples/sample_B/processings/stringtie/sample_B_cov_refs.gtf -b samples/sample_B/processings/stringtie/ballgown_ctabs -e samples/sample_B/processings/hisat2_out/sample_B_hisat2.bam
writeLines(["linear_expression/linear_quantexp_stringtie/geneexp/samples_expression_files.txt"], ["samples/sample_A/processings/stringtie/sample_A_gene_abund.tab", "samples/sample_B/processings/stringtie/sample_B_gene_abund.tab"])
writeLines(["linear_expression/linear_quantexp_stringtie/geneexp/samples_trxexp_files.txt"], ["samples/sample_A/processings/stringtie/sample_A_transcripts.gtf", "samples/sample_B/processings/stringtie/sample_B_transcripts.gtf"])
echo "No reads in /BIGDATA2/sysu_cyq_1/zhushunxin/CircRNA/circompara2/test_circompara/reads/readsB_1.fastq.gz" > samples/sample_B/read_statistics/fastqc_stats/readsB_1_fastqc.html && echo "No reads in /BIGDATA2/sysu_cyq_1/zhushunxin/CircRNA/circompara2/test_circompara/reads/readsB_1.fastq.gz" > samples/sample_B/read_statistics/fastqc_stats/readsB_1_fastqc/fastqc_data.txt && fastqc /BIGDATA2/sysu_cyq_1/zhushunxin/CircRNA/circompara2/test_circompara/reads/readsB_1.fastq.gz -o samples/sample_B/read_statistics/fastqc_stats --extract > samples/sample_B/read_statistics/fastqc_stats/readsB_1.fastq_fastqc.log 2> samples/sample_B/read_statistics/fastqc_stats/readsB_1.fastq_fastqc.err
echo "No reads in /BIGDATA2/sysu_cyq_1/zhushunxin/CircRNA/circompara2/test_circompara/reads/readsB_2.fastq.gz" > samples/sample_B/read_statistics/fastqc_stats/readsB_2_fastqc.html && echo "No reads in /BIGDATA2/sysu_cyq_1/zhushunxin/CircRNA/circompara2/test_circompara/reads/readsB_2.fastq.gz" > samples/sample_B/read_statistics/fastqc_stats/readsB_2_fastqc/fastqc_data.txt && fastqc /BIGDATA2/sysu_cyq_1/zhushunxin/CircRNA/circompara2/test_circompara/reads/readsB_2.fastq.gz -o samples/sample_B/read_statistics/fastqc_stats --extract > samples/sample_B/read_statistics/fastqc_stats/readsB_2.fastq_fastqc.log 2> samples/sample_B/read_statistics/fastqc_stats/readsB_2.fastq_fastqc.err
get_stringtie_rawcounts.R -g samples/sample_B/processings/stringtie/sample_B_transcripts.gtf -f /BIGDATA2/sysu_cyq_1/zhushunxin/CircRNA/circompara2/test_circompara/analysis/samples/sample_B/read_statistics/fastqc_stats/readsB_1_fastqc/fastqc_data.txt,/BIGDATA2/sysu_cyq_1/zhushunxin/CircRNA/circompara2/test_circompara/analysis/samples/sample_B/read_statistics/fastqc_stats/readsB_2_fastqc/fastqc_data.txt -o samples/sample_B/processings/stringtie/sample_B_
Error in strsplit(grep("Sequence length", x = fastqc_data.txt, value = T), :
subscript out of bounds
Calls: mean -> sapply -> lapply -> FUN -> mean -> strsplit
Execution halted
scons: *** [samples/sample_B/processings/stringtie/sample_B_gene_expression_rawcounts.csv] Error 1
scons: building terminated because of errors.
The R version is 3.6.3. I am looking forward to your reply.
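A possible way to narrow down the "subscript out of bounds" error above is to check whether each fastqc_data.txt actually has a "Sequence length" line, since the log shows the file being pre-filled with a "No reads in ..." placeholder before fastqc runs. The sketch below is a hypothetical Python helper, not part of circompara2. It assumes the usual tab-separated layout of FastQC's Basic Statistics lines.

```python
def sequence_length_from_text(text):
    """Return the value of the 'Sequence length' line, or None if it is absent."""
    for line in text.splitlines():
        if line.startswith("Sequence length"):
            fields = line.split("\t")
            # The value is expected in the second tab-separated field.
            return fields[1] if len(fields) > 1 else None
    return None


def sequence_length_from_file(path):
    """Read a fastqc_data.txt file and look up its 'Sequence length' value."""
    with open(path) as handle:
        return sequence_length_from_text(handle.read())
```

If the function returns None for a file such as samples/sample_B/read_statistics/fastqc_stats/readsB_1_fastqc/fastqc_data.txt, the file is probably still the placeholder, which would point to the fastqc step (check the matching .err file) rather than to the R script.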
The error "Error in get(genname, envir = envir) : object 'testthat_print' not found" could be the culprit. Please check this thread: https://github.com/r-lib/rlang/issues/1112
@Kingatsu how did you solve the STAR issue? I am facing the same problem in the test run:
WARNING: File samples/sample_A/processings/circRNAs/star_out/Chimeric.out.junction is empty!
Junction files seem empty, skipping circRNA detection module.
circRNA detection skipped due to empty junction files
Filter mode for detected circRNAs enabled without detection module. Combine with -f or -D.
scons: *** [samples/sample_A/processings/circRNAs/dcc/CircRNACount] Error 1
scons: building terminated because of errors.
Well, circompara2 ran smoothly when I installed it in the root directory (/) or on a real Linux server. For reference, I had installed Ubuntu as a subsystem on my PC, and circompara2 does not work well on the other mounted disks. So I guess that differences between the file system of the mounted disks and that of the root directory cause the issue. I hope my reply helps you.
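To test the file-system guess above, one can look up which file system holds the working directory. This is a minimal Python sketch for Linux, with a hypothetical helper name. It assumes mount information in the /proc/mounts format (device, mount point, file-system type, options).

```python
import os


def filesystem_type(path, mounts_text):
    """Return the type of the file system holding `path`, or None if unknown.

    `mounts_text` is text in /proc/mounts format.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Keep the longest (most specific) matching mount point;
            # on ties the later entry wins, as it was mounted on top.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type
```

A call such as `filesystem_type(os.getcwd(), open("/proc/mounts").read())` reports the type for the current directory. A result other than a native Linux file system (for example a network or host-shared mount) would be consistent with the behaviour described above.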
There are some issues when running STAR from a container or in a shared filesystem (e.g. NFS) because of temporary files.
Setting the --outTmpDir STAR parameter to a custom directory solved the problem of the empty Chimeric.out.junction.
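To do that with the CirComPara2 container, you'll have to do the following:
1) create a new temporary directory on the host;
2) mount that directory as a volume into the container;
3) set --outTmpDir in the STAR_PARAMS variable of vars.py.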
Points 1) and 2) can be done from the command line. The command line to launch CirComPara2 will be like this:
#!/bin/bash
## create a new temp dir
MYTMPDIR=$(mktemp -d)
## delete the tmp dir when the script exits; the trap is set before the run so it also fires if the run is interrupted
trap 'rm -rf "$MYTMPDIR"' EXIT
## mind the new tmp dir is mounted as a volume into the container /ttmmpp dir
docker run -u "$(id -u):$(id -g)" --rm -it -v "$MYTMPDIR":/ttmmpp -v "$(pwd)":/data egaffo/circompara2:v0.1.2.1
And the STAR_PARAMS will be set in the vars.py as follows:
STAR_PARAMS = '--outTmpDir /ttmmpp/$SAMPLE '\
'--runRNGseed 123 '\
'--outSJfilterOverhangMin 15 15 15 15 '\
'--alignSJoverhangMin 15 '\
'--alignSJDBoverhangMin 15 '\
'--seedSearchStartLmax 30 '\
'--outFilterScoreMin 1 '\
'--outFilterMatchNmin 1 '\
'--outFilterMismatchNmax 2 '\
'--chimSegmentMin 15 '\
'--chimScoreMin 15 '\
'--chimScoreSeparation 10 '\
'--chimJunctionOverhangMin 15'
Mind that STAR_PARAMS also has to specify the other parameters that are defaults in CirComPara2, because STAR_PARAMS overrides the default values. N.B.: STAR_PARAMS is a one-line Python string; here, I've just split it across multiple lines to improve readability.
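Since the value must end up as a single line, one option is to keep the options in a list and join them. This is only a sketch: `DEFAULT_STAR_OPTIONS` and `build_star_params` are hypothetical names, the option values are the ones quoted in this thread, and it assumes vars.py is evaluated as ordinary Python.

```python
# Options quoted in the thread, one per entry so each stays easy to read.
DEFAULT_STAR_OPTIONS = [
    "--runRNGseed 123",
    "--outSJfilterOverhangMin 15 15 15 15",
    "--alignSJoverhangMin 15",
    "--alignSJDBoverhangMin 15",
    "--seedSearchStartLmax 30",
    "--outFilterScoreMin 1",
    "--outFilterMatchNmin 1",
    "--outFilterMismatchNmax 2",
    "--chimSegmentMin 15",
    "--chimScoreMin 15",
    "--chimScoreSeparation 10",
    "--chimJunctionOverhangMin 15",
]


def build_star_params(tmp_dir, options=DEFAULT_STAR_OPTIONS):
    """Join --outTmpDir and the other options into one space-separated line."""
    return " ".join(["--outTmpDir " + tmp_dir] + list(options))


STAR_PARAMS = build_star_params("/ttmmpp/$SAMPLE")
```

The result is the same one-line string as the backslash-continued literal, so either form can be used. The list form mainly makes it harder to drop the trailing space between two options.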
When I run the test, with either
cd test_circompara/analysis
../../circompara2
or
cd test_circompara/analysis_se
../../circompara2
it reports this error. I installed circompara2 in a conda python=2.7 env, as Alipe2021 did in #3. I am new to this, so I don't know whether it is installed properly or whether this is a normal result. Can anyone give me a hand? Thanks!