egaffo / circompara2

Improved bioinformatic pipeline to identify and quantify circRNA expression from RNA-seq data by combining multiple circRNA detection methods

scons: building terminated because of errors. #15

Open pecoraro90 opened 1 year ago

pecoraro90 commented 1 year ago

Hi egaffo, I am having the following error message:

[M::mem_process_seqs] Processed 1568628 reads in 49.756 CPU sec, 12.022 real sec
[M::process] read 1568628 sequences (80000028 bp)...
[M::mem_process_seqs] Processed 1568628 reads in 48.726 CPU sec, 8.547 real sec
[M::process] read 1568628 sequences (80000028 bp)...
[M::mem_process_seqs] Processed 1568628 reads in 51.428 CPU sec, 13.630 real sec
[M::process] read 1568628 sequences (80000028 bp)...
[M::mem_process_seqs] Processed 1568628 reads in 50.021 CPU sec, 13.461 real sec
[M::process] read 1215784 sequences (62004984 bp)...
[M::mem_process_seqs] Processed 1215784 reads in 38.014 CPU sec, 7.662 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 8 -T 19 /data/dbs/indexes/indexes/bwa/Homo_sapiens.hg38.dna_sm.chromosome.Y /data/samples/S1/processings/hisat2_out/S1_unmapped.fastq.gz
[main] Real time: 188.473 sec; CPU: 654.355 sec
[SEGEMEHL] Tue Feb 7 10:37:11 2023: threaded matching w/ suffixarray has taken 1046.000000 seconds.
[SEGEMEHL] Tue Feb 7 10:37:11 2023: Mapping stats:
     total     mapped  (%)    unique  (%)    multi  (%)    split  (%)
all  20039320  957     0.00%  923     0.00%  34     0.00%  861    0.00%
[SEGEMEHL] Tue Feb 7 10:37:11 2023: Goodbye.
"Hab' ich gerade inner Bild gelesen!" (Bienchen)
scons: building terminated because of errors.

I am running the pipeline with only one chromosome. However, I have already run the pipeline successfully with other samples (same single-chromosome genome, same single-chromosome GTF, etc.). I have no idea what is going on; the error message is very generic. Could you help me with this?

egaffo commented 1 year ago

The error is not in the message you posted. You are probably running parallel tasks (i.e. using the "-j X" option with X > 1), so the actual error appeared above the part of the log you posted. You can either scroll up the log and look for an error message, or run CirComPara2 again with "-j 1", so that it executes just one task at a time and stops right after the failing task. N.B. do not clear your project directory before re-running CirComPara2: it will perform only the "missing" tasks, which will save you time.
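For anyone who prefers to search a saved log rather than scroll through it, here is a minimal Python sketch. It assumes the run output was saved to a text file, and it only looks for the kinds of error lines that show up in this thread (a scons failed-target line, an R "Error" line, and "Execution halted"). The function name `find_error_lines` and the pattern list are illustrative assumptions, not part of CirComPara2, and other tools may word their errors differently.

```python
import re

# Patterns based only on the log excerpts in this thread (an assumption);
# extend the list for tools that report errors differently.
ERROR_PATTERNS = [
    re.compile(r"^scons: \*\*\* "),    # scons naming the target that failed
    re.compile(r"^Error\b"),           # R errors such as "Error in eval(...)"
    re.compile(r"^Execution halted"),  # R aborting the script
]


def find_error_lines(log_text):
    """Return (line_number, line) pairs for lines that look like errors."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        stripped = line.strip()
        if any(pattern.search(stripped) for pattern in ERROR_PATTERNS):
            hits.append((number, stripped))
    return hits
```

Calling `find_error_lines(open(path).read())` on the saved log and reading the earliest hits should point to the task that failed first. The generic closing line "scons: building terminated because of errors." is deliberately not matched, since it does not say which task failed.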

pecoraro90 commented 1 year ago

I tried and it gave me this error:

Feb 07 13:13:41 ..... started sorting BAM
Feb 07 13:13:42 ..... finished successfully
samtools view -F 4 samples/S1/processings/circRNAs/star_out/Aligned.sortedByCoord.out.bam | cut -f 1 | sort | uniq | wc -l > samples/S1/processings/circRNAs/star_out/STAR_mapped_reads_count.txt
DCC -fg -M -F -Nr 1 1 -N -T 8 -D -O samples/S1/processings/circRNAs/dcc -t samples/S1/processings/circRNAs/dcc/_tmp_DCC samples/S1/processings/circRNAs/star_out/Chimeric.out.junction
Output folder samples/S1/processings/circRNAs/dcc already exists, reusing
DCC 0.4.8 started
8 CPU cores available, using 8
WARNING: non-stranded data, the strand of circRNAs guessed from the strand of host genes
started circRNA detection from file samples/S1/processings/circRNAs/star_out/Chimeric.out.junction
=> locating circRNAs (unstranded mode) [samples/S1/processings/circRNAs/star_out/Chimeric.out.junction]
=> sorting circRNAs (unstranded mode) [samples/S1/processings/circRNAs/star_out/Chimeric.out.junction]
finished circRNA detection from file samples/S1/processings/circRNAs/star_out/Chimeric.out.junction
Combining individual circRNA read counts
Using files _tmp_DCC/tmp_circCount and _tmp_DCC/tmp_coordinates for filtering
Filtering by read counts
Remove ChrM
Not deleting output folder samples/S1/processings/circRNAs/dcc/: contains files
Temporary files deleted
dcc_fix_strand.R -c samples/S1/processings/circRNAs/dcc/CircRNACount -d samples/S1/processings/circRNAs/dcc/CircCoordinates -o samples/S1/processings/circRNAs/dcc/strandedCircRNACount
dcc_compare.R -l S1 -i samples/S1/processings/circRNAs/dcc/strandedCircRNACount -o circular_expression/circrna_collection/merged_samples_circrnas/dcc_compared.csv
filter_segemehl.R -i samples/S1/processings/circRNAs/segemehl/S1_unmapped.fastq.sngl.bed -t samples/S1/processings/circRNAs/segemehl/S1_unmapped.fastq.trns.txt -q median_1 -o samples/S1/processings/circRNAs/testrealign/splicesites.bed -r samples/S1/processings/circRNAs/testrealign/S1.circular.reads.bed.gz -l samples/S1/processings/circRNAs/testrealign/S1.old.segemehl.format.bed
Warning message:
In fread(cmd = paste0("grep \";B\|C;\" ", input), header = F, skip = 1) :
  File '/tmp/RtmpbrRHEE/file28eab30dff' has size 0. Returning a NULL data.table.
Error in eval(bysub, x, parent.frame()) : object 'chr' not found
Calls: write.table -> is.data.frame -> [ -> [.data.table -> eval -> eval
Execution halted
scons: *** [samples/S1/processings/circRNAs/testrealign/splicesites.bed] Error 1

scons: building terminated because of errors.

egaffo commented 1 year ago

I see that the filter_segemehl.R script failed, probably because Segemehl mapped almost no reads to your chromosome (see the "Mapping stats" in your previous post), or because in the end it found no backspliced reads. You may try dropping segemehl and circexplorer_se from the circRNA methods, or adding the "-k" option to the command line (for instance after the -j: "-j 4 -k"); it tells CirComPara2 to keep running as much as possible even when some error occurs. As a last resort, use the "-i" option instead of -k. The -i option forces the run to continue regardless of failed tasks... but you probably won't get reliable results this way. Sorry, CirComPara2 does not handle these situations gracefully... If there are no circRNA reads mapping to your chromosome, CirComPara2 will eventually crash; it does not produce an empty output when nothing is found.
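To check whether this is the situation before re-running, one could count the lines that the failing step would have selected. The sketch below mirrors the grep shown in the R warning (lines containing ";B" or "C;"). It assumes that the grep is applied to the file passed to filter_segemehl.R with -i (the .sngl.bed file in the log), which the log does not state explicitly; the function name `count_tagged_lines` is a hypothetical helper, not part of CirComPara2.

```python
import re

# Same two alternatives as the grep in the R warning: ";B" or "C;".
# What these tags mean in Segemehl's output is not covered here.
TAG_PATTERN = re.compile(r";B|C;")


def count_tagged_lines(path):
    """Count lines in a text file that contain ';B' or 'C;'."""
    count = 0
    with open(path) as handle:
        for line in handle:
            if TAG_PATTERN.search(line):
                count += 1
    return count
```

If `count_tagged_lines` returns 0 for the sample's file, the R script would presumably receive an empty table, as in the log above, and dropping segemehl from the circRNA methods for that run would be the more predictable workaround.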

pecoraro90 commented 1 year ago

Hi Enrico, I understand. I was starting to suspect this, given that my experimental design is very narrow and my chances of finding circRNAs on that chromosome are very low. Thank you anyway! I guess you are Italian, aren't you? I am Italian too!

egaffo commented 1 year ago

I hope CirComPara2 has been useful to you. Good luck with your experiments and your analyses!