Closed sliva1 closed 6 years ago
Hi @sliva1,
IRFinder is built on GCC 4.9.0, but it can work on 4.8.5 if C++11 features are supported. The empty files result from a failed IRFinder genome preparation, which can be caused by either GCC or other compatibility problems.
Could you please send me the standard error output from the genome preparation stage, so that I can figure out what the real problem is?
Best, Dadi
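A quick way to compare the installed compiler against the 4.8.5 minimum mentioned above could look like the sketch below. The helper name `version_ge` is hypothetical, and it assumes a `sort` that supports `-V` (version sort, as in GNU coreutils). A passing version check does not by itself prove that C++11 features are enabled.

```shell
#!/usr/bin/env bash
# Return success (0) if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed compiler, if any, against the 4.8.5 minimum.
# `gcc -dumpversion` prints only the compiler version string.
if command -v gcc >/dev/null 2>&1; then
    if version_ge "$(gcc -dumpversion)" "4.8.5"; then
        echo "gcc is at least 4.8.5"
    else
        echo "gcc is older than 4.8.5"
    fi
fi
```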
Hi, thanks for the reply! Below is the standard output when I launch IR quantification, but I think the problem occurs when I try to construct the reference. I have some empty files in the IRFinder directory. Do you want the log files?
gzip: stdout: Broken pipe
tee: standard output: Broken pipe
tee: write error
/data/kdi_prod/project_result/1127/03.00/IRFinder/IRFinder-master/bin/IRFinder: line 562: 31822 Done "$STAREXEC" --genomeLoad $STARMEMORYMODE --runThreadN $THREADS --genomeDir "$REF/STAR" --outFilterMultimapNmax 1 --outSAMstrandField intronMotif --outFileNamePrefix "${OUTPUTDIR}/" --outSAMunmapped None --outSAMmode NoQS --outSAMtype BAM Unsorted --outStd BAM_Unsorted --readFilesIn "$FIFO1" "$FIFO2"
31823 Exit 1 | tee "$OUTPUTDIR/Unsorted.bam"
31824 Exit 1 | gzip -cd
31825 Aborted (core dumped) | "$LIBEXEC/irfinder" "$OUTPUTDIR" "$REF/IRFinder/ref-cover.bed" "$REF/IRFinder/ref-sj.ref" "$REF/IRFinder/ref-read-continues.ref" "$REF/IRFinder/ref-ROI.bed" "$OUTPUTDIR/unsorted.frag.bam" >> "$OUTPUTDIR/irfinder.stdout" 2>> "$OUTPUTDIR/irfinder.stderr"
ERROR: IRFinder appears not to have completed. It appears an unknown component crashed.
ERROR: IRFinder appears not to have completed. It appears an unknown component crashed.
ERROR: IRFinder appears not to have completed. It appears an unknown component crashed.
From: Dadi notifications@github.com Sent: Tuesday, 5 June 2018 16:13:09 To: williamritchie/IRFinder Cc: Liva Stephane; Mention Subject: Re: [williamritchie/IRFinder] empty file and IR quantification error (#41)
Hi @sliva1,
The quantification failed because of the empty reference files. We need to check what happened during the reference preparation stage to find a solution. Unfortunately, the current IRFinder doesn't save a log file during reference preparation, so please re-run the reference build and send me everything that is printed to the screen. Thank you!
Best, Dadi
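Since the build itself writes no log, one way to keep the screen output is to wrap the command so that stdout and stderr are both shown and saved. This is only a sketch: `run_logged` and the log file name `buildref.log` are hypothetical, and `PIPESTATUS` requires bash.

```shell
#!/usr/bin/env bash
# Run a command, show its output on screen, and save stdout+stderr to a log.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    local log="$1"
    shift
    # 2>&1 merges stderr into stdout so that tee captures both streams.
    "$@" 2>&1 | tee "$log"
    # PIPESTATUS[0] is the exit status of the command itself, not of tee.
    return "${PIPESTATUS[0]}"
}

# Hypothetical usage: prefix the usual reference build command, e.g.
# run_logged buildref.log bin/IRFinder -m BuildRef -r REF/Mouse-mm9-release67 <remaining arguments>
```

The resulting log file can then be attached to the issue in full.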
Hi Dadi,
OK, thanks. I am re-running the reference build and will have the information tomorrow (I'm in Paris and it is 5pm!!).
I will send you all the information when the run finishes!
Thanks again and have a good day!
Best
Stef
From: Dadi notifications@github.com Sent: Tuesday, 5 June 2018 16:44:39 To: williamritchie/IRFinder Cc: Liva Stephane; Mention Subject: Re: [williamritchie/IRFinder] empty file and IR quantification error (#41)
Hi Dadi, OK, everything works! I have integrated IRFinder into our in-house pipeline! I have a question regarding adapter detection. Can you detect the adapter only in a paired-end experiment (and not in a single-end one)? I see that adapter trimming can be carried out in a single-end experiment when launching IR quantification, but how do you know which adapter has to be removed? I don't know if I'm clear!!
Hi @sliva1 ,
For paired-end data, IRFinder can automatically determine the most likely adapter sequence to trim, taking advantage of read pairs that overlap each other. For single-end data, this approach cannot be applied and the user has to supply the correct adapter manually (-a option). Otherwise, IRFinder will trim the Illumina universal adapters.
Best, Dadi
Hi Dadi, Thanks for the answer ! I have questions:
Here is the output from creating the mm9 reference:
bin/IRFinder -m BuildRef -r REF/Mouse-mm9-release67 -e REF/extra-input-files/RNA.SpikeIn.ERCC.fasta.gz -R REF/extra-input-files/Mouse_mm9_nonPolyA_ROI.bed ftp://ftp.ensembl.org/pub/release-67/gtf/mus_musculus/Mus_musculus.NCBIM37.67.gtf.gz
Launching reference build process. The full build should take at least one hour.
Usage : /data/kdi_prod/.kdi/project_workspace_0/1127/acl/03.00/IRFinder/IRFinder-master/bin/util/IRFinder-BuildRefFromEnsembl mode threads STAR-executable base_ftp_url_of_ensembl_genome+gtf output_directory(must not exist) additional_genome_reference(eg: ERCC) non_polyA_genes-as-bed region_blacklist-as-bed
Usage example: /data/kdi_prod/.kdi/project_workspace_0/1127/acl/03.00/IRFinder/IRFinder-master/bin/util/IRFinder-BuildRefFromEnsembl BuildRef 12 STAR "ftp://ftp.ensembl.org/pub/release-75/fasta/homo_sapiens/dna/" "IRFinder/REF/Human" "Refernce-ERCC.fa.gz" [non_polyA_genes.bed] [blacklist.bed]
Trying to fetch dna.primary_assembly and GTF based on: ftp://ftp.ensembl.org/pub/release-67/gtf/mus_musculus/Mus_musculus.NCBIM37.67.gtf.gz
--2018-06-28 11:07:41-- ftp://ftp.ensembl.org/pub/release-67/fasta/mus_musculus/dna/*.dna.primary_assembly.fa.gz => '.listing'
Resolving ftp.ensembl.org (ftp.ensembl.org)... 193.62.193.8
Connecting to ftp.ensembl.org (ftp.ensembl.org)|193.62.193.8|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /pub/release-67/fasta/mus_musculus/dna ... done.
==> PASV ... done. ==> LIST ... done.
[ <=> ] 5,334 --.-K/s in 0.09s
2018-06-28 11:07:41 (57.4 KB/s) - '.listing' saved [5334]
Removed '.listing'.
No matches on pattern '*.dna.primary_assembly.fa.gz'.

--2018-06-28 11:07:41-- ftp://ftp.ensembl.org/pub/release-67/fasta/mus_musculus/dna/*.dna.toplevel.fa.gz => '.listing'
Resolving ftp.ensembl.org (ftp.ensembl.org)... 193.62.193.8
Connecting to ftp.ensembl.org (ftp.ensembl.org)|193.62.193.8|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /pub/release-67/fasta/mus_musculus/dna ... done.
==> PASV ... done. ==> LIST ... done.
[ <=> ] 5,334 --.-K/s in 0s
2018-06-28 11:07:42 (271 MB/s) - '.listing' saved [5334]
Removed '.listing'.

--2018-06-28 11:07:42-- ftp://ftp.ensembl.org/pub/release-67/fasta/mus_musculus/dna/Mus_musculus.NCBIM37.67.dna.toplevel.fa.gz => 'Mus_musculus.NCBIM37.67.dna.toplevel.fa.gz'
==> CWD not required.
==> PASV ... done. ==> RETR Mus_musculus.NCBIM37.67.dna.toplevel.fa.gz ... done.
Length: 764264371 (729M)
100%[=========================================>] 764,264,371 21.2MB/s in 30s
2018-06-28 11:08:14 (24.3 MB/s) - 'Mus_musculus.NCBIM37.67.dna.toplevel.fa.gz' saved [764264371]

--2018-06-28 11:08:14-- ftp://ftp.ensembl.org/pub/release-67/gtf/mus_musculus/Mus_musculus.NCBIM37.67.gtf.gz => 'Mus_musculus.NCBIM37.67.gtf.gz'
Resolving ftp.ensembl.org (ftp.ensembl.org)... 193.62.193.8
Connecting to ftp.ensembl.org (ftp.ensembl.org)|193.62.193.8|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /pub/release-67/gtf/mus_musculus ... done.
==> SIZE Mus_musculus.NCBIM37.67.gtf.gz ... 12773886
==> PASV ... done. ==> RETR Mus_musculus.NCBIM37.67.gtf.gz ... done.
Length: 12773886 (12M) (unauthoritative)
100%[=========================================>] 12,773,886 12.6MB/s in 1.0s
2018-06-28 11:08:16 (12.6 MB/s) - 'Mus_musculus.NCBIM37.67.gtf.gz' saved [12773886]
Jun 28 11:08:54 ..... started STAR run
Jun 28 11:08:54 ... starting to generate Genome files
Jun 28 11:08:54 ... starting to sort Suffix Array. This may take a long time...
Jun 28 11:08:54 ... sorting Suffix Array chunks and saving them to disk...
Jun 28 11:08:54 ... loading chunks from disk, packing SA...
Jun 28 11:08:54 ... finished generating suffix array
Jun 28 11:08:54 ... generating Suffix Array index
Jun 28 11:08:57 ... completed Suffix Array index
Jun 28 11:08:57 ..... processing annotations GTF

Fatal INPUT FILE error, no exon lines in the GTF file: /data/kdi_prod/.kdi/project_workspace_0/1127/acl/03.00/IRFinder/IRFinder-master/REF/Mouse-mm9-release67/transcripts.gtf
Solution: check the formatting of the GTF file, it must contain some lines with exon in the 3rd column. Make sure the GTF file is unzipped. If exons are marked with a different word, use --sjdbGTFfeatureExon .

Jun 28 11:08:57 ...... FATAL ERROR, exiting
Star genome build result: 26624
Commence STAR mapping run for mapability.
Thu Jun 28 11:08:57 CEST 2018

EXITING because of FATAL ERROR: could not open genome file /data/kdi_prod/.kdi/project_workspace_0/1127/acl/03.00/IRFinder/IRFinder-master/REF/Mouse-mm9-release67/STAR/genomeParameters.txt
SOLUTION: check that the path to genome files, specified in --genomeDir is correct and the files are present, and have user read permsissions

Jun 28 11:08:57 ...... FATAL ERROR, exiting

real 0m0.026s
user 0m0.002s
sys 0m0.003s
Completed STAR run.
Thu Jun 28 11:08:57 CEST 2018
Commence Coverage calculation.
ls: cannot access tmp_by_chr_11865/*.bed.gz: No such file or directory

real 0m0.007s
user 0m0.001s
sys 0m0.002s
cat: tmp_by_chr_11865/*.exclusion: No such file or directory

real 0m0.006s
user 0m0.001s
sys 0m0.003s
rm: cannot remove 'tmp_by_chr_11865/bed.gz.exclusion': No such file or directory
rm: cannot remove 'tmp_by_chr_11865/bed.gz': No such file or directory
Completed coverage exclusion calculation.
Thu Jun 28 11:08:57 CEST 2018
Mapability result: 0
Build Ref 1
Build Ref 2
Build Ref 3
Build Ref 4
Build Ref 5
Build Ref 6
Build Ref 7
Build Ref 8
Build Ref 9
Build Ref 10
Build Ref 11
Build Ref 12
Build Ref 13b
Build Ref 14b
Build Ref 15b
Build Ref 16 - COMPLETE
Ref build result: 0
ALL DONE
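The STAR message in the log above asks for lines with exon in the 3rd column of the GTF file. A minimal sketch for checking that on a plain or gzipped GTF follows. The function name `count_exon_lines` is hypothetical, and the check says nothing about why the file might lack exon records.

```shell
#!/usr/bin/env bash
# Count GTF records whose 3rd (feature) column is exactly "exon".
# Accepts a plain or gzip-compressed GTF file.
count_exon_lines() {
    local gtf="$1"
    # gzip -t succeeds only for valid gzip data, so it selects the reader.
    if gzip -t "$gtf" 2>/dev/null; then
        gzip -cd "$gtf"
    else
        cat "$gtf"
    fi | awk -F'\t' '$3 == "exon" { n++ } END { print n + 0 }'
}

# Hypothetical usage on the file named in the STAR error message:
# count_exon_lines REF/Mouse-mm9-release67/transcripts.gtf
```

A count of 0 would match the "no exon lines" error reported by STAR.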
Hi Dadi, I have a question regarding the trim function. The documentation says that FASTQ files have to be unzipped in order to trim adapters when running quantification. I saw in the main IRFinder script that it unzips the FASTQ files itself before trimming, so do we have to unzip the files first before running quantification to remove adapters? Best, Stef
Hi Stef,
To run the trim script manually/alone, you would have to feed it with unzipped FASTQs. If you're talking about trim as called during IRFinder, it will take care of the unzip process if your inputs are gzipped FASTQs. Apologies for the late reply.
Best, Dadi
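For the standalone case, the FASTQ files can be decompressed beforehand. The sketch below assumes nothing about the trim script's own arguments. It only produces an uncompressed copy of the input, and the name `fastq_plain` and the sample file names are hypothetical.

```shell
#!/usr/bin/env bash
# Write an uncompressed copy of a FASTQ file to stdout.
# Gzipped input is decompressed; plain input is passed through unchanged.
fastq_plain() {
    local fq="$1"
    # gzip -t succeeds only when the file is valid gzip data.
    if gzip -t "$fq" 2>/dev/null; then
        gzip -cd "$fq"
    else
        cat "$fq"
    fi
}

# Hypothetical usage before running the trim utility on its own:
# fastq_plain sample_R1.fastq.gz > sample_R1.fastq
```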
Hi, I get the following message when I try to launch IR quantification (see below). I constructed the reference as shown on the wiki, but in my IRFinder directory I have some empty files:
-bash-4.2$ wc -l *
0 exclude.directional.bed
411 exclude.omnidirectional.bed
0 intergenic.ROI.bed
294679 introns.unique.bed
0 ref-cover.bed
0 ref-read-continues.ref
117 ref-ROI.bed
294679 ref-sj.ref
589886 total
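Empty files like those in the listing above can be caught before quantification is launched. The following is a sketch only: `check_ref_files` is a hypothetical helper, and the file list in the usage comment is taken from the arguments passed to irfinder in the error message, not from any documented requirement.

```shell
#!/usr/bin/env bash
# Report reference files that are missing or empty.
# Usage: check_ref_files DIR FILE [FILE...]
# Returns 0 only if every listed file exists in DIR and has a size > 0.
check_ref_files() {
    local dir="$1" status=0 name
    shift
    for name in "$@"; do
        # -s is true only when the file exists and is non-empty.
        if [ ! -s "$dir/$name" ]; then
            echo "missing or empty: $dir/$name" >&2
            status=1
        fi
    done
    return "$status"
}

# Hypothetical usage with the four reference files read during quantification:
# check_ref_files "$REF/IRFinder" ref-cover.bed ref-sj.ref ref-read-continues.ref ref-ROI.bed
```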
Lastly, I thought I had the right version of GCC, but the version I have is 4.8.5... So I think I have to change the version. Do you think that could explain the error? Best and thanks! Stef
Message from IR quantification:
gzip: stdout: Broken pipe
tee: standard output: Broken pipe
tee: write error
/data/kdi_prod/project_result/1127/03.00/IRFinder/IRFinder-master/bin/IRFinder: line 562: 31822 Done "$STAREXEC" --genomeLoad $STARMEMORYMODE --runThreadN $THREADS --genomeDir "$REF/STAR" --outFilterMultimapNmax 1 --outSAMstrandField intronMotif --outFileNamePrefix "${OUTPUTDIR}/" --outSAMunmapped None --outSAMmode NoQS --outSAMtype BAM Unsorted --outStd BAM_Unsorted --readFilesIn "$FIFO1" "$FIFO2"
31823 Exit 1 | tee "$OUTPUTDIR/Unsorted.bam"
31824 Exit 1 | gzip -cd
31825 Aborted (core dumped) | "$LIBEXEC/irfinder" "$OUTPUTDIR" "$REF/IRFinder/ref-cover.bed" "$REF/IRFinder/ref-sj.ref" "$REF/IRFinder/ref-read-continues.ref" "$REF/IRFinder/ref-ROI.bed" "$OUTPUTDIR/unsorted.frag.bam" >> "$OUTPUTDIR/irfinder.stdout" 2>> "$OUTPUTDIR/irfinder.stderr"
ERROR: IRFinder appears not to have completed. It appears an unknown component crashed.
ERROR: IRFinder appears not to have completed. It appears an unknown component crashed.
ERROR: IRFinder appears not to have completed. It appears an unknown component crashed.