What bowtie is it using? #2

Open Damtagor opened 5 years ago

Damtagor commented 5 years ago

The command that I am using:

perl GFusion.pl.txt -o output1 -r 0 -p 12 -i /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/Bowtie2Index/genome -g /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.gtf -1 test_1.fastq -2 test_2.fastq

Partway through the run, it fails because bowtie is called with the unrecognized option '--reorder' (a bowtie2 option). When I try to use bowtie2 indexes instead, tophat doesn't recognize them because it only searches for bowtie indexes. How did you solve this?

[Tue Dec  4 11:09:40 2018]

[2018-12-04 11:09:40] Beginning TopHat run (v2.1.0)
[2018-12-04 11:09:40] Checking for Bowtie
                  Bowtie version:
[2018-12-04 11:09:40] Checking for Bowtie index files (genome)..
[2018-12-04 11:09:40] Checking for reference FASTA file
[2018-12-04 11:09:40] Generating SAM header for /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome
[2018-12-04 11:10:01] Preparing reads
         left reads: min. length=50, max. length=50, 84131 kept reads (113 discarded)
        right reads: min. length=50, max. length=50, 83725 kept reads (519 discarded)
[2018-12-04 11:10:04] Mapping left_kept_reads to genome genome with Bowtie
Error running bowtie:
/mnt/home/soft/bowtie/programs/x86_64/1.0/bowtie: unrecognized option '--reorder'
Command: /mnt/home/soft/bowtie/programs/x86_64/1.0/bowtie -v 2 -k 20 -m 20 -S -p 12 --reorder --sam-nohead --max /dev/null /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome -

open: No such file or directory
[main_samview] fail to open "output1/accepted_hits.bam" for reading.
open: No such file or directory
[main_samview] fail to open "output1/unmapped.bam" for reading.
[Tue Dec  4 11:10:04 2018]
Warning: Could not find any reads in "output1/un.fastq"
# reads processed: 0
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 0 (0.00%)
No alignments
[samopen] SAM header is present: 195 sequences.
[sam_read1] reference 'ID:Bowtie        VN:1.0.0        CL:"bowtie -p 12 /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome output1/un.fastq -S output1/fusion_out/un.sam"
@SQ     SN:chr3 LN:198295559
@SQ     SN:chr4 LN:190214555
@SQ     SN:chr5 LN:181538259
@SQ     !' is recognized as '*'.
[main_samview] truncated file.
[samopen] SAM header is present: 195 sequences.
[sam_read1] reference 'ID:Bowtie        VN:1.0.0        CL:"bowtie -p 12 /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome output1/un.fastq -S output1/fusion_out/un.sam"
@SQ     SN:chr3 LN:198295559
@SQ     SN:chr4 LN:190214555
@SQ     SN:chr5 LN:181538259
@SQ!' is recognized as '*'.
[main_samview] truncated file.
open: No such file or directory
[main_samview] fail to open "output1/accepted_hits.bam" for reading.
open: No such file or directory
[main_samview] fail to open "output1/accepted_hits.bam" for reading.
[samopen] no @SQ lines in the header.
[sam_read1] missing header? Abort!
[bam_header_read] EOF marker is absent. The input is probably truncated.
[Tue Dec  4 11:10:07 2018]
 Result: No Fusion Genes!  The time elapsed: about 0 hours.

zhaodoctor commented 5 years ago

Thanks for using our tool! GFusion requires a Bowtie 1 index. However, I see that you passed a Bowtie 2 index in your command: NCBI/GRCh38/Sequence/Bowtie2Index/genome. Please replace it and try again. If you don't have a Bowtie 1 index, first generate the corresponding files with the bowtie-build command, as described in the Bowtie manual. If you have any questions, please do not hesitate to contact me.
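As a quick check before rerunning, the index type behind a basename can be inferred from the file suffixes: bowtie-build writes `.ebwt` files (`.ebwtl` for large indexes), while bowtie2-build writes `.bt2` (`.bt2l`). The following is only a minimal sketch; the helper name `index_kinds` is hypothetical and is not part of GFusion, TopHat, or Bowtie.

```python
from pathlib import Path

# Suffixes written by bowtie-build (Bowtie 1) and bowtie2-build (Bowtie 2);
# the trailing "l" variants are the large-index forms.
BOWTIE1_SUFFIXES = (".ebwt", ".ebwtl")
BOWTIE2_SUFFIXES = (".bt2", ".bt2l")


def index_kinds(basename):
    """Return the set of index types found for an index basename.

    `basename` is the value passed to -i, e.g. ".../BowtieIndex/genome".
    This is a hypothetical helper for illustration only.
    """
    base = Path(basename)
    kinds = set()
    if not base.parent.is_dir():
        return kinds
    for path in base.parent.iterdir():
        # Only consider files that belong to this basename, e.g. genome.1.ebwt
        if not path.name.startswith(base.name + "."):
            continue
        if path.name.endswith(BOWTIE1_SUFFIXES):
            kinds.add("bowtie1")
        elif path.name.endswith(BOWTIE2_SUFFIXES):
            kinds.add("bowtie2")
    return kinds
```

Calling `index_kinds("/path/to/BowtieIndex/genome")` should return a set containing `"bowtie1"` if the directory holds a Bowtie 1 index; an empty set means no index files were found for that basename.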

Damtagor commented 5 years ago

Thanks a lot for answering. I had already done that, sorry. I now see that my explanation was incomplete. When I use Bowtie 1 indexes, the script throws an error because Bowtie doesn't recognize the option --reorder, which is exclusive to Bowtie 2. I will post the full error report in a few hours.

Damtagor commented 5 years ago

Excuse me. I checked my first comment and found that I had posted the wrong command. Let me clarify.

I used this command:

perl GFusion.pl.txt -o output1 -r 0 -p 12 -i /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome -g /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.gtf -1 test_1.fastq -2 test_2.fastq

Then an error appeared because Bowtie didn't recognize the option --reorder:

[Tue Dec  4 18:57:08 2018]

[2018-12-04 18:57:08] Beginning TopHat run (v2.1.0)
[2018-12-04 18:57:08] Checking for Bowtie
                  Bowtie version:
[2018-12-04 18:57:10] Checking for Bowtie index files (genome)..
[2018-12-04 18:57:10] Checking for reference FASTA file
[2018-12-04 18:57:10] Generating SAM header for /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome
[2018-12-04 18:57:31] Preparing reads
         left reads: min. length=50, max. length=50, 84131 kept reads (113 discarded)
        right reads: min. length=50, max. length=50, 83725 kept reads (519 discarded)
[2018-12-04 18:57:33] Mapping left_kept_reads to genome genome with Bowtie
Error running bowtie:
bowtie: unrecognized option '--reorder'
Command: bowtie --wrapper basic-0 -v 2 -k 20 -m 20 -S -p 12 --reorder --sam-nohead --max /dev/null /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome -

open: No such file or directory
[main_samview] fail to open "output1/accepted_hits.bam" for reading.
open: No such file or directory
[main_samview] fail to open "output1/unmapped.bam" for reading.
[Tue Dec  4 18:57:33 2018]
Warning: Could not find any reads in "output1/un.fastq"
# reads processed: 0
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 0 (0.00%)
No alignments
[samopen] SAM header is present: 195 sequences.
[sam_read1] reference 'ID:Bowtie        VN:1.1.2        CL:"bowtie --wrapper basic-0 -p 12 /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome output1/un.fastq -S output1/fusion_out/un.sam"
r3      LN:198295559
@SQ     SN:chr4 LN:190214555
@SQ     SN:chr5 LN:181538259
@SQ     !' is recognized as '*'.
[main_samview] truncated file.
[samopen] SAM header is present: 195 sequences.
[sam_read1] reference 'ID:Bowtie        VN:1.1.2        CL:"bowtie --wrapper basic-0 -p 12 /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome output1/un.fastq -S output1/fusion_out/un.sam"
hr3     LN:198295559
@SQ     SN:chr4 LN:190214555
@SQ     SN:chr5 LN:181538259
@SQ!' is recognized as '*'.
[main_samview] truncated file.
open: No such file or directory
[main_samview] fail to open "output1/accepted_hits.bam" for reading.
open: No such file or directory
[main_samview] fail to open "output1/accepted_hits.bam" for reading.
[samopen] no @SQ lines in the header.
[sam_read1] missing header? Abort!
[bam_header_read] EOF marker is absent. The input is probably truncated.
[Tue Dec  4 18:57:37 2018]
 Result: No Fusion Genes!  The time elapsed: about 0 hours.

After this, I used Bowtie2 indexes:

perl GFusion.pl.txt -o output1 -r 0 -p 12 -i /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/Bowtie2Index/genome -g /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/genes.gtf -1 test_1.fastq -2 test_2.fastq

But the script doesn't accept that type of index:

[Tue Dec  4 19:01:33 2018]

[2018-12-04 19:01:33] Beginning TopHat run (v2.1.0)
[2018-12-04 19:01:33] Checking for Bowtie
                  Bowtie version:
[2018-12-04 19:01:33] Checking for Bowtie index files (genome)..
Error: Could not find Bowtie index files (/mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/Bowtie2Index/genome.*.ebwt)
open: No such file or directory
[main_samview] fail to open "output1/accepted_hits.bam" for reading.
open: No such file or directory
[main_samview] fail to open "output1/unmapped.bam" for reading.
[Tue Dec  4 19:01:33 2018]
Could not locate a Bowtie index corresponding to basename "/mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/Bowtie2Index/genome"
Command: bowtie --wrapper basic-0 -p 12 -S /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/Bowtie2Index/genome output1/un.fastq output1/fusion_out/un.sam
[samopen] SAM header is present: 195 sequences.
[sam_read1] reference 'ID:Bowtie        VN:1.1.2        CL:"bowtie --wrapper basic-0 -p 12 /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome output1/un.fastq -S output1/fusion_out/un.sam"
r3      LN:198295559
@SQ     SN:chr4 LN:190214555
@SQ     SN:chr5 LN:181538259
@SQ     !' is recognized as '*'.
[main_samview] truncated file.
[samopen] SAM header is present: 195 sequences.
[sam_read1] reference 'ID:Bowtie        VN:1.1.2        CL:"bowtie --wrapper basic-0 -p 12 /mnt/home/soft/human/data/hg38_illumina/Homo_sapiens/NCBI/GRCh38/Sequence/BowtieIndex/genome output1/un.fastq -S output1/fusion_out/un.sam"
hr3     LN:198295559
@SQ     SN:chr4 LN:190214555
@SQ     SN:chr5 LN:181538259
@SQ!' is recognized as '*'.
[main_samview] truncated file.
open: No such file or directory
[main_samview] fail to open "output1/accepted_hits.bam" for reading.
open: No such file or directory
[main_samview] fail to open "output1/accepted_hits.bam" for reading.
[samopen] no @SQ lines in the header.
[sam_read1] missing header? Abort!
[bam_header_read] EOF marker is absent. The input is probably truncated.
[Tue Dec  4 19:01:35 2018]
 Result: No Fusion Genes!  The time elapsed: about 0 hours.

I don't know exactly how to solve this. Thanks once again, and sorry for the inconvenience.
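Most of the messages in the logs above (missing BAM files, truncated SAM input) are downstream of the first failure, so it can help to pull out just the line where a program rejects an option. The following is a minimal sketch, assuming the message format seen in these logs (`<program>: unrecognized option '<option>'`); the function name is hypothetical.

```python
import re

# Matches lines such as:
#   bowtie: unrecognized option '--reorder'
# The format is taken from the logs in this thread; other tools may differ.
OPTION_ERROR = re.compile(
    r"^(?P<program>\S+): unrecognized option '(?P<option>[^']+)'"
)


def first_rejected_option(log_text):
    """Return (program, option) for the first rejected option, or None."""
    for line in log_text.splitlines():
        match = OPTION_ERROR.match(line.strip())
        if match:
            return match.group("program"), match.group("option")
    return None
```

Run against the log from the Bowtie 1 index attempt, this reports that the `bowtie` executable itself rejected `--reorder`, which points at how bowtie is being invoked rather than at the later samtools messages.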

zhaodoctor commented 5 years ago

I'm afraid this is beyond my ability, because GFusion ran fine when I tested it with tophat (v2.1.0) and bowtie (v1.1.2).

[Fri Dec  7 15:00:41 2018]

[2018-12-07 15:00:45] Beginning TopHat run (v2.1.0)
[2018-12-07 15:00:45] Checking for Bowtie
Bowtie version:
[2018-12-07 15:00:48] Checking for Bowtie index files (genome)..
[2018-12-07 15:00:48] Checking for reference FASTA file
[2018-12-07 15:00:48] Generating SAM header for bowtie1_hg19/hg19
[2018-12-07 15:08:02] Preparing reads
left reads: min. length=50, max. length=50, 84131 kept reads (113 discarded)
right reads: min. length=50, max. length=50, 83725 kept reads (519 discarded)
[2018-12-07 15:08:09] Mapping left_kept_reads to genome hg19 with Bowtie
[2018-12-07 15:08:21] Mapping left_kept_reads_seg1 to genome hg19 with Bowtie (1/2)
[2018-12-07 15:08:28] Mapping left_kept_reads_seg2 to genome hg19 with Bowtie (2/2)
[2018-12-07 15:08:34] Mapping right_kept_reads to genome hg19 with Bowtie
[2018-12-07 15:08:47] Mapping right_kept_reads_seg1 to genome hg19 with Bowtie (1/2)
[2018-12-07 15:08:52] Mapping right_kept_reads_seg2 to genome hg19 with Bowtie (2/2)
[2018-12-07 15:08:58] Searching for junctions via segment mapping
[2018-12-07 15:22:18] Retrieving sequences for splices
[2018-12-07 15:26:28] Indexing splices
[2018-12-07 15:26:37] Mapping left_kept_reads_seg1 to genome segment_juncs with Bowtie (1/2)
[2018-12-07 15:26:39] Mapping left_kept_reads_seg2 to genome segment_juncs with Bowtie (2/2)
[2018-12-07 15:26:41] Joining segment hits
[2018-12-07 15:29:36] Mapping right_kept_reads_seg1 to genome segment_juncs with Bowtie (1/2)
[2018-12-07 15:29:39] Mapping right_kept_reads_seg2 to genome segment_juncs with Bowtie (2/2)
[2018-12-07 15:29:41] Joining segment hits
[2018-12-07 15:32:12] Reporting output tracks
[2018-12-07 15:35:25] A summary of the alignment counts can be found in outfile/align_summary.txt
[2018-12-07 15:35:25] Run complete: 00:34:40 elapsed
[Fri Dec  7 15:36:14 2018]
# reads processed: 33598
# reads with at least one reported alignment: 31889 (94.91%)
# reads that failed to align: 1709 (5.09%)
Reported 31889 alignments to 1 output stream(s)

[2018-12-07 15:42:18] Beginning TopHat run (v2.1.0)
[2018-12-07 15:42:18] Checking for Bowtie
Bowtie version:
[2018-12-07 15:42:19] Checking for Bowtie index files (genome)..
[2018-12-07 15:42:19] Checking for reference FASTA file
[2018-12-07 15:42:19] Generating SAM header for outfile/fusion_out/ref/index/re
[2018-12-07 15:42:20] Preparing reads
left reads: min. length=50, max. length=50, 148 kept reads (0 discarded)
right reads: min. length=50, max. length=50, 148 kept reads (0 discarded)
[2018-12-07 15:42:21] Mapping left_kept_reads to genome re with Bowtie
[2018-12-07 15:42:22] Mapping right_kept_reads to genome re with Bowtie
Warning: junction database is empty!
[2018-12-07 15:42:24] Reporting output tracks
[2018-12-07 15:42:25] A summary of the alignment counts can be found in outfile/fusion_out/final/align_summary.txt
[2018-12-07 15:42:25] Run complete: 00:00:07 elapsed
[Fri Dec  7 15:42:26 2018] Completed successfully!  The time elapsed: about 0.69 hours.

The option '--reorder' does not appear anywhere in the GFusion code. I found that the error occurs when running this command:

tophat -o out_file --bowtie1 -p 12 -r 0 -I 100000 --no-coverage-search /path/to/bowtie1_index PE_reads_1.fastq PE_reads_2.fastq

Please run the above command yourself; if you get the same error message, the error comes from TopHat rather than from GFusion. I also searched Google for 'bowtie --reorder' and 'bowtie2 --reorder': the option '--reorder' belongs to bowtie2, not bowtie1. So I think your tophat is treating bowtie1 as bowtie2. Sorry for my limited ability; I suggest you submit this question to the TopHat developers.
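Since the two failing logs show different bowtie executables (one under /mnt/home/soft/bowtie/programs/x86_64/1.0/ reporting VN:1.0.0, another reporting VN:1.1.2), it may be worth listing every copy that is visible on PATH before reporting upstream. The following is a minimal sketch using only the Python standard library; `find_on_path` is a hypothetical helper, and it assumes that tophat locates bowtie through PATH.

```python
import os


def find_on_path(name, path=None):
    """Return every executable called `name` on PATH, in lookup order.

    Hypothetical helper: the first entry is the one a plain `name`
    invocation would normally resolve to.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Print all visible copies of the tools involved in this thread.
    for tool in ("bowtie", "bowtie2", "tophat"):
        print(tool, find_on_path(tool))
```

If more than one bowtie is listed, comparing the output of `bowtie --version` for each copy would show which release is actually being picked up.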