xiaofengsong / GFusion

GFusion is a software package to detect fusion genes using RNA-Seq data

Cannot find accepted_hits.bam #1

Open murphycj opened 7 years ago

murphycj commented 7 years ago

I am trying to test GFusion on a small dataset. The reference genome I downloaded is hg19 from http://ccb.jhu.edu/software/tophat/igenomes.shtml.

The command I am using:

~/chm2059/lib/perl-5.24.1/bin/perl GFusion-master/GFusion.pl -o test -p 4 -i ./Homo_sapiens/UCSC/hg19/Sequence/BowtieIndex/genome -g ./Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf -1 ../data/reads_1.fastq -2 ../data/reads_2.fastq

This is the log output. It seems to end in an error.

[Sat Aug 26 11:23:50 2017]

[2017-08-26 11:23:50] Beginning TopHat run (v2.0.13)
-----------------------------------------------
[2017-08-26 11:23:50] Checking for Bowtie
          Bowtie version:    1.1.1.0
[2017-08-26 11:23:50] Checking for Bowtie index files (genome)..
[2017-08-26 11:23:50] Checking for reference FASTA file
[2017-08-26 11:23:50] Generating SAM header for ./Homo_sapiens/UCSC/hg19/Sequence/BowtieIndex/genome
[2017-08-26 11:23:52] Preparing reads
     left reads: min. length=92, max. length=100, 906 kept reads (0 discarded)
    right reads: min. length=92, max. length=100, 906 kept reads (0 discarded)
[2017-08-26 11:23:52] Mapping left_kept_reads to genome genome with Bowtie 
[2017-08-26 11:23:53] Mapping left_kept_reads_seg1 to genome genome with Bowtie (1/4)
[2017-08-26 11:23:55] Mapping left_kept_reads_seg2 to genome genome with Bowtie (2/4)
[2017-08-26 11:23:57] Mapping left_kept_reads_seg3 to genome genome with Bowtie (3/4)
[2017-08-26 11:23:59] Mapping left_kept_reads_seg4 to genome genome with Bowtie (4/4)
[2017-08-26 11:24:01] Mapping right_kept_reads to genome genome with Bowtie 
[2017-08-26 11:24:02] Mapping right_kept_reads_seg1 to genome genome with Bowtie (1/4)
[2017-08-26 11:24:04] Mapping right_kept_reads_seg2 to genome genome with Bowtie (2/4)
[2017-08-26 11:24:05] Mapping right_kept_reads_seg3 to genome genome with Bowtie (3/4)
[2017-08-26 11:24:07] Mapping right_kept_reads_seg4 to genome genome with Bowtie (4/4)
[2017-08-26 11:24:08] Searching for junctions via segment mapping
[2017-08-26 11:25:24] Retrieving sequences for splices
[2017-08-26 11:26:53] Indexing splices
[2017-08-26 11:26:54] Mapping left_kept_reads_seg1 to genome segment_juncs with Bowtie (1/4)
[2017-08-26 11:26:54] Mapping left_kept_reads_seg2 to genome segment_juncs with Bowtie (2/4)
[2017-08-26 11:26:54] Mapping left_kept_reads_seg3 to genome segment_juncs with Bowtie (3/4)
[2017-08-26 11:26:54] Mapping left_kept_reads_seg4 to genome segment_juncs with Bowtie (4/4)
[2017-08-26 11:26:54] Joining segment hits
[2017-08-26 11:28:02] Mapping right_kept_reads_seg1 to genome segment_juncs with Bowtie (1/4)
[2017-08-26 11:28:02] Mapping right_kept_reads_seg2 to genome segment_juncs with Bowtie (2/4)
[2017-08-26 11:28:03] Mapping right_kept_reads_seg3 to genome segment_juncs with Bowtie (3/4)
[2017-08-26 11:28:03] Mapping right_kept_reads_seg4 to genome segment_juncs with Bowtie (4/4)
[2017-08-26 11:28:03] Joining segment hits
[2017-08-26 11:29:10] Reporting output tracks
-----------------------------------------------
[2017-08-26 11:30:24] A summary of the alignment counts can be found in test/align_summary.txt
[2017-08-26 11:30:24] Run complete: 00:06:33 elapsed
[Sat Aug 26 11:30:24 2017]
# reads processed: 1124
# reads with at least one reported alignment: 768 (68.33%)
# reads that failed to align: 356 (31.67%)
Reported 768 alignments to 1 output stream(s)
[samopen] SAM header is present: 25 sequences.
[samopen] SAM header is present: 25 sequences.
[samopen] SAM header is present: 25 sequences.
Settings:
  Output files: "test/fusion_out/ref/index/re.*.ebwt"
  Line rate: 6 (line is 64 bytes)
  Lines per side: 1 (side is 64 bytes)
  Offset rate: 5 (one in 32)
  FTable chars: 10
  Strings: unpacked
  Max bucket size: default
  Max bucket size, sqrt multiplier: default
  Max bucket size, len divisor: 4
  Difference-cover sample period: 1024
  Endianness: little
  Actual local endianness: little
  Sanity checking: disabled
  Assertions: disabled
  Random seed: 0
  Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
  test/fusion_out/ref/index/re.fa
Reading reference sizes
  Time reading reference sizes: 00:00:00
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
  Time to join reference sequences: 00:00:00
bmax according to bmaxDivN setting: 19000
Using parameters --bmax 14250 --dcv 1024
  Doing ahead-of-time memory usage test
  Passed!  Constructing with these parameters: --bmax 14250 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
  Building sPrime
  Building sPrimeOrder
  V-Sorting samples
  V-Sorting samples time: 00:00:00
  Allocating rank array
  Ranking v-sort output
  Ranking v-sort output time: 00:00:00
  Invoking Larsson-Sadakane on ranks
  Invoking Larsson-Sadakane on ranks time: 00:00:00
  Sanity-checking and returning
Building samples
Reserving space for 12 sample suffixes
Generating random suffixes
QSorting 12 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 12 samples
  (Using difference cover)
  Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
  Binary sorting into buckets
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Binary sorting into buckets time: 00:00:00
Splitting and merging
  Splitting and merging time: 00:00:00
Split 2, merged 6; iterating...
  Binary sorting into buckets
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Binary sorting into buckets time: 00:00:00
Splitting and merging
  Splitting and merging time: 00:00:00
Avg bucket size: 9499.25 (target: 14249)
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering Ebwt loop
Getting block 1 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 3825
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 3826
Getting block 2 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 13084
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 13085
Getting block 3 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 6176
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 6177
Getting block 4 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 8401
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 8402
Getting block 5 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 12968
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 12969
Getting block 6 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 4827
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 4828
Getting block 7 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 14099
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 14100
Getting block 8 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 12614
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 12615
Exited Ebwt loop
fchr[A]: 0
fchr[C]: 20142
fchr[G]: 36763
fchr[T]: 54166
fchr[$]: 76001
Exiting Ebwt::buildToDisk()
Returning from initFromVector
Wrote 4216280 bytes to primary EBWT file: test/fusion_out/ref/index/re.1.ebwt
Wrote 9508 bytes to secondary EBWT file: test/fusion_out/ref/index/re.2.ebwt
Re-opening _in1 and _in2 as input streams
Returning from Ebwt constructor
Headers:
    len: 76001
    bwtLen: 76002
    sz: 19001
    bwtSz: 19001
    lineRate: 6
    linesPerSide: 1
    offRate: 5
    offMask: 0xffffffe0
    isaRate: -1
    isaMask: 0xffffffff
    ftabChars: 10
    eftabLen: 20
    eftabSz: 80
    ftabLen: 1048577
    ftabSz: 4194308
    offsLen: 2376
    offsSz: 9504
    isaLen: 0
    isaSz: 0
    lineSz: 64
    sideSz: 64
    sideBwtSz: 56
    sideBwtLen: 224
    numSidePairs: 170
    numSides: 340
    numLines: 340
    ebwtTotLen: 21760
    ebwtTotSz: 21760
    reverse: 0
Total time for call to driver() for forward index: 00:00:00
Reading reference sizes
  Time reading reference sizes: 00:00:00
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
  Time to join reference sequences: 00:00:00
bmax according to bmaxDivN setting: 19000
Using parameters --bmax 14250 --dcv 1024
  Doing ahead-of-time memory usage test
  Passed!  Constructing with these parameters: --bmax 14250 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
  Building sPrime
  Building sPrimeOrder
  V-Sorting samples
  V-Sorting samples time: 00:00:00
  Allocating rank array
  Ranking v-sort output
  Ranking v-sort output time: 00:00:00
  Invoking Larsson-Sadakane on ranks
  Invoking Larsson-Sadakane on ranks time: 00:00:00
  Sanity-checking and returning
Building samples
Reserving space for 12 sample suffixes
Generating random suffixes
QSorting 12 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 12 samples
  (Using difference cover)
  Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
  Binary sorting into buckets
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Binary sorting into buckets time: 00:00:00
Splitting and merging
  Splitting and merging time: 00:00:00
Split 1, merged 5; iterating...
  Binary sorting into buckets
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Binary sorting into buckets time: 00:00:00
Splitting and merging
  Splitting and merging time: 00:00:00
Avg bucket size: 9499.25 (target: 14249)
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering Ebwt loop
Getting block 1 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 4639
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 4640
Getting block 2 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 13015
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 13016
Getting block 3 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 9062
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 9063
Getting block 4 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 13949
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 13950
Getting block 5 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 12846
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 12847
Getting block 6 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 5615
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 5616
Getting block 7 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 12111
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 12112
Getting block 8 of 8
  Reserving size (14250) for bucket
  Calculating Z arrays
  Calculating Z arrays time: 00:00:00
  Entering block accumulator loop:
  10%
  20%
  30%
  40%
  50%
  60%
  70%
  80%
  90%
  100%
  Block accumulator loop time: 00:00:00
  Sorting block of length 4757
  (Using difference cover)
  Sorting block time: 00:00:00
Returning block of 4758
Exited Ebwt loop
fchr[A]: 0
fchr[C]: 20142
fchr[G]: 36763
fchr[T]: 54166
fchr[$]: 76001
Exiting Ebwt::buildToDisk()
Returning from initFromVector
Wrote 4216280 bytes to primary EBWT file: test/fusion_out/ref/index/re.rev.1.ebwt
Wrote 9508 bytes to secondary EBWT file: test/fusion_out/ref/index/re.rev.2.ebwt
Re-opening _in1 and _in2 as input streams
Returning from Ebwt constructor
Headers:
    len: 76001
    bwtLen: 76002
    sz: 19001
    bwtSz: 19001
    lineRate: 6
    linesPerSide: 1
    offRate: 5
    offMask: 0xffffffe0
    isaRate: -1
    isaMask: 0xffffffff
    ftabChars: 10
    eftabLen: 20
    eftabSz: 80
    ftabLen: 1048577
    ftabSz: 4194308
    offsLen: 2376
    offsSz: 9504
    isaLen: 0
    isaSz: 0
    lineSz: 64
    sideSz: 64
    sideBwtSz: 56
    sideBwtLen: 224
    numSidePairs: 170
    numSides: 340
    numLines: 340
    ebwtTotLen: 21760
    ebwtTotSz: 21760
    reverse: 0
Total time for backward call to driver() for mirror index: 00:00:00

[2017-08-26 11:31:17] Beginning TopHat run (v2.0.13)
-----------------------------------------------
[2017-08-26 11:31:17] Checking for Bowtie
          Bowtie version:    1.1.1.0
[2017-08-26 11:31:17] Checking for Bowtie index files (genome)..
[2017-08-26 11:31:17] Checking for reference FASTA file
[2017-08-26 11:31:17] Generating SAM header for test/fusion_out/ref/index/re
Traceback (most recent call last):
  File "/home/chm2059/chm2059/lib/tophat-2.0.13.Linux_x86_64//tophat", line 4088, in <module>
    sys.exit(main())
  File "/home/chm2059/chm2059/lib/tophat-2.0.13.Linux_x86_64//tophat", line 3942, in main
    params.read_params = check_reads_format(params, reads_list)
  File "/home/chm2059/chm2059/lib/tophat-2.0.13.Linux_x86_64//tophat", line 1840, in check_reads_format
    freader=FastxReader(zf.file, params.read_params.color, zf.fname)
  File "/home/chm2059/chm2059/lib/tophat-2.0.13.Linux_x86_64//tophat", line 1585, in __init__
    while hlines>0 and self.lastline[0] not in "@>" :
IndexError: string index out of range
open: No such file or directory
[main_samview] fail to open "test/fusion_out/final/accepted_hits.bam" for reading.
[Sat Aug 26 11:31:18 2017] Completed successfully!  The time elapsed: about 0.12 hours.

latours commented 7 years ago

I am also encountering this problem, except that in my case it cannot find the index files, for whatever reason.

xiaofengsong commented 5 years ago

This problem may be due to a change in samtools usage: the 'samtools sort -n' command now requires the '-o' option, and we have updated GFusion accordingly. The missing index files may be caused by "Bio::DB::IndexedBase", which creates an index file for genome.fa in the same directory as genome.fa.
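
For anyone comparing the two samtools call styles mentioned above, here is a minimal sketch. It is not GFusion's actual code: `name_sort_cmd` and the file names are hypothetical, and it assumes that samtools 1.3 and later expect `-o FILE` while older releases take an output prefix as the last argument. The function only prints the command it would use, so it can be tried without samtools installed.

```shell
#!/bin/sh
# Sketch only: name_sort_cmd is a hypothetical helper, not part of GFusion.
# It prints (does not run) a name-sort command for a given samtools version.
# Assumption: samtools 1.3+ needs "-o FILE"; older releases take an output prefix.
# Assumption: the version is a plain dotted number such as "0.1.19" or "1.9".
name_sort_cmd() {
    nsc_ver="$1"                  # samtools version string
    nsc_in="$2"                   # input BAM file
    nsc_out="$3"                  # output name without the .bam extension
    nsc_major="${nsc_ver%%.*}"    # text before the first dot
    nsc_rest="${nsc_ver#*.}"      # text after the first dot
    nsc_minor="${nsc_rest%%.*}"   # second version component
    if [ "$nsc_major" -gt 1 ] || { [ "$nsc_major" -eq 1 ] && [ "$nsc_minor" -ge 3 ]; }; then
        # Newer style: the output file is named explicitly with -o.
        echo "samtools sort -n -o ${nsc_out}.bam ${nsc_in}"
    else
        # Legacy style: samtools appends .bam to the output prefix itself.
        echo "samtools sort -n ${nsc_in} ${nsc_out}"
    fi
}

# Example: print both forms for the same (hypothetical) file names.
name_sort_cmd 0.1.19 accepted_hits.bam accepted_hits.namesorted
name_sort_cmd 1.9 accepted_hits.bam accepted_hits.namesorted
```

If the installed samtools is a recent release and the pipeline still uses the prefix form, the sort step would likely fail and leave no sorted BAM for later steps, which would be consistent with the "fail to open" message in the log above.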