barricklab / breseq

breseq is a computational pipeline for finding mutations relative to a reference sequence in short-read DNA resequencing data. It is intended for haploid microbial genomes (<20 Mb). breseq is a command line tool implemented in C++ and R.
http://barricklab.org/breseq
GNU General Public License v2.0

gdtools (v0.37.1) still has a problem with the length difference between the LOCUS line and the source feature in the GenBank file. #331

Closed (ihara920 closed 1 year ago)

ihara920 commented 1 year ago

With the breseq v0.36.1 update, the problem with loading GenBank files whose LOCUS line and source feature have different lengths was completely resolved, and breseq v0.37.1 works well in this respect. However, gdtools from the same version produces warnings like this:

Begin ANNOTATE/COMPARE

Reading input reference sequence files
    /home/DATA/Ref/Salmonella_enterica.gbk

----------------------------------> WARNING <-----------------------------------
Length assigned to sequence 'NC_003197' from LOCUS line (4857450) does not match length previously assigned from source feature (1004278). The larger of the two lengths will be used. If you encounter further errors, make sure LOCUS lengths match the true lengths of your DNA sequences.

----------------------------------> WARNING <-----------------------------------
Length assigned to sequence 'NC_003197' from LOCUS line (4857450) does not match length previously assigned from source feature (1143702). The larger of the two lengths will be used. If you encounter further errors, make sure LOCUS lengths match the true lengths of your DNA sequences.

----------------------------------> WARNING <-----------------------------------
Length assigned to sequence 'NC_003197' from LOCUS line (4857450) does not match length previously assigned from source feature (2776825). The larger of the two lengths will be used. If you encounter further errors, make sure LOCUS lengths match the true lengths of your DNA sequences.

----------------------------------> WARNING <-----------------------------------
Length assigned to sequence 'NC_003197' from LOCUS line (4857450) does not match length previously assigned from source feature (2879237). The larger of the two lengths will be used. If you encounter further errors, make sure LOCUS lengths match the true lengths of your DNA sequences.

Reading input GD file: upx2-1.gd

Reading input GD file: upx2-10.gd

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!> FATAL ERROR <!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Could not open file for reading: upx2-10.gd
FILE: genome_diff.cpp LINE: 59
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!> STACK TRACE <!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Backtrace with 8 stack frames.
gdtools(+0x352a9) [0x55fd147fe2a9]
gdtools(+0xc09d3) [0x55fd148899d3]
gdtools(+0xc1f48) [0x55fd1488af48]
gdtools(+0x7973a) [0x55fd1484273a]
gdtools(+0x7b119) [0x55fd14844119]
gdtools(+0x28dc4) [0x55fd147f1dc4]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7fb4e7e46083]
gdtools(+0x32b69) [0x55fd147fbb69]
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Please tell me a way to resolve this problem.

jeffreybarrick commented 1 year ago

This error is saying that the file upx2-10.gd isn't found. Check that it exists. Maybe you used a hyphen instead of a dash or vice versa. The warnings are giving you information about what breseq is doing, but they aren't causing it to stop.
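
One way to run the check suggested above before calling gdtools is a small standalone script. This is only a sketch and is not part of breseq or gdtools. The function name `check_gd_paths` is hypothetical, and it assumes the same file names that you would pass to gdtools are given on the command line. It reports inputs that do not exist and inputs whose names contain a dash-like character other than the plain ASCII hyphen (for example an en dash pasted from a document).

```python
import sys
import unicodedata
from pathlib import Path


def check_gd_paths(paths):
    """Return (path, problem) tuples for inputs that look wrong.

    Hypothetical helper, not a gdtools feature.
    """
    problems = []
    for raw in paths:
        # Unicode category "Pd" covers hyphen and dash punctuation.
        # Anything in that category other than ASCII "-" is suspicious.
        odd = [c for c in raw if unicodedata.category(c) == "Pd" and c != "-"]
        if odd:
            names = ", ".join(unicodedata.name(c, "UNKNOWN") for c in odd)
            problems.append((raw, f"non-ASCII dash: {names}"))
        # The fatal error in the log corresponds to this case.
        if not Path(raw).is_file():
            problems.append((raw, "file not found"))
    return problems


if __name__ == "__main__":
    for path, problem in check_gd_paths(sys.argv[1:]):
        print(f"{path}: {problem}")
```

Saved under a name of your choice and run with the same list of `.gd` files, it prints one line per problem and nothing if every file can be found. Characters such as the mathematical minus sign are not in the "Pd" category, so this sketch would not flag them.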