nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

issue with headers? #59

Closed el42008 closed 7 years ago

el42008 commented 7 years ago

Hi,

I ran into this issue and wonder if you know what the problem might be. I think it has something to do with the headers, because they might be longer than 16 characters. Is there any way to solve this without having to generate new BAM files?

Traceback (most recent call last):
  File "/mnt/apps/funannotate/bin/funannotate-predict.py", line 223, in <module>
    if not lib.BamHeaderTest(args.input, args.rna_bam):
  File "/mnt/apps/funannotate/lib/library.py", line 457, in BamHeaderTest
    bam_file = pybam.bgunzip(bamin)
  File "/mnt/apps/funannotate/lib/pybam.py", line 88, in __init__
    self.header_text = struct.unpack(str(length_of_header)+'s',first_chunk[8:8+length_of_header])[0]
struct.error: unpack requires a string argument of length 4908302

Thanks a lot

Elena
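One possible reading of the traceback above: the `struct.error` is about the size of the BAM header text (4908302 bytes), not about sequence names longer than 16 characters. `struct.unpack` with an `Ns` format raises when the slice it is given is shorter than N bytes, which could happen if only the first decompressed chunk is read while the header spans several BGZF blocks (each holds at most 64 KiB uncompressed). The sketch below is not pybam's code; `read_bam_header_text` is a hypothetical helper that parses the documented BAM layout (4-byte magic, 32-bit little-endian text length, then the text) and reports the shortfall explicitly.

```python
import struct


def read_bam_header_text(chunk: bytes) -> bytes:
    """Return the SAM header text from the start of a decompressed BAM stream.

    Hypothetical helper for illustration; it is not part of funannotate or pybam.
    """
    # BAM starts with the magic b"BAM\x01" followed by the header text length.
    magic, l_text = struct.unpack("<4sI", chunk[:8])
    if magic != b"BAM\x01":
        raise ValueError("not a BAM stream")
    available = len(chunk) - 8
    if available < l_text:
        # This is the situation in which a bare struct.unpack would fail.
        raise ValueError(
            f"header needs {l_text} bytes but only {available} are available"
        )
    return struct.unpack(f"{l_text}s", chunk[8:8 + l_text])[0]


# Toy stream: a tiny header, then the same stream cut short.
header = b"@HD\tVN:1.6\n@SQ\tSN:scaffold_1\tLN:1000\n"
stream = b"BAM\x01" + struct.pack("<I", len(header)) + header

full = read_bam_header_text(stream)
try:
    read_bam_header_text(stream[:20])
    message = ""
except ValueError as exc:
    message = str(exc)
print(message)
```

If this reading is right, a BAM file with a very large header (for example, many `@SQ` lines from a highly fragmented assembly) would trigger the error regardless of how long the individual contig names are.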

MesYosra commented 1 year ago

Thank you for your answer @nextgenusfs. So even for the annotate module, the headers of the FASTA file must not exceed 16 characters, is that right? And the limit does not take into account the space between the identifier and the description?

nextgenusfs commented 1 year ago

If you aren't submitting to NCBI you can pass --header_length 20 or whatever length you need. Basically, if the header gets too long then it doesn't fit in the NCBI GenBank format and runs into the next column (that is at least my understanding of the character limit).
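To see which headers would exceed a given limit before running the pipeline, a short check along these lines may help. It is only a sketch: `long_fasta_headers` is a hypothetical helper, and it assumes the limit applies to the identifier alone (the text between `>` and the first whitespace), which this thread does not confirm for funannotate.

```python
from typing import Dict, Iterable


def long_fasta_headers(lines: Iterable[str], max_len: int = 16) -> Dict[str, int]:
    """Map each over-long FASTA identifier to its length.

    Assumption: only the identifier (up to the first whitespace) is counted.
    """
    too_long: Dict[str, int] = {}
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split(None, 1)  # identifier, optional description
            ident = parts[0] if parts else ""
            if len(ident) > max_len:
                too_long[ident] = len(ident)
    return too_long


example = [
    ">scaffold_1 Fusarium oxysporum",
    "ACGT",
    ">a_very_long_contig_name_0001",
    "GGCC",
]
result = long_fasta_headers(example)
print(result)
```

For a real file, the same function can be called on an open file handle, e.g. `long_fasta_headers(open("genom.fasta"))`, and `max_len` can be set to whatever value is passed to --header_length.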

MesYosra commented 1 year ago

Thanks for your answer @nextgenusfs. What do you mean by "submitting to NCBI"? I just want to do the functional annotation with annotate. In the documentation, there is no "--header_length" parameter.

MesYosra commented 1 year ago

Should I pass it in the annotate command line, i.e. like this

funannotate annotate --gff "genom.gff3" --fasta "genom.fasta" --header_length 100 --iprscan "file.xml" -d "path_FUNANNOTATE_DB" --species "Fusarium oxysporum" --cpus 10 -o "annoted" ?

Because when I run this I get an error like this:

[Mar 02 06:47 PM]: CMD ERROR: /shared/home/.local/lib/python3.9/site-packages/funannotate/aux_scripts/funannotate-BUSCO2.py -i /shared/home/projects/annotation_fusarium/Funannotate/annoted/annotate_misc/genome.proteins.fa -m proteins -l /shared/home/projects/annotation_fusarium/Funannotate/FUNANNOTATE_DB/dikarya -o busco -c 10 -f

[Mar 02 06:47 PM]: AUGUSTUS_CONFIG_PATH environmental variable not set, exiting

hyphaltip commented 1 year ago

The error message indicates that the Augustus environment variable AUGUSTUS_CONFIG_PATH is not set.

The funannotate Read the Docs documentation is a good place to start reading about how to run the tool, if you haven't had a chance to look over it.

Header length of 100 seems pretty excessive ??


hyphaltip commented 1 year ago

Most researchers deposit genomes into a public database such as GenBank, e.g. the NCBI version of GenBank. Improving the workflows for that was a primary motivation for funannotate.


hyphaltip commented 1 year ago

The new error appears because the run made it through the first screening: the header name is no longer too long, so it now hits a different error. Augustus is used by BUSCO, which is run in annotate. Just set the environment variable, as was done to complete the predict step.
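A quick pre-flight check for the variable named in the log can be sketched as follows. `missing_env_vars` is a hypothetical helper, not part of funannotate, and the correct value for AUGUSTUS_CONFIG_PATH depends on where Augustus is installed on your system.

```python
import os
from typing import List, Mapping


def missing_env_vars(required: List[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


missing = missing_env_vars(["AUGUSTUS_CONFIG_PATH"])
if missing:
    print("Set before running funannotate annotate:", ", ".join(missing))
else:
    print("AUGUSTUS_CONFIG_PATH =", os.environ["AUGUSTUS_CONFIG_PATH"])
```

The variable has to be set in the same shell or job script that launches funannotate, since environment variables are inherited per process.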

On Fri, Mar 3, 2023 at 12:35 AM MesYosra wrote:

Thank you for your answer @hyphaltip, but yes, I read the documentation. When I ran without "--header_length" I got an error message saying that the FASTA file header exceeds 16 characters. The headers of the FASTA file I pass to funannotate annotate exceed 16 characters (identifier + description). So @nextgenusfs suggested adding the "--header_length" parameter (which exists in the code but not in the documentation). But when I do that, I get the error about the Augustus environment variable. I don't understand this error, because Augustus is supposed to be used in the predict module (and not in annotate). Thanks
