Closed: el42008 closed this issue 7 years ago
Thank you for your answer @nextgenusfs. Even for the annotate module, the headers of the FASTA files must not exceed 16 characters, is that right? So it does not take into account the space between the identifier and the description?
If you aren't submitting to NCBI you can pass --header_length 20 or whatever length you need. Basically, if the header gets too long then it doesn't fit in the NCBI GenBank format and runs into the next column (that is at least my understanding of the character limit).
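To see which sequences would trip the length check before running funannotate, the headers can be scanned directly. This is only a minimal sketch, not funannotate's own check: the function name `find_long_headers` and the `count_description` switch are hypothetical, and whether funannotate measures the identifier alone or the whole header line is not settled in this thread, so both options are offered.

```python
def find_long_headers(lines, max_len=16, count_description=False):
    """Return (identifier, measured_length) for FASTA headers over max_len.

    Assumption: by default only the identifier (the text before the first
    whitespace) is measured. Set count_description=True to measure the
    whole header line instead.
    """
    hits = []
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        header = line[1:].strip()
        identifier = header.split(None, 1)[0] if header else ""
        measured = header if count_description else identifier
        if len(measured) > max_len:
            hits.append((identifier, len(measured)))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python check_headers.py genome.fasta
    with open(sys.argv[1]) as handle:
        for name, length in find_long_headers(handle):
            print(f"{name}\t{length}")
```

Running it once with the default and once with `count_description=True` shows whether shortening the identifiers alone would be enough, or whether the descriptions would have to go as well.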
Thanks for your answer @nextgenusfs. What do you mean by "submitting to NCBI"? I just want to do the functional annotation with annotate. In the documentation, there is no "--header_length" parameter.
Should I pass it in the annotate command line, i.e. like this
funannotate annotate --gff "genom.gff3" --fasta "genom.fasta" --header_length 100 --iprscan "file.xml" -d "path_FUNANNOTATE_DB" --species "Fusarium oxysporum" --cpus 10 -o "annoted" ?
Because when I run this I get an error like this:
[Mar 02 06:47 PM]: CMD ERROR: /shared/home/.local/lib/python3.9/site-packages/funannotate/aux_scripts/funannotate-BUSCO2.py -i /shared/home/projects/annotation_fusarium/Funannotate/annoted/annotate_misc/genome.proteins.fa -m proteins -l /shared/home/projects/annotation_fusarium/Funannotate/FUNANNOTATE_DB/dikarya -o busco -c 10 -f
[Mar 02 06:47 PM]: AUGUSTUS_CONFIG_PATH environmental variable not set, exiting
The error message indicates the augustus AUGUSTUS_CONFIG_PATH environment variable is not set.
The funannotate documentation on Read the Docs is a good place to start reading about how to run the tool, if you haven't had a chance to look it over.
Header length of 100 seems pretty excessive ??
Jason Stajich - @.***
Most researchers deposit genomes into the public database GenBank, e.g. the NCBI version of GenBank. Improving the workflows for that was a primary motivation for funannotate.
You see this error because the run made it through the first screening: the header name is no longer too long. Now it hits another error. Augustus is used in BUSCO, which is run in annotate. Just set the environment variable as was done to complete the predict step.
On Fri, Mar 3, 2023 at 12:35 AM MesYosra @.***> wrote:
Thank you for your answer @hyphaltip, but yes, I read the documentation. When I ran without --header_length I got an error message saying that the FASTA header exceeds 16 characters. The headers of the FASTA file I pass to funannotate annotate exceed 16 characters (identifier + description). So @nextgenusfs suggested adding the --header_length parameter (which is not in the documentation but is in the code). But when I do that I get the error about the Augustus environment variable. I don't understand the error, because Augustus is supposed to be used in the predict module (and not annotate). Thanks
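For the AUGUSTUS_CONFIG_PATH error discussed above, a small pre-flight check can confirm the variable is visible in the shell session before launching annotate. This is a rough sketch, not part of funannotate: the function name `check_augustus_config` is hypothetical, and the test for a `species` subdirectory assumes the usual layout of an Augustus config directory.

```python
import os


def check_augustus_config(env=None):
    """Return a problem description, or None if the setting looks usable.

    Assumption: a valid Augustus config directory contains a "species"
    subdirectory, as in a standard Augustus installation.
    """
    env = os.environ if env is None else env
    path = env.get("AUGUSTUS_CONFIG_PATH")
    if not path:
        return "AUGUSTUS_CONFIG_PATH is not set"
    if not os.path.isdir(path):
        return f"AUGUSTUS_CONFIG_PATH points to a missing directory: {path}"
    if not os.path.isdir(os.path.join(path, "species")):
        return f"no 'species' subdirectory found in {path}"
    return None


if __name__ == "__main__":
    problem = check_augustus_config()
    print(problem or "AUGUSTUS_CONFIG_PATH looks fine")
```

If the check reports that the variable is not set, exporting it in the same session (or job script) that runs funannotate annotate, with the same value used for the predict step, should get past this error.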
Hi,
I ran into this issue and I wonder if you know what the problem might be. I think it has something to do with the headers, because they might be longer than 16 characters. Is there any way to solve this without having to generate new BAM files?
Traceback (most recent call last):
  File "/mnt/apps/funannotate/bin/funannotate-predict.py", line 223, in <module>
    if not lib.BamHeaderTest(args.input, args.rna_bam):
  File "/mnt/apps/funannotate/lib/library.py", line 457, in BamHeaderTest
    bam_file = pybam.bgunzip(bamin)
  File "/mnt/apps/funannotate/lib/pybam.py", line 88, in __init__
    self.header_text = struct.unpack(str(length_of_header)+'s',first_chunk[8:8+length_of_header])[0]
struct.error: unpack requires a string argument of length 4908302
Thanks a lot
Elena
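The struct.error above says the buffer handed to unpack was shorter than the declared header length (4908302 bytes), which would be consistent with a BAM header too large for the first chunk that was read; that reading is an inference from the traceback, not something confirmed in this thread. To inspect the reference names in a BAM file independently of pybam, the header can be read with the standard library, following the layout in the SAM/BAM specification. The function names `read_bam_header` and `bam_reference_names` are hypothetical, and this is a sketch rather than a replacement for samtools.

```python
import gzip
import struct


def read_bam_header(stream):
    """Parse the header of an already-decompressed BAM byte stream.

    Returns (header_text_bytes, reference_names).
    """
    def read_exact(n):
        # Keep reading until n bytes arrive, so a header of any size works.
        chunks, remaining = [], n
        while remaining:
            chunk = stream.read(remaining)
            if not chunk:
                raise ValueError("truncated BAM header")
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)

    if read_exact(4) != b"BAM\x01":
        raise ValueError("not a BAM file (bad magic)")
    (l_text,) = struct.unpack("<I", read_exact(4))
    text = read_exact(l_text)
    (n_ref,) = struct.unpack("<I", read_exact(4))
    names = []
    for _ in range(n_ref):
        (l_name,) = struct.unpack("<I", read_exact(4))
        # l_name counts the trailing NUL byte, which is stripped here.
        names.append(read_exact(l_name).rstrip(b"\x00").decode("ascii"))
        read_exact(4)  # l_ref (reference length), not needed here
    return text, names


def bam_reference_names(path):
    # BGZF is a series of gzip members, which the gzip module can read.
    with gzip.open(path, "rb") as handle:
        return read_bam_header(handle)[1]
```

Calling `bam_reference_names` on the BAM file and comparing the returned names (and their lengths) with the FASTA identifiers would show whether the header names are really the issue, or whether the failure comes from the size of the header itself.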