nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

Re-naming fails #56

Closed MichaelFokinNZ closed 7 years ago

MichaelFokinNZ commented 7 years ago

Hi, sorry for bothering you again :) Could you please advise me what to check to find the problem? Renaming (adding the locus tag) works well with the sample data, but not with the real data...

funannotate-predict.log.txt

nextgenusfs commented 7 years ago

What was the result? I don't immediately see an error in the log?

MichaelFokinNZ commented 7 years ago

The locus tag is just absent, even the default one... I can't see any errors either, except one about the number of columns.

nextgenusfs commented 7 years ago

The locus tag is absent from what? All of the gene models? What does the final output GFF3 file look like? Can you run head -n 20 predict_results/name.gff3 and post the output?
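Beyond eyeballing the first 20 lines, the same question can be answered for the whole file by counting, per feature type, how many IDs carry the locus tag. The following is only a minimal sketch, not part of funannotate: the function name locus_tag_summary, the locus tag FUN, and the sample lines are all hypothetical, and it assumes the tag appears as a prefix of the ID attribute followed by an underscore. Note that in GFF3, child features such as exon and CDS are tied to their transcript through the Parent attribute, so an exon ID without the prefix is not necessarily a problem.

```python
from collections import Counter


def locus_tag_summary(gff3_lines, locus_tag):
    """Count, per feature type, how many IDs do or do not start with locus_tag + '_'.

    Hypothetical helper for illustration; assumes the tag is a prefix of the ID.
    """
    tagged = Counter()
    untagged = Counter()
    for line in gff3_lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/directive lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # A GFF3 feature line has exactly nine tab-separated columns.
        if len(cols) != 9:
            continue
        feature_type = cols[2]
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        feature_id = attrs.get("ID", "")
        if feature_id.startswith(locus_tag + "_"):
            tagged[feature_type] += 1
        else:
            untagged[feature_type] += 1
    return tagged, untagged


# Made-up sample lines; with a real file, pass an open file handle instead.
sample = [
    "##gff-version 3",
    "scaffold_1\tfunannotate\tgene\t100\t900\t.\t+\t.\tID=FUN_00001;",
    "scaffold_1\tfunannotate\tmRNA\t100\t900\t.\t+\t.\tID=FUN_00001-T1;Parent=FUN_00001;",
    "scaffold_1\tfunannotate\texon\t100\t900\t.\t+\t.\tID=exon1;Parent=FUN_00001-T1;",
]
tagged, untagged = locus_tag_summary(sample, "FUN")
print("tagged:", dict(tagged))
print("untagged:", dict(untagged))
```

If the gene and mRNA counts all land in the tagged group and only exon or CDS features are untagged, the renaming step most likely did its job.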

MichaelFokinNZ commented 7 years ago

Yes, the gene models just don't have the locus tag up front. Sorry, I will be able to access my computer only in 2 hours; I will send you an example ASAP. Do you need any other intermediate files?

nextgenusfs commented 7 years ago

Just the first 100 lines of the GFF3 output as well as the GBK output so I can understand what is happening. I think tbl2asn would crash if there weren't any locus_tag values.

MichaelFokinNZ commented 7 years ago

Not completely :) it just gives thousands of fatal errors.

MichaelFokinNZ commented 7 years ago

Sorry, the locus tag was actually added to the gene and mRNA names, but not to the CDS and exon features (is it needed there?). So the final discrepancy report is more or less OK, but the tbl2asn report looks crazy. Please find both attached.

Also

nextgenusfs commented 7 years ago

Everything is working the way it is supposed to. Right now you can't change the weights; the PASA weighting suggestion is from Brian Haas, who wrote EvidenceModeler, PASA, and Trinity. The GAG "exon limit" is not what funannotate is passing to the script, it is an intron limit, so I think the stdout message from GAG is wrong here. In the upcoming release I've gotten rid of this argument, as it doesn't seem to make a difference one way or another; EVM will enforce the max intron length limit. The tbl2asn discrepancy report is from the first prediction step, which is basically a "dry run" of tbl2asn; the scripts then parse the error report, change gene names, and fix any gene models that they can. You should only be concerned with the discrepancy report in the predict_results folder. I've also fixed some of the tRNA filtering in the upcoming release, so you shouldn't get that 'tRNA too long' error.