Closed nam-hoang closed 8 months ago
It needs to be GFF3. This file is generated by a script within funannotate from the ZFF that comes from SNAP.
@nextgenusfs can you remember where that conversion happens? I cannot find a suitable funannotate util that would serve this purpose.
Could one instead just copy the .zff file into the predict_misc folder? I think that might trigger regeneration of the gff3 without trying to run the snap step, but I don't quite know.
Here's a working snap-predictions.gff3:
##gff-version 3
scaffold_1 snap gene 2772 4405 . + . ID=scaffold_1-snap.1;
scaffold_1 snap mRNA 2772 4405 . + . ID=scaffold_1-snap.1-T1;Parent=scaffold_1-snap.1;product=[];
scaffold_1 snap exon 2772 2780 . + . ID=scaffold_1-snap.1-T1.exon1;Parent=scaffold_1-snap.1-T1;
scaffold_1 snap exon 2865 2913 . + . ID=scaffold_1-snap.1-T1.exon2;Parent=scaffold_1-snap.1-T1;
scaffold_1 snap exon 3000 3068 . + . ID=scaffold_1-snap.1-T1.exon3;Parent=scaffold_1-snap.1-T1;
scaffold_1 snap exon 3971 4036 . + . ID=scaffold_1-snap.1-T1.exon4;Parent=scaffold_1-snap.1-T1;
scaffold_1 snap exon 4359 4405 . + . ID=scaffold_1-snap.1-T1.exon5;Parent=scaffold_1-snap.1-T1;
scaffold_1 snap CDS 2772 2780 . + 0 ID=scaffold_1-snap.1-T1.cds;Parent=scaffold_1-snap.1-T1;
scaffold_1 snap CDS 2865 2913 . + 0 ID=scaffold_1-snap.1-T1.cds;Parent=scaffold_1-snap.1-T1;
scaffold_1 snap CDS 3000 3068 . + 2 ID=scaffold_1-snap.1-T1.cds;Parent=scaffold_1-snap.1-T1;
scaffold_1 snap CDS 3971 4036 . + 2 ID=scaffold_1-snap.1-T1.cds;Parent=scaffold_1-snap.1-T1;
scaffold_1 snap CDS 4359 4405 . + 2 ID=scaffold_1-snap.1-T1.cds;Parent=scaffold_1-snap.1-T1;
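A quick way to compare a candidate file against the example above is to count the feature types in column 3. The following is a minimal sketch, not part of funannotate: the function names are hypothetical, and "matches" here only means the file contains the same four feature types (gene, mRNA, exon, CDS) as the working example, not that it satisfies any confirmed funannotate specification. Native `snap -gff` output typically carries labels such as Einit, Exon, Eterm, and Esngl in that column instead, which this check would flag.

```python
from collections import Counter

# Feature types seen in the working example above (an assumption about
# what the pipeline needs, based only on that example).
EXAMPLE_TYPES = ("gene", "mRNA", "exon", "CDS")


def count_feature_types(gff_text):
    """Count the values found in column 3 (feature type) of GFF text."""
    counts = Counter()
    for raw in gff_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        # Real GFF3 is tab-delimited; splitting on any whitespace also
        # tolerates copies where the tabs were flattened to spaces.
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue  # not a 9-column feature line
        counts[fields[2]] += 1
    return counts


def matches_example_layout(gff_text):
    """True if every feature type from the working example is present."""
    counts = count_feature_types(gff_text)
    return all(counts[ftype] > 0 for ftype in EXAMPLE_TYPES)
```

Reading a file with `open(path).read()` and passing the text to `count_feature_types` shows at a glance whether it holds gene/mRNA/exon/CDS rows or only SNAP's own exon labels. A file containing just the `##gff-version 3` header yields an empty counter.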
Hi Jason, @hyphaltip
Thank you so much for your suggestions. I copied the file snap-predictions.zff to the predict_misc folder; however, funannotate seems to only check whether snap-predictions.gff3 is there. Otherwise, it starts snap training and rewrites all files.
So, to have a snap-predictions.gff3 that can be recognized by EVM at this step, I found 2 perl scripts from EVM that can convert the SNAP gff format to EVM gff3 format: https://github.com/EVidenceModeler/EVidenceModeler/tree/master/EvmUtils/misc
(1) using SNAP_CDS_to_gff3.pl
./zff2gff3.pl snap-predictions.zff > snap-predictions_CDS.gff3
./SNAP_CDS_to_gff3.pl snap-predictions_CDS.gff3 > snap-predictions_CDSformat_4EVM.gff3
(2) using SNAP_ExonEtermEinitEsngl_gff_to_gff3.pl
./snap -gff snap-trained.hmm genome.softmasked.fa > snap-predictions_SNAPformat.gff3
./SNAP_ExonEtermEinitEsngl_gff_to_gff3.pl snap-predictions_SNAPformat.gff3 > snap-predictions_SNAPformat_4EVM.gff3
After converting, I copied either of these two files to the predict_misc folder, renamed it to snap-predictions.gff3, and it works.
Let me know what you think about this. Thank you very much.
Best regards, Nam
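For readers without the EVM perl scripts at hand, the ZFF-to-GFF3 conversion discussed in this thread can be sketched in Python. This is an illustrative sketch under stated assumptions, not funannotate's internal code and not a port of the EVM scripts. It assumes the 9-column "long" ZFF layout (label, begin, end, strand, score, two overhang columns, frame, group) under `>sequence` header lines, and it assumes every model is complete and starts in frame 0, which will give wrong phases for partial models. The function names are hypothetical, and the `product=[]` attribute from the example file is deliberately not reproduced.

```python
def parse_zff(zff_text):
    """Group ZFF exon records by (sequence name, gene model name)."""
    models = {}
    seq = None
    for raw in zff_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            seq = header[0] if header else None
            continue
        fields = line.split()
        if seq is None or len(fields) < 9:
            continue  # not the 9-column ZFF layout assumed here
        start, end = sorted((int(fields[1]), int(fields[2])))
        strand, group = fields[3], fields[8]
        model = models.setdefault((seq, group), {"strand": strand, "exons": []})
        model["exons"].append((start, end))
    return models


def gff3_row(seq, source, ftype, start, end, strand, phase, attrs):
    """Build one tab-delimited GFF3 feature line with '.' as the score."""
    return "\t".join([seq, source, ftype, str(start), str(end), ".", strand, phase, attrs])


def zff_to_gff3(zff_text, source="snap"):
    """Emit gene/mRNA/exon/CDS rows in the layout of the working example."""
    rows = ["##gff-version 3"]
    for (seq, gene_id), model in parse_zff(zff_text).items():
        strand = model["strand"]
        # Transcription order: ascending on '+', descending on '-'.
        exons = sorted(model["exons"], reverse=(strand == "-"))
        lo = min(s for s, _ in exons)
        hi = max(e for _, e in exons)
        mrna_id = f"{gene_id}-T1"
        rows.append(gff3_row(seq, source, "gene", lo, hi, strand, ".", f"ID={gene_id};"))
        rows.append(gff3_row(seq, source, "mRNA", lo, hi, strand, ".",
                             f"ID={mrna_id};Parent={gene_id};"))
        for i, (s, e) in enumerate(exons, 1):
            rows.append(gff3_row(seq, source, "exon", s, e, strand, ".",
                                 f"ID={mrna_id}.exon{i};Parent={mrna_id};"))
        phase = 0  # assumption: the model starts on a complete codon
        for s, e in exons:
            rows.append(gff3_row(seq, source, "CDS", s, e, strand, str(phase),
                                 f"ID={mrna_id}.cds;Parent={mrna_id};"))
            # Bases to skip at the start of the next CDS segment.
            phase = (3 - (e - s + 1 - phase) % 3) % 3
    return "\n".join(rows) + "\n"
```

The phases are recomputed from the cumulative CDS length rather than read from the ZFF overhang and frame columns, because the exact meaning of those columns is not covered in this thread. Any output should be checked against a known-good file before it is handed to the pipeline.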
Great. I don't remember the internal steps for converting zff to gff3 within funannotate, but there is python code to do it. I am glad you have this fixed. I will see if we can expose this conversion step as a util option in funannotate in the future.
Hi Funannotate Team, Thanks for an amazing tool.
I am running funannotate v1.8.15 installed from mamba, and everything looks great except for one issue: I can't seem to get SNAP prediction to work properly within the pipeline. I keep getting "0 predictions from SNAP". A snap-predictions.gff3 is created, but it is empty apart from the single header line ##gff-version 3.
I already tried the suggestions in #386 by copying forge from the GitHub version into my conda environment, but that did not solve the problem. So I trained SNAP outside the pipeline and copied the resulting snap-predictions.gff3 into the predict_misc/ folder. When I ran it, the pipeline did recognize the gff3 file, i.e., "Existing snap predictions found /predict_misc/snap-predictions.gff3". But here again, I still ended up with "0 predictions from SNAP". I generated this snap-predictions.gff3 by running
snap -gff snap-trained.hmm genome.softmasked.fa
and my gff3 file is ~37 Mb in size and looks like the below:
In the log file, I only found the command
snap snap-trained.hmm genome.softmasked.fa
without the -gff option. This is also mentioned here: https://github.com/nextgenusfs/funannotate/issues/386#issuecomment-591629954, so I wonder if what I did was correct. Could you please advise me whether this gff3 file looks as expected, or how to get the right GFF3 format for the pipeline?
Thank you very much. Best regards, Nam