Open gustavo11 opened 7 years ago
Dear @LindseyBohr, sorry for the delay. I have uploaded a GFF format converter that might work for converting Prokka output to the format accepted by ProphET. Please see the instructions in ProphET's README.md file. Let me know if it doesn't work, and I will address the issue using a different strategy.
@gustavo11 Prokka author here :)
What gene model structure are you expecting in the GFF file?
GFF3 or GTF (2.5)?
We are using the GFF3 format as defined here: https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md
Bacterial GFF files do not usually use the full example gene model described in the GFF3 spec. I would suggest you support the Prokka and NCBI style GFF3 files.
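To see which gene model a given file actually uses, it can help to tally the feature types in column 3. The following is a minimal shell sketch, not part of ProphET or Prokka: the function name `count_feature_types` is made up for illustration, and it assumes a tab-delimited GFF3 file whose annotation lines end where an optional `##FASTA` section begins.

```shell
# Hypothetical helper (not part of ProphET or Prokka): tally the feature
# types found in column 3 of a GFF3 file.
count_feature_types() {
    awk -F '\t' '
        /^##FASTA/ { exit }          # stop at the embedded sequence section
        /^#/ || NF < 9 { next }      # skip comments and malformed lines
        { n[$3]++ }                  # count by feature type
        END { for (t in n) print n[t], t }
    ' "$1" | sort -k2
}
```

Usage: `count_feature_types annotation.gff`. If the tally lists CDS features but no gene or mRNA features, the file is probably using a flat layout rather than the full gene model from the spec.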
The gff_rewrite.pl tool works for most Prokka-generated GFFs (although some I had to re-run without the --compliant flag, as pipes in the FASTA header were being interpreted as pipes in commands, I think around line 167). I tried to add a system call to the main ProphET_standalone.pl script for when parsing the GFF fails:
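    # Process the input files, separating them into one FASTA per GFF.
    # Get the scaffold IDs from the GFF.
    # The try/catch syntax below assumes "use Try::Tiny;" near the top of the script.
    my $gff_handler = GFFFile::new($gff_in);
    try {
        $gff_handler->read();
    } catch {
        warn "caught error: $_";
        # Fall back to rewriting the GFF, then parse the rewritten copy.
        my @args = ("$UTILS_DIR/GFFLib/gff_rewrite.pl", "--input", "$gff_in", "--output", "$oudir/tmp.gff", "--add_missing_features");
        system(@args) == 0
            or die "system @args failed: $?";
        # No "my" here: assign to the outer $gff_handler so that the
        # rewritten file is the one used after the try/catch.
        $gff_handler = GFFFile::new("$oudir/tmp.gff");
        $gff_handler->read();
    };
    my @scaffold_ids = $gff_handler->get_chrom_names();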
but I got the "Can't locate GFFFile.pm in @INC" error when it tries to run gff_rewrite.pl, despite the use lib "$FindBin::Bin/UTILS.dir/GFFLib" line up top. Any tips on how I could get this sorted so I could submit a pull request?
@nickp60 Try exporting the library path before running the script, e.g. 'export PERL5LIB=$UTILS_DIR/GFFLib'.
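Note that `$UTILS_DIR` appears in the thread as a variable inside the Perl script, so it is unlikely to be set in your shell already. A minimal sketch, assuming ProphET is checked out at a hypothetical `$HOME/ProphET` (the `PROPHET_DIR` name is invented here; `UTILS.dir/GFFLib` is the directory named in the `use lib` line quoted above, which presumably holds GFFFile.pm):

```shell
# Assumption: location of the ProphET checkout. PROPHET_DIR is a made-up
# name; override it if your copy lives elsewhere.
PROPHET_DIR="${PROPHET_DIR:-$HOME/ProphET}"
UTILS_DIR="$PROPHET_DIR/UTILS.dir"

# Prepend GFFLib so that Perl, including child scripts started via
# system(), searches it for modules. Existing PERL5LIB entries are kept.
export PERL5LIB="$UTILS_DIR/GFFLib${PERL5LIB:+:$PERL5LIB}"
```

Because the variable is exported, a script launched from ProphET_standalone.pl with `system` inherits it, which is the case the `use lib` line in the parent script does not cover.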
When I try to use Prokka GFF output currently, I get an error along the lines of "The transcript XXXX does not seem to have a parent". Is there any way for me to bypass or fix this?
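One thing to try is the converter mentioned at the top of this thread. The sketch below wraps the same gff_rewrite.pl call and flags (--input, --output, --add_missing_features) that appear in the Perl snippet earlier. The function name `rewrite_prokka_gff` and the `PROPHET_DIR` default are hypothetical, and whether the rewritten file resolves this particular error has not been verified here.

```shell
# Hypothetical wrapper: rewrite a Prokka GFF with ProphET's gff_rewrite.pl.
# Assumes the ProphET checkout is at $PROPHET_DIR (default: $HOME/ProphET).
rewrite_prokka_gff() {
    in_gff="$1"
    out_gff="$2"
    gfflib="${PROPHET_DIR:-$HOME/ProphET}/UTILS.dir/GFFLib"

    # Fail early with a clear message if the converter is not where we expect.
    if [ ! -f "$gfflib/gff_rewrite.pl" ]; then
        echo "gff_rewrite.pl not found in $gfflib" >&2
        return 1
    fi

    # Set PERL5LIB for this one command so the script can load GFFFile.pm.
    PERL5LIB="$gfflib${PERL5LIB:+:$PERL5LIB}" \
        perl "$gfflib/gff_rewrite.pl" \
            --input "$in_gff" \
            --output "$out_gff" \
            --add_missing_features
}
```

Usage: `rewrite_prokka_gff prokka_out.gff prophet_ready.gff`, then pass the rewritten file to ProphET in place of the original.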