AlexanderLabWHOI / EUKulele

Automatic eukaryotic taxonomic classification
MIT License

Issues running EUKulele with .fnn file #37

Closed (sgleich closed this issue 2 years ago)

sgleich commented 3 years ago

Hi! I've been using GeneMark for metatranscriptome protein prediction. The outputs I get from the GeneMark program are a .faa file (protein seqs) and a .fnn file (nucleotide seqs). It seems like EUKulele doesn't like the .fnn files. This is the command I am running and the associated error message I'm getting:

EUKulele --sample_dir path/to/metatranscriptome/samples -m mets

Running EUKulele with command line arguments, as no valid configuration file was provided.
Setting things up...
[]
No samples found in sample directory with specified nucleotide or peptide extension.

(The directory "samples" has my GeneMark .fnn file in it)

Any information on how to resolve this issue is greatly appreciated!

Thank you!

akrinos commented 3 years ago

Hi @sgleich! We do recommend running EUKulele with the protein output: when given nucleotide input, EUKulele translates the sequences to protein under the hood, which substantially increases runtime. However, to get EUKulele to accept your nucleotide sequences, you can use the flag:

--nucleotide_extension or --n_ext

with an argument of .fnn (for example, --n_ext fnn). To explicitly set .faa as the protein extension, you can also use --protein_extension .faa. Hope this sorts things out! Always happy to help with questions.
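
To check which extensions are actually present in the sample directory before rerunning, something like the following Python sketch may help. It is only an illustration: `count_extensions` and `build_eukulele_command` are hypothetical helper names, not part of EUKulele, and the command it assembles uses only the flags quoted in this thread (`--sample_dir`, `-m mets`, `--n_ext`, `--protein_extension`).

```python
from collections import Counter
from pathlib import Path


def count_extensions(sample_dir):
    """Tally the file extensions of files directly inside sample_dir."""
    return Counter(p.suffix for p in Path(sample_dir).iterdir() if p.is_file())


def build_eukulele_command(sample_dir, n_ext=None, p_ext=None):
    """Assemble an EUKulele argument list (hypothetical helper).

    Only flags mentioned in this thread are used; "-m mets" is copied
    from the original command in the question.
    """
    cmd = ["EUKulele", "--sample_dir", str(sample_dir), "-m", "mets"]
    if n_ext is not None:
        cmd += ["--n_ext", n_ext]
    if p_ext is not None:
        cmd += ["--protein_extension", p_ext]
    return cmd


if __name__ == "__main__":
    # Placeholder path taken from the question; replace with a real directory.
    samples = Path("path/to/metatranscriptome/samples")
    if samples.is_dir():
        print(count_extensions(samples))
    # Print the command rather than running it, so it can be inspected first.
    print(" ".join(build_eukulele_command(samples, n_ext="fnn")))
```

If the tally shows only .fnn files, passing the nucleotide extension as described above should let EUKulele find the samples. If the .faa file from GeneMark is also in the directory, using the protein input avoids the extra translation step.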