lczech / gappa

A toolkit for analyzing and visualizing phylogenetic (placement) data
GNU General Public License v3.0

Unexpected end of input file Error for "gappa examine assign" #16

Closed dhwanidesai closed 2 years ago

dhwanidesai commented 2 years ago

Hi lczech, I was trying to assign taxonomy to sequences placed in a tree using the gappa examine assign command. Here is the complete command I used:

gappa examine assign --jplace-path kapilireftree-placement-out/epa_result.jplace --taxon-file kapili-ppit-reference-taxonomy-forGappa-edited.txt --ranks-string "phylum|class|order|family|genus|species" --krona --file-prefix gappaClassification --resolve-missing-paths --verbose --log-file logfile.txt --allow-file-overwriting

As mentioned in the gappa docs, the format of my taxonomy file is as follows:

Archaeoglobus_sp_--15299-16156-QNXS01000077_1-RUM33197_1    Archaea;Euryarchaeota;Archaeoglobi;Archaeoglobales;Archaeoglobaceae;Archaeoglobus
Candidatus_Methanolliviera_sp__GoM_oil--17968-18801-CABGHG010000016_1-VUT24248_1    Archaea;Euryarchaeota;Candidatus Methanoliparia;Candidatus  Methanoliparales;Candidatus Methanollivieraceae;
Candidatus_Methanolliviera_hydrocarbonicum--4870-5691-RXIL01000115_1-RZN68181_1 Archaea;Euryarchaeota;Candidatus Methanoliparia;Candidatus Methanoliparales;Candidatus Methanollivieraceae;Candidatus Methanolliviera
Candidatus_Methanolliviera_sp__GoM_oil--17980-18801-NZ_CABGHG010000016_1-WP_144144844_1 Archaea;Euryarchaeota;Candidatus Methanoliparia;Candidatus Methanoliparales;Candidatus Methanollivieraceae;
Candidatus_Methanolliviera_sp__GoM_asphalt--1732-2553-NZ_CABGHH010000049_1-WP_144183438_1   Archaea;Euryarchaeota;Candidatus Methanoliparia;Candidatus Methanoliparales;Candidatus Methanollivieraceae;
Candidatus_Methanolliviera_sp__GoM_asphalt--2351-3172-NZ_CABGHH010000072_1-WP_144183831_1   Archaea;Euryarchaeota;Candidatus Methanoliparia;Candidatus Methanoliparales;Candidatus Methanollivieraceae;
Methanobacterium_arcticum-M2-33199-34026-NZ_JQKN01000017_1-WP_048082155_1   Archaea;Euryarchaeota;Methanobacteria;Methanobacteriales;Methanobacteriaceae;Methanobacterium
Methanobacterium_bryantii-M_o_H_-83680-84507-NZ_LMVM01000041_1-WP_069582437_1   Archaea;Euryarchaeota;Methanobacteria;Methanobacteriales;Methanobacteriaceae;Methanobacterium
Methanobacterium_congolense--2009286-2010113-NZ_LT607756_1-WP_071907546_1   Archaea;Euryarchaeota;Methanobacteria;Methanobacteriales;Methanobacteriaceae;Methanobacterium
Methanobacterium_formicicum-BRM9-212141-212971-NZ_CP006933_1-WP_023991286_1 Archaea;Euryarchaeota;Methanobacteria;Methanobacteriales;Methanobacteriaceae;Methanobacterium

The sequences were placed in the reference tree using epa_ng.

When I run the command, this is the message that I get:

Started 2021-11-22 17:13:40

Found 1 jplace file
Reading file 1 of 1: kapilireftree-placement-out/epa_result.jplace
Running the assignment
Error: Unexpected end of input file (kapili-ppit-reference-taxonomy-forGappa-edited.txt) at 8877:1. Expected closing quotation mark.

terminate called after throwing an instance of 'std::runtime_error'
  what():  Unexpected end of input file (kapili-ppit-reference-taxonomy-forGappa-edited.txt) at 8877:1. Expected closing quotation mark.
Aborted (core dumped)

The taxonomy file that I am using has 8876 lines (taxon entries) in it.

What am I missing here?

Any help with this would be greatly appreciated.

regards, Dhwani Desai

dhwanidesai commented 2 years ago

Never mind! I figured out what was wrong. One entry in the taxonomy file had a " character in its ID. Correcting that entry resolved the error. Closing this issue.
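
For anyone who hits the same "Expected closing quotation mark" error, here is a minimal sketch of one way to locate such an entry. It assumes the culprit is a line with an odd number of " characters, which matches the case described here but may not cover every cause. The function name find_unbalanced_quotes is hypothetical and not part of gappa.

```python
import sys


def find_unbalanced_quotes(lines, quote='"'):
    """Return (line_number, line) pairs whose count of `quote` is odd.

    Assumption: a stray quote shows up as an odd number of quote
    characters on a single line of the taxonomy file.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if line.count(quote) % 2 == 1:
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # The file to scan is given as the first command-line argument,
    # e.g. the file passed to --taxon-file.
    with open(sys.argv[1], encoding="utf-8") as handle:
        for number, line in find_unbalanced_quotes(handle):
            print(f"{number}: {line}")
```

Saved under any name and run with the taxonomy file as its only argument, the script prints the line number and content of each suspicious entry. An empty output means no line has an unbalanced quote, so the problem would have to lie elsewhere.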