andersen-lab / ivar

iVar is a computational package that contains functions broadly useful for viral amplicon-based sequencing.
https://andersen-lab.github.io/ivar/html/
GNU General Public License v3.0

Parsing GFF_FEATURE #174

Open jfouret opened 4 months ago

jfouret commented 4 months ago

Hi,

First, thank you for this tool. I would like to report a small issue, though it may not really be a bug. The parsing of the "gene" attribute does not seem to work as originally intended. As long as the feature has an ID attribute the output is still fine in the end, but see the code below to understand what I mean:

PS: It would be good to describe in the documentation exactly how GFF_FEATURE is parsed from the GFF file.

You are using the following code:

  std::vector<gff3_feature>::iterator it;
  char *ref_codon, *alt_codon;
  for (it = features.begin(); it != features.end(); it++) {
    fout << line_stream.str();
    // add in gene level info, control for case it's not present
    std::string gene = it->get_attribute("gene");
    if (gene.empty()) {
      fout << it->get_attribute("ID") << "\t";
    } else {
      fout << gene + ":" + it->get_attribute("ID") << "\t";
    }

Why is it not:

  std::vector<gff3_feature>::iterator it;
  char *ref_codon, *alt_codon;
  for (it = features.begin(); it != features.end(); it++) {
    fout << line_stream.str();
    // add in gene level info, control for case it's not present
    std::string gene = it->get_attribute("gene");
    if (gene.empty()) {
      fout << it->get_attribute("ID") << "\t";
    } else {
      fout << gene + ":" + gene << "\t";
    }
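
To make the difference between the two snippets concrete, here is a minimal, self-contained sketch of what each one would write to the GFF_FEATURE column. `Feature`, `current_label` and `proposed_label` are hypothetical names invented for this illustration and are not part of iVar. The sketch also assumes that `get_attribute` returns an empty string when the attribute is absent, which the `gene.empty()` check suggests but which I have not verified.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for iVar's gff3_feature; only attribute lookup is modeled.
struct Feature {
  std::map<std::string, std::string> attributes;

  // Assumption: a missing attribute yields an empty string.
  std::string get_attribute(const std::string &key) const {
    auto it = attributes.find(key);
    return it == attributes.end() ? std::string() : it->second;
  }
};

// Label as built by the code currently in use: "gene:ID", or "ID" alone
// when the feature has no gene attribute.
std::string current_label(const Feature &f) {
  std::string gene = f.get_attribute("gene");
  if (gene.empty()) {
    return f.get_attribute("ID");
  }
  return gene + ":" + f.get_attribute("ID");
}

// Label as built by the alternative asked about: "gene:gene", or "ID" alone
// when the feature has no gene attribute.
std::string proposed_label(const Feature &f) {
  std::string gene = f.get_attribute("gene");
  if (gene.empty()) {
    return f.get_attribute("ID");
  }
  return gene + ":" + gene;
}
```

For a feature with `ID=cds-1;gene=S` (made-up values), `current_label` gives `S:cds-1` and `proposed_label` gives `S:S`. For a feature with only `ID=cds-2`, both give `cds-2`.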