ropensci / taxa

taxonomic classes for R
https://docs.ropensci.org/taxa

When parsing classifications that have per-taxon info, add an input ID column #92

Closed: zachary-foster closed this issue 6 years ago

zachary-foster commented 6 years ago

For data like:

AY457915\tBacteria(100);Firmicutes(99);Clostridiales(99);Johnsonella_et_rel.(99);Johnsonella_et_rel.(99);Johnsonella_et_rel.(91);Eubacterium_eligens_et_rel.(89);Lachnospira_pectinoschiza(80);
AY457914\tBacteria(100);Firmicutes(100);Clostridiales(100);Johnsonella_et_rel.(100);Johnsonella_et_rel.(100);Johnsonella_et_rel.(95);Eubacterium_eligens_et_rel.(92);Eubacterium_eligens(84);Eubacterium_eligens(81);
AY457913\tBacteria(100);Firmicutes(100);Clostridiales(100);Johnsonella_et_rel.(100);Johnsonella_et_rel.(100);Roseoburia_et_rel.(97);Roseoburia_et_rel.(97);Eubacterium_ramulus_et_rel.(90);uncultured(90);
AY457912\tBacteria(100);Firmicutes(99);Clostridiales(99);Johnsonella_et_rel.(99);Johnsonella_et_rel.(99);
AY457911\tBacteria(100);Firmicutes(99);Clostridiales(98);Ruminococcus_et_rel.(96);Anaerofilum-Faecalibacterium(92);Faecalibacterium(92);Faecalibacterium_prausnitzii(90);

I get results like:

> result$data$class_data
# A tibble: 38 x 3
                          name score taxon_id
 *                       <chr> <chr>    <chr>
 1                    Bacteria   100        b
 2                  Firmicutes    99        c
 3               Clostridiales    99        d
 4         Johnsonella_et_rel.    99        e
 5         Johnsonella_et_rel.    99        g
 6         Johnsonella_et_rel.    91        i
 7 Eubacterium_eligens_et_rel.    89        l
 8   Lachnospira_pectinoschiza    80        o
 9                    Bacteria   100        b
10                  Firmicutes   100        c
# ... with 28 more rows

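Nothing in this table records which input line each row came from (row 9 above starts the second classification, but only the repeated "Bacteria" hints at that). An input_index column would make it possible to group rows by their source classification.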
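
To illustrate the idea, here is a minimal base R sketch of a per-taxon table that carries an input_index column. It does not use or reproduce the package's own parser: the function name `parse_with_input_index` and the `input_id` column are hypothetical, and the format is assumed to be exactly `id<TAB>Name(score);Name(score);...` as in the data above.

```r
# Hypothetical helper (not part of taxa): turn lines of the form
# "id<TAB>Name(score);Name(score);..." into one row per taxon,
# recording the source line number in `input_index`.
parse_with_input_index <- function(lines) {
  rows <- lapply(seq_along(lines), function(i) {
    # Field 1 is the sequence ID, field 2 is the classification string
    fields <- strsplit(lines[i], "\t", fixed = TRUE)[[1]]
    taxa <- strsplit(fields[2], ";", fixed = TRUE)[[1]]
    taxa <- taxa[nzchar(taxa)]  # guard against empty entries
    data.frame(
      input_index = rep(i, length(taxa)),
      input_id = rep(fields[1], length(taxa)),
      # Strip the trailing "(score)" to get the taxon name
      name = sub("\\([0-9]+\\)$", "", taxa),
      # Capture the number inside the trailing parentheses
      score = as.numeric(sub("^.*\\(([0-9]+)\\)$", "\\1", taxa)),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

lines <- c(
  "AY457912\tBacteria(100);Firmicutes(99);Clostridiales(99);Johnsonella_et_rel.(99);Johnsonella_et_rel.(99);",
  "AY457911\tBacteria(100);Firmicutes(99);Clostridiales(98);Ruminococcus_et_rel.(96);Anaerofilum-Faecalibacterium(92);Faecalibacterium(92);Faecalibacterium_prausnitzii(90);"
)
class_data <- parse_with_input_index(lines)

# Rows can now be grouped by the input line they came from
names_by_input <- split(class_data$name, class_data$input_index)
```

With such a column, repeated names like "Bacteria" stay distinguishable across inputs, and the per-taxon scores can be regrouped into their original classifications with `split()` or any grouping function.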