Arcadia-Science / sourmashconsumr

Working with the outputs of sourmash in R
https://arcadia-science.github.io/sourmashconsumr/

functions to read the outputs of sourmash #4

Closed taylorreiter closed 2 years ago

taylorreiter commented 2 years ago

First three addressed in #5

taylorreiter commented 2 years ago

For read_signature_csv, I was planning to wait until sourmash-bio/sourmash#1098 is done (see link above) before implementing it, but I think I can move ahead without it. One flavor of upset plots and the rarefaction curves both rely on signatures in CSV format, and I'd like to get these implemented sooner rather than later. So, I can either:

  1. Write the code around my own sig_to_csv.py scripts with lots of tests, and swap out my scripts once sourmash-bio/sourmash#1098 is done.
  2. Try to write a JSON parser that reads signatures into data frames (or tibbles). It needs to handle signatures with and without abundances, with one or multiple k-mer sizes, etc.

I'll probably try for option 2 first and see how difficult it is.
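
As a rough illustration of what option 2 involves, here is a minimal Python sketch (not the package's R implementation) that flattens signature JSON into one row per hash. It assumes the layout sourmash signature files commonly use: a top-level list of records, each with a `signatures` list whose entries carry `ksize`, `molecule`, `mins`, and an optional `abundances` list. The function name `signature_json_to_rows` and the example data are hypothetical.

```python
import json


def signature_json_to_rows(text):
    """Flatten signature JSON into one row (dict) per sketch and hash.

    Assumed layout: a list of records, each holding a "signatures" list
    with "ksize", "molecule", "mins", and optionally "abundances".
    """
    records = json.loads(text)
    rows = []
    for record in records:
        for sketch in record.get("signatures", []):
            mins = sketch.get("mins", [])
            # Abundances are optional; fill with None when they are absent.
            abundances = sketch.get("abundances") or [None] * len(mins)
            for hash_value, abundance in zip(mins, abundances):
                rows.append(
                    {
                        "name": record.get("name"),
                        "filename": record.get("filename"),
                        "ksize": sketch.get("ksize"),
                        "molecule": sketch.get("molecule"),
                        "hash": hash_value,
                        "abundance": abundance,
                    }
                )
    return rows


# Made-up example: one record, two k-mer sizes, only the first with abundances.
example = json.dumps(
    [
        {
            "name": "sample_a",
            "filename": "sample_a.fq.gz",
            "signatures": [
                {"ksize": 21, "molecule": "DNA",
                 "mins": [1, 2, 3], "abundances": [4, 1, 2]},
                {"ksize": 31, "molecule": "DNA", "mins": [10, 20]},
            ],
        }
    ]
)
rows = signature_json_to_rows(example)
```

Each row repeats the record-level metadata, so the result maps directly onto a long-format data frame that can be filtered by `ksize` or grouped by `name`.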

taylorreiter commented 2 years ago

See #16. It was much simpler than I thought it would be to make read_signature a function, and I wish I had done it years ago.

taylorreiter commented 2 years ago

I'm not sure it's worth writing a function for read_signature_describe, as that output is just a CSV. Going to close this issue for now since the other stuff is complete!