KoslickiLab / YACHT

A mathematically characterized hypothesis test for organism presence/absence in a metagenome
MIT License

Output format needs attention #13

Closed dkoslicki closed 1 year ago

dkoslicki commented 1 year ago
  1. Currently, the output is a CSV file with one row per training organism. This is great for diagnostic purposes, but not for an end user (who may be using a very large database). The default behavior should be to return only the organisms actually predicted to be in the sample.
  2. Occasionally, the genome_name field is blank. That's a bug.
  3. We should have options for popular output formats, such as the CAMI profiling format, the BIOM format, something compatible with GraPhlAn, etc.
  4. The columns in the output were mainly for our internal use. We need to decide what we want the end user to see and how to present and name it.

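To make point 1 concrete, here is a minimal sketch of the proposed default: read the full diagnostic CSV and keep only the rows flagged as present. The column name `in_sample_est`, the helper `filter_present`, and the toy data are assumptions for illustration, not the actual YACHT output schema or API.

```python
import csv
import io


def filter_present(rows, flag_column="in_sample_est"):
    """Keep only rows whose flag column marks the organism as present.

    `flag_column` is a hypothetical column name; adjust it to whatever
    the real output uses for the presence/absence call.
    """
    truthy = {"true", "1", "yes"}
    return [
        row for row in rows
        if str(row.get(flag_column, "")).strip().lower() in truthy
    ]


# Toy stand-in for the full diagnostic CSV (one row per training organism).
full_output = io.StringIO(
    "genome_name,in_sample_est\n"
    "organism_A,True\n"
    "organism_B,False\n"
    "organism_C,True\n"
)

rows = list(csv.DictReader(full_output))
present = filter_present(rows)

# Only the organisms predicted to be in the sample remain.
print([row["genome_name"] for row in present])
```

The full per-organism table could still be written behind an opt-in flag for diagnostic use, so the filtering only changes the default.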
chunyuma commented 1 year ago

Points 1-3 have been resolved in https://github.com/KoslickiLab/YACHT-reproducibles.

dkoslicki commented 1 year ago

That's great to hear, @chunyuma. Please be sure those fixes make their way to this main repo. Ideally, the reproducibles repo will use the code contained in this one (i.e., this is the authoritative YACHT repo).

chunyuma commented 1 year ago

It's done, so I'm closing this issue.