biocore / empress

A fast and scalable phylogenetic tree viewer for microbiome data analysis
BSD 3-Clause "New" or "Revised" License

should --i-feature-table be optional? #175

Closed antgonza closed 4 years ago

antgonza commented 4 years ago

In my current use case, I don't need the feature table, only the rep-set and the rep-set metadata, so I'm wondering whether --i-feature-table should be optional. Thoughts?

fedarko commented 4 years ago

We discussed this just this morning. With the work @ElDeveloper has done in #174 on defining an Empress "object" (#140), it will be a lot easier to make new visualizer commands. IIRC we settled on having at minimum two options, something like:

  1. qiime empress community-tree-plot, which requires at minimum a tree, table, and sample metadata (analogous to how Empress currently works, where the focus is on showing how a given dataset looks overlaid on a tree)

  2. qiime empress tree-plot, which would require only a tree

In both of the above cases, feature metadata would be optional. (Visualizer names are obvs tentative.)

The motivation for splitting this into multiple visualizers, rather than just one visualizer with an optional table input, is that having the table would also require having sample metadata (to make the table useful).
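
A rough sketch of what that split could look like at the Python level. This assumes plain-Python stand-ins for the real QIIME 2 inputs; the function names, argument names, and return values are hypothetical and are not the plugin's actual API:

```python
from typing import Dict, List, Optional

# Hypothetical stand-ins for the real inputs: a tree is a list of tip names,
# a table maps sample ID -> {feature ID: count}, and metadata maps an ID to
# a dict of column values.
Tree = List[str]
Table = Dict[str, Dict[str, int]]
Metadata = Dict[str, Dict[str, str]]


def tree_plot(tree: Tree, feature_metadata: Optional[Metadata] = None) -> dict:
    """Tree-only visualizer: nothing besides the tree is required."""
    return {
        "tips": list(tree),
        "feature_metadata": feature_metadata or {},
    }


def community_plot(
    tree: Tree,
    table: Table,
    sample_metadata: Metadata,
    feature_metadata: Optional[Metadata] = None,
) -> dict:
    """Community visualizer: the table and the sample metadata are both
    required, so a caller cannot supply one without the other."""
    plot = tree_plot(tree, feature_metadata)
    plot["table"] = table
    plot["sample_metadata"] = sample_metadata
    return plot
```

With this shape, leaving out the sample metadata fails at the call itself (a missing required argument), so no extra validation code is needed.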

ElDeveloper commented 4 years ago

This was brought up again as potentially being useful for visualization in COVID-19-related workflows in other labs.

cc @ebolyen

fedarko commented 4 years ago

There's a prototype version of this available in this branch. Current interface (subject to change) has two visualizers, as described above:


$ qiime empress
Usage: qiime empress [OPTIONS] COMMAND [ARGS]...

  Description: This QIIME 2 plugin wraps Empress and supports interactive
  visualization of phylogenetic trees.

  Plugin website: http://github.com/biocore/empress

  Getting user support: Please post to the QIIME 2 forum for help with this
  plugin: https://forum.qiime2.org

Options:
  --version    Show the version and exit.
  --citations  Show citations and exit.
  --help       Show this message and exit.

Commands:
  community-plot  Visualize phylogenies and community data with Empress (and,
                  optionally, Emperor)
  tree-plot       Visualize phylogenies with Empress

$ qiime empress tree-plot --help
Usage: qiime empress tree-plot [OPTIONS]

  Generates an interactive phylogenetic tree visualization supporting
  interaction with feature metadata.

Inputs:
  --i-tree ARTIFACT    The phylogenetic tree to visualize.
    Phylogeny[Rooted]                                               [required]
Parameters:
  --m-feature-metadata-file METADATA...
    (multiple          Feature metadata. Can be used to color nodes (tips
     arguments will    and/or internal nodes) in the tree. Features described
     be merged)        in the metadata that are not present in the tree will
                       be automatically filtered out of the visualization.
                                                                    [optional]
Outputs:
  --o-visualization VISUALIZATION
                                                                    [required]
Miscellaneous:
  --output-dir PATH    Output unspecified results to a directory
  --verbose / --quiet  Display verbose output to stdout and/or stderr during
                       execution of this action. Or silence output if
                       execution is successful (silence is golden).
  --citations          Show citations and exit.
  --help               Show this message and exit.

$ qiime empress community-plot --help
Usage: qiime empress community-plot [OPTIONS]

  Generates an interactive phylogenetic tree visualization supporting
  interaction with sample and feature metadata and, optionally, Emperor
  integration.

Inputs:
  --i-tree ARTIFACT    The phylogenetic tree to visualize.
    Phylogeny[Rooted]                                               [required]
  --i-feature-table ARTIFACT FeatureTable[Frequency]
                       A table containing the abundances of features within
                       samples. This information allows us to decorate the
                       phylogeny by sample metadata. It's expected that all
                       features in the table are also present as tips in the
                       tree, and that all samples in the table are also
                       present in the sample metadata file.         [required]
  --i-pcoa ARTIFACT    Principal coordinates matrix to display simultaneously
    PCoAResults        with the phylogenetic tree using Emperor.    [optional]
Parameters:
  --m-sample-metadata-file METADATA...
    (multiple          Sample metadata. Can be used to color tips in the tree
     arguments will    by the samples they are unique to. Samples described in
     be merged)        the metadata that are not present in the feature table
                       will be automatically filtered out of the
                       visualization.                               [required]
  --m-feature-metadata-file METADATA...
    (multiple          Feature metadata. Can be used to color nodes (tips
     arguments will    and/or internal nodes) in the tree. Features described
     be merged)        in the metadata that are not present in the tree will
                       be automatically filtered out of the visualization.
                                                                    [optional]
  --p-ignore-missing-samples / --p-no-ignore-missing-samples
                       This will suppress the error raised when the feature
                       table contains samples that are not present in the
                       sample metadata. Samples without metadata are included
                       in the visualization by setting all of their metadata
                       values to "This sample has no metadata". Note that this
                       flag will only be applied if at least one sample is
                       present in both the feature table and the metadata.
                                                              [default: False]
  --p-filter-extra-samples / --p-no-filter-extra-samples
                       This will suppress the error raised when samples in
                       the feature table are not included in the ordination.
                       These samples will be will be removed from the
                       visualization if this flag is passed. Note that this
                       flag will only be applied if at least one sample in the
                       table is also present in the ordination.
                                                              [default: False]
  --p-filter-missing-features / --p-no-filter-missing-features
                       This will suppress the error raised when the feature
                       table contains features that are not present as tips in
                       the tree. These features will be removed from the
                       visualization if this flag is passed. Note that this
                       flag will only be applied if at least one feature in
                       the table is also present as a tip in the tree.
                                                              [default: False]
  --p-number-of-features INTEGER
    Range(1, None)     The number of most important features (arrows) to
                       display in the ordination. "Importance" is calculated
                       for each feature based on the vector’s magnitude
                       (euclidean distance from origin). Note, this parameter
                       is only honored when a biplot is inputed.  [default: 5]
  --p-filter-unobserved-features-from-phylogeny /
  --p-no-filter-unobserved-features-from-phylogeny
                       If this flag is passed, filters features from the
                       phylogeny that are not present as features in feature
                       table. Default is True.                 [default: True]
Outputs:
  --o-visualization VISUALIZATION
                                                                    [required]
Miscellaneous:
  --output-dir PATH    Output unspecified results to a directory
  --verbose / --quiet  Display verbose output to stdout and/or stderr during
                       execution of this action. Or silence output if
                       execution is successful (silence is golden).
  --citations          Show citations and exit.
  --help               Show this message and exit.
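
To make the --p-filter-missing-features description above concrete, here is a minimal sketch of the matching behavior the help text describes. It assumes the table is a plain dict keyed by feature ID and the tree is a set of tip names; the function name and error messages are made up for illustration and this is not the actual Empress code:

```python
from typing import Dict, Set


def match_table_to_tree(
    table: Dict[str, Dict[str, int]],
    tree_tips: Set[str],
    filter_missing_features: bool = False,
) -> Dict[str, Dict[str, int]]:
    """Keep only table features that are tips in the tree.

    `table` maps feature ID -> {sample ID: count} (a hypothetical layout).
    """
    table_features = set(table)
    shared = table_features & tree_tips

    # The flag only applies if at least one feature is shared; with no
    # overlap at all, this is always an error.
    if not shared:
        raise ValueError("No features in the table are tips in the tree.")

    missing = table_features - tree_tips
    if missing and not filter_missing_features:
        raise ValueError(
            f"{len(missing)} feature(s) in the table are not tips in the tree."
        )

    # Either nothing was missing, or the caller asked to drop the extras.
    return {fid: counts for fid, counts in table.items() if fid in shared}
```

For example, a table with features f1 and f2 checked against a tree whose only tip is f1 raises an error by default, and returns just the f1 row when filter_missing_features=True.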

ElDeveloper commented 4 years ago

Fairly personal opinion: after seeing this in a more tangible form, I think keeping just qiime empress plot, with the feature table as an optional argument, makes more sense. Other thoughts?


Thanks for putting this together!

fedarko commented 4 years ago

Thanks!

I would really prefer to keep this split up into two separate commands, because making the feature table / sample metadata optional would mean that it'd be possible for the user to pass one of those but not both of them. This seems to me like it would confuse users.

Without the table the sample metadata isn't useful (no way to map the sample metadata groups to the tree tips), and without the sample metadata the table isn't useful (no way to interpret the meanings of samples, unless I guess we're showing an Emperor plot with sample metadata -- in which case the sample metadata should have been passed to Empress in the first place).

We could keep everything as one command and then add some code to raise an error if the user provides only one of the table and sample metadata, but honestly that seems like it would complicate the interface too much.
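
For reference, the single-command alternative would need a check along these lines. This is a minimal sketch, assuming a hypothetical helper called at the top of a combined visualizer; the name, arguments, and error message are illustrative and not code from the Empress repository:

```python
from typing import Optional


def check_table_and_sample_metadata(
    table: Optional[object],
    sample_metadata: Optional[object],
) -> bool:
    """Return True if community data was provided, False for a tree-only plot.

    Raises ValueError if exactly one of the two inputs was provided, since
    neither is useful without the other.
    """
    has_table = table is not None
    has_metadata = sample_metadata is not None

    if has_table != has_metadata:
        given = "feature table" if has_table else "sample metadata"
        absent = "sample metadata" if has_table else "feature table"
        raise ValueError(
            f"A {given} was provided without a {absent}; "
            "pass both or neither."
        )
    return has_table
```

The check is short, but the user would still see two inputs marked optional in --help and only learn that they are coupled from the error.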

ElDeveloper commented 4 years ago

Thanks, I had forgotten about the question of how to handle optional sample metadata. Yes, this is very confusing in other plugins, like qiime feature-table filter.....

Thanks!

Yoshiki.
