biocore / emperor

Emperor: a tool for the analysis and visualization of large microbial ecology datasets
http://biocore.github.io/emperor/

biplot parameter rename? #726

Open mortonjt opened 5 years ago

mortonjt commented 5 years ago

There has been a bit of confusion on how to generate multiomics biplots using mmvec.

This mainly boils down to the usage of sample-metadata-file and feature-metadata-file. These names get confusing with multiomics, since the file passed as sample-metadata-file is typically another kind of feature metadata.

The easiest solution that I can think of is to introduce another command called multiomics-biplot and rename the existing biplot command to sample-biplot. In this way, multiomics-biplot can be typed so that it can accept two feature-metadata-file arguments. Furthermore, we can change the argument names to something like point-feature-metadata-file and arrow-feature-metadata-file to reduce the confusion when trying to use this.

Thoughts? Also will be happy to contribute.

ElDeveloper commented 5 years ago

Thanks @mortonjt, this is a good idea. I agree that the names of the inputs could be more generic. Right now they are far too specific to the sample-vs-feature case. However, instead of going the route of having a multiomics-biplot, I would like to fix the biplot command to use a better naming convention. If we think about it from the perspective of q2cli and follow your recommendation, it would look like this:

# For the feature vs feature (mmvec) case
qiime emperor biplot \
--m-point-metadata metabolite-annotations.tsv \
--m-arrow-metadata microbe-metadata.tsv \
--i-ordination mmvec.results.qza \
--o-visualization biplot.qzv

# For the sample vs feature case
qiime emperor biplot \
--m-point-metadata sample-metadata.tsv \
--m-arrow-metadata feature-metadata.tsv \
--i-ordination pcoa-biplot.results.qza \
--o-visualization biplot.qzv

The code itself could check that the point metadata overlaps with the points in the ordination object, and likewise that the arrow metadata overlaps with the arrows in the ordination. I think that looks fairly straightforward. What do you think? Maybe we could refine the point and arrow names?
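As a rough illustration of the overlap check described above, here is a minimal sketch. The function name check_overlap and its arguments are hypothetical and not part of Emperor. It assumes the identifiers have already been pulled out of the metadata and the ordination (with scikit-bio's OrdinationResults these would presumably be the indices of the samples and features tables).

```python
def check_overlap(metadata_ids, ordination_ids, kind):
    """Return the identifiers shared by the metadata and the ordination.

    Hypothetical helper: `kind` is only used to build the error message
    and is expected to be "point" or "arrow".
    """
    shared = set(metadata_ids) & set(ordination_ids)
    if not shared:
        # No identifier in common: the metadata cannot describe these elements.
        raise ValueError(
            "None of the identifiers in the %s metadata match the %ss in "
            "the ordination." % (kind, kind))
    return shared


# Example with made-up identifiers: two of the three metadata rows match.
matched = check_overlap(["s1", "s2", "s3"], ["s2", "s3", "s4"], "point")
```

Whether a partial overlap should be accepted silently, produce a warning, or be rejected is a design choice this sketch leaves open.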

Another thing that might help is to check and raise a meaningful error if users accidentally inverted the order of the metadata. This could be checked based on an identifier match between the metadata files and the ordination. For example: "The point and arrow metadata seem to be in an incorrect order. The point metadata matches the arrows in your ordination, and the arrow metadata matches the points in your ordination." Or something to that effect.
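One possible way to detect the inverted order is sketched below. The name validate_biplot_metadata and all of its parameters are hypothetical. The sketch only assumes that four collections of identifiers are available: those of the two metadata files and those of the points and arrows in the ordination.

```python
def validate_biplot_metadata(point_md_ids, arrow_md_ids,
                             ord_point_ids, ord_arrow_ids):
    """Raise ValueError if the metadata does not match the ordination.

    Hypothetical helper: it distinguishes the "files were swapped" case
    from the "nothing matches" case so the message can say what to fix.
    """
    point_md, arrow_md = set(point_md_ids), set(arrow_md_ids)
    ord_points, ord_arrows = set(ord_point_ids), set(ord_arrow_ids)

    points_ok = bool(point_md & ord_points)
    arrows_ok = bool(arrow_md & ord_arrows)
    if points_ok and arrows_ok:
        return  # both files describe the elements they were passed for

    # Swapped: each file matches the *other* kind of element.
    if (point_md & ord_arrows) and (arrow_md & ord_points):
        raise ValueError(
            "The point and arrow metadata seem to be in an incorrect order. "
            "The point metadata matches the arrows in your ordination, and "
            "the arrow metadata matches the points in your ordination.")

    raise ValueError(
        "The point and/or arrow metadata do not match the identifiers in "
        "the ordination.")
```

With made-up identifiers, passing microbe IDs as point metadata and metabolite IDs as arrow metadata to an ordination whose points are metabolites would hit the swapped branch and raise the first error.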

mortonjt commented 5 years ago

I think renaming to --m-point-metadata and --m-arrow-metadata makes a lot of sense. And checking for the point / arrow metadata match is a good idea!