qiime2 / q2-metadata

IMP: add `merge` action #11

Closed ElDeveloper closed 1 year ago

ElDeveloper commented 6 years ago

Just like merge_mapping_files.py.

ebolyen commented 6 years ago

Check out: https://docs.qiime2.org/2017.9/tutorials/metadata/#combining-metadata 😀

ElDeveloper commented 6 years ago

Great, but I don't think this solves the problem. I would like to be able to generate the merged file from the command line or the Python API. Otherwise I need to download the artifact, open it in the browser, download the file, and re-upload it to the cluster.

thermokarst commented 6 years ago

Oh! The metadata merging isn't specific to q2-metadata! Anywhere on the CLI that supports metadata also supports variadic metadata files, and they will all get merged!

qiime emperor plot \
  --i-pcoa unweighted_unifrac_pcoa_results.qza \
  --m-metadata-file sample-metadata.tsv \
  --m-metadata-file faith_pd_vector.qza \
  --o-visualization unweighted-unifrac-emperor-with-alpha.qzv

Maybe I am misunderstanding exactly what your problem is? Please reopen if that is the case!

ElDeveloper commented 6 years ago

That is a very useful feature, but what I really want is to have only one mapping file representing my dataset. In this case I am combining previous analyses, and I want to combine their metadata. Following your example above, I have sample-metadata-foo.tsv, sample-metadata-bar.tsv, and sample-metadata-baz.tsv, all of which describe different samples. While it is indeed possible to pass --m-metadata-file N times for the N files, it would be most useful to deal with only one.

PS - I can't reopen, button is not there ¯\_(ツ)_/¯

jairideout commented 6 years ago

This sounds like a useful feature! qiime2 can't currently output metadata files, but that sounds like a feasible change to make in the framework. After that, q2-metadata could have a merge command.

nbokulich commented 6 years ago

forum xref

gregcaporaso commented 1 year ago

As of 2023.5, we have an ImmutableMetadata semantic type in q2-types, which effectively allows actions to output metadata. ImmutableMetadata artifacts can be used anywhere that metadata can be used, and can be exported to plain, old (mutable) metadata.

To support the most common feature requests that have been brought up, we should add a merge action to q2-metadata which takes an arbitrary number of metadata artifacts as input and outputs an ImmutableMetadata artifact. This could support inner and outer joins on sample ids.

To keep it simple initially, merge will fail when the inputs have both overlapping column names and overlapping ids. In other words, if you have overlapping ids, you can't have overlapping column names, since there is potential for conflict in values (ultimately we can figure out how to handle that, but that's a bigger project). If you don't have overlapping ids, you can have overlapping column names (EDIT: this latter case is what we cover in https://github.com/qiime2/qiime2/issues/633#issuecomment-1410864878).
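
To make the rule concrete, here is a minimal sketch in plain Python. It is not the q2-metadata or qiime2.Metadata implementation. The function name merge_metadata, the dict-of-dicts representation of a metadata table, and filling missing values with None are all assumptions made for illustration only.

```python
def merge_metadata(tables, how="outer"):
    """Hypothetical helper: merge tables shaped {sample_id: {column: value}}.

    Fails when any two tables share BOTH ids and column names, which is
    the conservative rule described above. Not qiime2's actual code.
    """
    if how not in ("inner", "outer"):
        raise ValueError("how must be 'inner' or 'outer'")
    if not tables:
        raise ValueError("at least one table is required")

    def columns(table):
        # Union of column names across all rows of one table.
        return {col for row in table.values() for col in row}

    # Reject any pair of tables that overlaps in ids and in columns.
    for i, left in enumerate(tables):
        for right in tables[i + 1:]:
            shared_ids = left.keys() & right.keys()
            shared_cols = columns(left) & columns(right)
            if shared_ids and shared_cols:
                raise ValueError(
                    "overlapping ids %s and columns %s"
                    % (sorted(shared_ids), sorted(shared_cols))
                )

    # Inner join keeps ids present in every table; outer join keeps all ids.
    id_sets = [set(table) for table in tables]
    ids = set.intersection(*id_sets) if how == "inner" else set.union(*id_sets)

    # Preserve the order in which columns are first seen.
    all_columns = []
    for table in tables:
        for row in table.values():
            for col in row:
                if col not in all_columns:
                    all_columns.append(col)

    merged = {}
    for sample_id in sorted(ids):
        row = {col: None for col in all_columns}  # None marks a missing value
        for table in tables:
            row.update(table.get(sample_id, {}))
        merged[sample_id] = row
    return merged
```

With this sketch, tables that describe different samples can share column names, and tables that describe the same samples can contribute new columns. A table that repeats both a sample id and a column name from another table raises ValueError.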

Addressing this will touch the framework as well, adding this functionality to qiime2.Metadata.merge.

gregcaporaso commented 1 year ago

Another forum x-ref.