Closed ElDeveloper closed 1 year ago
Great, but I don't think this solves the problem. I would like to be able to generate the merged file from the command line or the Python API. Otherwise I need to download the artifact, open it in the browser, download the file, and re-upload it to the cluster.
Oh! The metadata merging isn't specific to q2-metadata! Anywhere on the CLI that supports metadata supports variadic metadata files; they will all get merged!
qiime emperor plot \
--i-pcoa unweighted_unifrac_pcoa_results.qza \
--m-metadata-file sample-metadata.tsv \
--m-metadata-file faith_pd_vector.qza \
--o-visualization unweighted-unifrac-emperor-with-alpha.qzv
Maybe I am misunderstanding exactly what your problem is? Please reopen if that is the case!
That is a very useful feature, but what I really want is to have only one mapping file representing my dataset. In this case I am combining previous analyses, and I want to combine their metadata. Following your example above, I have sample-metadata-foo.tsv, sample-metadata-bar.tsv, and sample-metadata-baz.tsv, all of which describe different samples. While it is indeed possible to pass --m-metadata-file N times for the N files, it would be most useful to deal with only one.
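In the meantime, the "different samples, one file" case can be sketched outside of QIIME 2 with the Python standard library. This is only an illustration, not anything q2-metadata provides: `concat_metadata` and `write_metadata` are hypothetical helper names, and the sketch assumes plain TSV files whose first column is the sample id, with no comment lines or `#q2:types` directives.

```python
import csv


def concat_metadata(sources):
    """Concatenate TSV metadata tables that describe different samples.

    `sources` is an iterable of open text files (or any iterables of lines).
    Returns (id_header, columns, rows), where rows maps sample id -> record.
    """
    id_header = None
    columns = []  # union of column names, in first-seen order
    rows = {}     # sample id -> {column name: value}
    for source in sources:
        reader = csv.DictReader(source, delimiter="\t")
        id_name, *other_columns = reader.fieldnames
        if id_header is None:
            id_header = id_name  # keep the id header of the first file
        for column in other_columns:
            if column not in columns:
                columns.append(column)
        for record in reader:
            sample_id = record.pop(id_name)
            if sample_id in rows:
                # This sketch only handles files with disjoint sample ids.
                raise ValueError(f"duplicate sample id: {sample_id}")
            rows[sample_id] = record
    return id_header, columns, rows


def write_metadata(id_header, columns, rows, dest):
    """Write the combined table as TSV; missing values become empty cells."""
    writer = csv.writer(dest, delimiter="\t", lineterminator="\n")
    writer.writerow([id_header, *columns])
    for sample_id, record in rows.items():
        writer.writerow([sample_id, *(record.get(c, "") for c in columns)])
```

With the three files above, one would open each of them, pass the open files to `concat_metadata`, and write the result to a single new TSV with `write_metadata`.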
PS - I can't reopen, button is not there ¯\_(ツ)_/¯
This sounds like a useful feature! qiime2 can't currently output metadata files, but that sounds like a feasible change to make in the framework. After that, q2-metadata could have a merge command.
As of 2023.5, we have an ImmutableMetadata semantic type in q2-types, which effectively allows actions to output metadata. ImmutableMetadata artifacts can be used anywhere that metadata can be used, and can be exported to plain, old (mutable) metadata.
To support the most common feature requests that have been brought up, we should add a merge action to q2-metadata which takes an arbitrary number of metadata artifacts as input and outputs an ImmutableMetadata artifact. This could support inner and outer joins on sample ids.
To keep it simple initially, merge will fail when there are both overlapping column names and overlapping ids. In other words, if you have overlapping ids, you can't have overlapping column names (since there is potential for conflict in values; ultimately we can figure out how to handle that, but that's a bigger project), but if you don't have overlapping ids you can have overlapping column names (EDIT: this latter case is what we cover in https://github.com/qiime2/qiime2/issues/633#issuecomment-1410864878).
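To make the proposed rule concrete, here is a rough sketch in plain Python. It is not the q2-metadata or framework implementation: `merge_metadata` is a hypothetical name, each metadata table is modeled as a dict mapping sample id to a `{column: value}` dict, and the overlap check is done pairwise, which is one possible reading of the rule above.

```python
def merge_metadata(tables, how="outer"):
    """Merge metadata tables given as {sample id: {column: value}} dicts.

    Fails if any two tables share both sample ids and column names, since
    their values could conflict. `how` selects an inner or outer join on ids.
    """
    if how not in ("inner", "outer"):
        raise ValueError(f"unknown join type: {how!r}")
    if not tables:
        raise ValueError("at least one table is required")

    id_sets = [set(table) for table in tables]
    column_sets = [{c for row in table.values() for c in row}
                   for table in tables]

    # Reject the ambiguous case: overlapping ids AND overlapping columns.
    for i in range(len(tables)):
        for j in range(i + 1, len(tables)):
            if id_sets[i] & id_sets[j] and column_sets[i] & column_sets[j]:
                raise ValueError(
                    "tables share both sample ids and column names")

    if how == "inner":
        keep = set.intersection(*id_sets)
    else:
        keep = set.union(*id_sets)

    # Preserve first-seen order for columns and sample ids.
    columns = []
    for table in tables:
        for row in table.values():
            for column in row:
                if column not in columns:
                    columns.append(column)

    merged = {}
    for table in tables:
        for sample_id in table:
            if sample_id in keep and sample_id not in merged:
                row = dict.fromkeys(columns)  # missing values stay None
                for other in tables:
                    row.update(other.get(sample_id, {}))
                merged[sample_id] = row
    return merged
```

For example, merging a table of sites with a table of alpha diversity values for a subset of the same samples works under either join type, whereas merging two tables that both define a `site` column for sample `S1` raises a `ValueError`.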
Addressing this will touch the framework as well, adding this functionality to qiime2.Metadata.merge.
Another forum x-ref.
Just like merge_mapping_files.py.