biocore / biom-format

The Biological Observation Matrix (BIOM) Format Project
http://biom-format.org

Exporting a tsv table and metadata file separately from a BIOM format file. #820

Closed matomoniwano closed 4 years ago

matomoniwano commented 5 years ago

To exchange contingency tables and the corresponding metadata with other researchers, I have been packaging a tsv table together with sample and observation metadata into a BIOM file using the biom add-metadata command.

However, I could not find a way to export the tsv table and its metadata SEPARATELY from the BIOM file. I would like two output files: a tsv table and a text file containing all the metadata that I added earlier (sample name, taxonomy, environmental data, and so on).

A related issue was raised in the past (/issues/632), but it has been 4 years since that issue was closed.

I was wondering whether there is any way to export the table and the metadata separately, or whether the developers are working on a reverse function for biom add-metadata.

Thanks

NOTE: I do not wish to use the convert options --header-key 'taxonomy' --tsv-metadata-formatter, since they merge the metadata into the table.

raissameyer commented 4 years ago

Hi BIOM dev team,

Running into the same problem. I'd really like to work with BIOM files but if one can't handle the components of the biom file separately I think it would impede exchangeability.

I tried pretty much the same thing that @matomoniwano described above.

Any advice would be greatly appreciated!

wasade commented 4 years ago

Hi Raissa, thank you for reaching out. We currently do not have a way to do this from the command line but it can be done through the Python API without too many steps. See here for an example of obtaining metadata on an axis in a pandas DataFrame.
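For readers following along, the Python API route mentioned above might look roughly like the sketch below. It assumes `biom.load_table`, `Table.to_tsv`, `Table.metadata` and `Table.metadata_to_dataframe` behave as described in the biom-format documentation. The helper names, output file names and the input path in the usage note are made up for illustration and are not part of biom-format.

```python
from pathlib import Path


def export_table_and_metadata(table, out_dir):
    """Write the count table and each axis's metadata to separate TSV files.

    `table` is expected to be a biom.Table (for example from biom.load_table).
    The output file names are arbitrary choices for this sketch.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}

    # Called without header_key, to_tsv() returns the counts only, as a string.
    table_path = out_dir / "table.tsv"
    table_path.write_text(table.to_tsv())
    written["table"] = table_path

    for axis in ("sample", "observation"):
        # Skip an axis that carries no metadata (metadata() returns None then).
        if table.metadata(axis=axis) is None:
            continue
        df = table.metadata_to_dataframe(axis)  # pandas DataFrame indexed by ID
        path = out_dir / f"{axis}_metadata.tsv"
        df.to_csv(path, sep="\t")
        written[axis] = path

    return written


def export_from_file(biom_path, out_dir):
    """Hypothetical convenience wrapper: load a BIOM file, then export it."""
    import biom  # imported here so the helper above works with any Table-like object

    return export_table_and_metadata(biom.load_table(biom_path), out_dir)
```

Calling `export_from_file("table_with_metadata.biom", "exported")` (both paths are placeholders) would then leave up to three files in `exported/`: the table, the sample metadata and the observation metadata.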

We would certainly welcome a pull request to expose this functionality! But we haven't been able to get to this particular task yet.


raissameyer commented 4 years ago

Hi @wasade thanks for the quick response, I'll check it out.

@pieterprovoost this is the issue we discussed in the call. Linking you to it in case you're interested in checking it out 😃

wasade commented 4 years ago

Great!! If there is interest in a PR, I would aim to review it rapidly and to cut a micro release soon after the merge.

pieterprovoost commented 4 years ago

@wasade Is it acceptable if I just wrap metadata_to_dataframe() in the CLI? I have a proof-of-concept here: https://github.com/pieterprovoost/biom-format/commit/c6a7d0f4192eebd9eadaec8c0bc389c755bb6622
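To illustrate the idea of wrapping `metadata_to_dataframe()` in a command-line entry point, here is a minimal sketch. It is not the code from the linked commit or from the eventual PR. The option names (`--input-fp`, `--axis`, `--output-fp`) are assumptions chosen for this example, and it uses the standard-library `argparse` rather than whatever CLI framework biom-format itself uses.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical metadata-export command."""
    parser = argparse.ArgumentParser(
        description="Export one axis's metadata from a BIOM table to TSV (sketch)."
    )
    parser.add_argument("-i", "--input-fp", required=True, help="input BIOM file")
    parser.add_argument(
        "--axis",
        choices=("sample", "observation"),
        default="sample",
        help="axis whose metadata should be exported",
    )
    parser.add_argument("-o", "--output-fp", required=True, help="output TSV file")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)

    import biom  # imported lazily so the parser can be built without biom installed

    table = biom.load_table(args.input_fp)
    # metadata_to_dataframe() returns a pandas DataFrame indexed by sample/observation ID.
    table.metadata_to_dataframe(args.axis).to_csv(args.output_fp, sep="\t")
```

In a real script, `main()` would be called from an `if __name__ == "__main__":` guard or registered as a console-script entry point.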

wasade commented 4 years ago

@pieterprovoost, yes! That's wonderful! Would you like to issue that as a PR?

pieterprovoost commented 4 years ago

Created PR https://github.com/biocore/biom-format/pull/826

wasade commented 4 years ago

Fixed in #826. Thanks again @pieterprovoost -- I'll see if we can issue a release this week.