biothings / mygeneset.info


Enhancement: Allow export of genesets to various file formats (.csv, .gmt, etc.) #16

Closed ravila4 closed 1 year ago

ravila4 commented 3 years ago

GMT (Gene Matrix Transposed) is a standard file format used by gene set enrichment analysis software such as GSEA.

The GMT format is tab-delimited, with one geneset per row. Each row contains the geneset name, followed by a description, then a list of tab-separated gene IDs:

Name (tab) Description (tab) Gene (tab) Gene (tab)

Perhaps it would be useful for users to be able to export gene sets in this format with their identifiers of choice.
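
A minimal sketch of what such an export could look like, assuming each geneset is a dict with `name`, `description`, and a `genes` list whose entries carry one key per identifier type. The function name `to_gmt`, the `id_field` parameter, and the document shape are hypothetical and only for illustration; they are not the mygeneset.info implementation.

```python
def to_gmt(genesets, id_field="symbol"):
    """Serialize genesets to GMT text: name, description, then gene IDs, all tab-separated."""
    lines = []
    for geneset in genesets:
        # Skip genes that lack the requested identifier instead of writing empty columns.
        gene_ids = [str(gene[id_field]) for gene in geneset["genes"] if gene.get(id_field)]
        fields = [geneset["name"], geneset.get("description", ""), *gene_ids]
        # A literal tab inside a field would shift the columns, so replace it with a space.
        lines.append("\t".join(field.replace("\t", " ") for field in fields))
    return "".join(line + "\n" for line in lines)


# Illustrative input; the document shape is an assumption.
example = [
    {
        "name": "SET_A",
        "description": "example set",
        "genes": [
            {"symbol": "TP53", "ncbigene": "7157"},
            {"symbol": "BRCA1", "ncbigene": "672"},
        ],
    }
]

gmt_symbols = to_gmt(example)
gmt_entrez = to_gmt(example, id_field="ncbigene")
```

Passing a different `id_field` is how the "identifiers of choice" part would work in this sketch: the same geneset is written once per identifier type the user asks for.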

ravila4 commented 3 years ago

Other potentially useful formats are:

image_1

ravila4 commented 3 years ago

This feature is addressed on the frontend in this PR: https://github.com/biothings/mygeneset.info-website/pull/21

On the backend, Biothings can already return responses in several formats: json, yaml, html, and msgpack (code). Example: http://mygeneset.info/v1/query?q=_id:GO_0004568_9606&format=yaml
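
As a rough illustration, the example URL above can be assembled for any of the listed formats with the standard library. The helper name `build_query_url` and the `SUPPORTED_FORMATS` tuple are hypothetical; the tuple only mirrors the four formats named in this comment, and only the yaml URL is taken from the thread.

```python
from urllib.parse import urlencode

BASE_URL = "http://mygeneset.info/v1/query"
# Formats named in the comment above; not verified against the live API here.
SUPPORTED_FORMATS = ("json", "yaml", "html", "msgpack")


def build_query_url(query, fmt="json"):
    """Return a query URL that asks the API for the given response format."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # safe=":" keeps the "field:value" query readable instead of encoding it as %3A.
    return f"{BASE_URL}?{urlencode({'q': query, 'format': fmt}, safe=':')}"


yaml_url = build_query_url("_id:GO_0004568_9606", fmt="yaml")
```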

However, writing custom formatters for csv, tsv, and gmt file formats would likely be very specific to mygeneset, and not apply to the other biothings APIs.
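
To show why such a formatter would be specific to mygeneset, here is a sketch of a csv/tsv writer that flattens one geneset into one row per gene. It assumes the document has an `_id` and a `genes` list of dicts; the function name `geneset_to_delimited`, the `geneset_id` column, and the sample values are made up for illustration and are not part of the Biothings SDK.

```python
import csv
import io


def geneset_to_delimited(geneset, columns=("symbol", "ncbigene"), delimiter=","):
    """Flatten one geneset into delimited text with a header and one row per gene."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["geneset_id", *columns],
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    for gene in geneset["genes"]:
        # Missing identifiers become empty cells so every row has the same width.
        row = {"geneset_id": geneset["_id"]}
        row.update({column: gene.get(column, "") for column in columns})
        writer.writerow(row)
    return buffer.getvalue()


# Placeholder gene values; only the _id is taken from the example URL in this thread.
doc = {"_id": "GO_0004568_9606", "genes": [{"symbol": "GENE_A", "ncbigene": "1001"}]}

as_csv = geneset_to_delimited(doc)
as_tsv = geneset_to_delimited(doc, delimiter="\t")
```

The one-row-per-gene layout depends on the nested `genes` list, which is the part that would not carry over to other Biothings APIs with differently shaped documents.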

ravila4 commented 1 year ago

This is already implemented in the frontend. I don't think we have plans to implement it in the backend.