HumanCellAtlas / ontology

Add terms for file formats #35

Closed malloryfreeberg closed 5 years ago

malloryfreeberg commented 5 years ago

New terms to add to EDAM under Format:

1. Zarr

Preferred term label

Zarr

Synonyms

None

Textual definition

The Zarr format is an implementation of chunked, compressed, N-dimensional arrays for storing data. (citation)

Suggested parent term

Binary format

Use case

Describing files produced by the data processing pipelines.
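
To make "chunked, compressed, N-dimensional arrays" concrete, here is a minimal conceptual sketch using only the Python standard library. It is not the Zarr library's API or its on-disk layout: the function names, the `"row.col"` chunk keys, and the use of zlib are assumptions made for illustration. It shows the idea that each tile is compressed separately, so reading one element only requires decompressing one tile.

```python
import zlib
from array import array


def store_chunked(rows, chunk_shape):
    """Split a 2-D list of floats into tiles and compress each tile separately."""
    n_rows, n_cols = len(rows), len(rows[0])
    c_rows, c_cols = chunk_shape
    store = {}
    for r0 in range(0, n_rows, c_rows):
        for c0 in range(0, n_cols, c_cols):
            tile = array("d")
            for row in rows[r0:r0 + c_rows]:
                tile.extend(row[c0:c0 + c_cols])
            # Hypothetical key scheme: chunk index along each axis, dot-separated.
            key = f"{r0 // c_rows}.{c0 // c_cols}"
            store[key] = zlib.compress(tile.tobytes())
    return store


def read_value(store, chunk_shape, shape, r, c):
    """Decompress only the tile that contains element (r, c)."""
    c_rows, c_cols = chunk_shape
    key = f"{r // c_rows}.{c // c_cols}"
    tile = array("d")
    tile.frombytes(zlib.decompress(store[key]))
    # In this sketch, tiles on the right edge may be narrower than chunk_shape.
    tile_cols = min(c_cols, shape[1] - (c // c_cols) * c_cols)
    return tile[(r % c_rows) * tile_cols + (c % c_cols)]


# Usage: a 3 x 5 matrix stored as 2 x 2 tiles.
matrix = [[float(5 * i + j) for j in range(5)] for i in range(3)]
chunks = store_chunked(matrix, (2, 2))
value = read_value(chunks, (2, 2), (3, 5), 1, 3)
```

Because every chunk is an opaque compressed byte string, this supports "Binary format" as the parent term.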

2. MTX

Preferred term label

MTX

Synonyms

None

Textual definition

The Matrix Market matrix (MTX) format stores numerical or pattern matrices in a dense (array format) or sparse (coordinate format) representation. (citation)

Data represented: http://edamontology.org/data_3112|http://edamontology.org/data_2535

Suggested parent term

Textual format

Use case

Describing files produced by the data processing pipelines and/or matrix service.
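
For reference, a minimal sketch of the sparse (coordinate) representation, written with the Python standard library only. The function names are hypothetical, and the sketch assumes the `real general` variant only: no pattern, integer, complex, or symmetric matrices. It shows why "Textual format" fits as the parent: a banner line, a size line (`rows cols nonzeros`), then one `row col value` line per stored entry, with 1-based indices.

```python
def write_mtx_coordinate(entries, n_rows, n_cols):
    """Serialize {(row, col): value} (0-based indices) as Matrix Market coordinate text."""
    lines = [
        "%%MatrixMarket matrix coordinate real general",
        f"{n_rows} {n_cols} {len(entries)}",
    ]
    for (r, c), v in sorted(entries.items()):
        # Matrix Market indices are 1-based.
        lines.append(f"{r + 1} {c + 1} {v}")
    return "\n".join(lines) + "\n"


def read_mtx_coordinate(text):
    """Parse coordinate-format text back into ({(row, col): value}, (n_rows, n_cols))."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    banner = lines[0].split()
    if banner[0] != "%%MatrixMarket" or [t.lower() for t in banner[1:3]] != ["matrix", "coordinate"]:
        raise ValueError("not a Matrix Market coordinate file")
    # Lines starting with '%' after the banner are comments.
    body = [ln for ln in lines[1:] if not ln.startswith("%")]
    n_rows, n_cols, nnz = map(int, body[0].split())
    entries = {}
    for ln in body[1:1 + nnz]:
        r, c, v = ln.split()
        entries[(int(r) - 1, int(c) - 1)] = float(v)
    return entries, (n_rows, n_cols)


# Usage: a 3 x 2 matrix with two non-zero entries.
sparse = {(0, 0): 1.5, (2, 1): 4.0}
mtx_text = write_mtx_coordinate(sparse, 3, 2)
parsed, shape = read_mtx_coordinate(mtx_text)
```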

3. Loom

Preferred term label

Loom

Synonyms

None

Textual definition

The Loom file format is based on HDF5, a standard for storing large numerical datasets. The Loom format is designed to efficiently hold large omics datasets. Typically, such data takes the form of a large matrix of numbers, along with metadata for the rows and columns. (citation)

Data represented: http://edamontology.org/data_3112|http://edamontology.org/data_2535

Suggested parent term

Binary format

Possibly under HDF5 instead, since Loom is built on HDF5?

Use case

Describing files produced by the data processing pipelines and/or matrix service.
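
On the parent-term question: since Loom is based on HDF5, a Loom file should start with the HDF5 format signature like any other HDF5 file. The sketch below is an illustration only, with a hypothetical function name and an arbitrary scan limit. It checks for the 8-byte signature at offset 0 and at the doubling offsets (512, 1024, 2048, ...) where the HDF5 specification allows the superblock to begin. It cannot tell Loom apart from other HDF5 files.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path, max_offset=1 << 20):
    """Return True if the HDF5 signature is found at an allowed superblock offset."""
    with open(path, "rb") as fh:
        offset = 0
        # max_offset is an arbitrary cut-off chosen for this sketch.
        while offset <= max_offset:
            fh.seek(offset)
            if fh.read(8) == HDF5_SIGNATURE:
                return True
            # Allowed offsets are 0, 512, 1024, 2048, ...
            offset = 512 if offset == 0 else offset * 2
    return False
```

A check like this supports classifying Loom as a binary format and, more specifically, as a specialisation of HDF5.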

daniwelter commented 5 years ago

https://github.com/edamontology/edamontology/issues/404
https://github.com/edamontology/edamontology/issues/405
https://github.com/edamontology/edamontology/issues/406