Python and R SOMA APIs using TileDB’s cloud-native format. Ideal for single-cell data at any scale.
MIT License

TileDB-SOMA

SOMA – for “Stack Of Matrices, Annotated” – is a flexible, extensible, open-source API that enables access to data in a variety of formats. The driving use case for SOMA is single-cell data in the form of annotated matrices, where observations are frequently cells and features are genes, proteins, or genomic regions.

The TileDB-SOMA package is a C++ library with APIs in Python and R, using TileDB Embedded to implement the SOMA specification.

Get started on using TileDB-SOMA:

What Can TileDB-SOMA Do?

Designed for single-cell data, TileDB-SOMA provides Python and R APIs for storing and accessing data at scale, including larger-than-memory operations:

TileDB-SOMA provides interoperability with existing single-cell toolkits:

TileDB-SOMA provides interoperability with existing Python or R data structures:
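As a rough illustration of this interoperability from the Python side, the sketch below ingests an AnnData `.h5ad` file into a SOMA experiment and later reads a filtered slice back out as an AnnData object. The function names `ingest_h5ad` and `query_to_anndata`, the measurement name `"RNA"`, the layer name `"data"`, and any paths or filter expressions passed in are assumptions made for the example, not values taken from this README. `tiledbsoma` is imported inside each function so the definitions can be loaded even where the package is not installed.

```python
def ingest_h5ad(h5ad_path, experiment_uri, measurement_name="RNA"):
    """Write an .h5ad file into a new SOMA experiment and return its URI.

    The measurement name is an assumption; use whatever fits your data.
    """
    import tiledbsoma.io  # lazy import: only needed when the function is called

    return tiledbsoma.io.from_h5ad(
        experiment_uri,
        input_path=h5ad_path,
        measurement_name=measurement_name,
    )


def query_to_anndata(experiment_uri, obs_filter,
                     measurement_name="RNA", x_layer="data"):
    """Read the cells matching `obs_filter` back into an in-memory AnnData.

    `obs_filter` is a value-filter string over obs columns, for example
    'cell_type == "B cell"' (the column name is hypothetical).
    """
    import tiledbsoma

    with tiledbsoma.Experiment.open(experiment_uri) as exp:
        obs_query = tiledbsoma.AxisQuery(value_filter=obs_filter)
        with exp.axis_query(measurement_name=measurement_name,
                            obs_query=obs_query) as query:
            # Only the selected slice is materialized in memory.
            return query.to_anndata(X_name=x_layer)
```

A typical call sequence would be `ingest_h5ad("pbmc.h5ad", "pbmc_soma")` followed by `query_to_anndata("pbmc_soma", 'cell_type == "B cell"')`, where both the file name and the `cell_type` column are placeholders.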

Community

APIs Installation and Quick Start

API Documentation

The TileDB-SOMA doc-site (Python|R) contains the reference documentation and tutorials.

Reference documentation can also be accessed directly from Python with help(tiledbsoma) or from R with help(package = "tiledbsoma").
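For scripted environments where an interactive `help()` pager is inconvenient, a small helper can pull the first line of a module's docstring instead. This is only a sketch; `doc_summary` is a hypothetical helper name, not part of TileDB-SOMA, and it falls back to a short message when the package is not installed.

```python
import importlib


def doc_summary(module_name: str) -> str:
    """Return the first docstring line of a module, or a note if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return f"{module_name} is not installed"
    doc = (module.__doc__ or "").strip()
    if not doc:
        return f"{module_name} has no module docstring"
    return doc.splitlines()[0]


print(doc_summary("tiledbsoma"))
```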

Main SOMA Objects

The capabilities of TileDB-SOMA rest on the read, access, and query patterns provided by each of the main SOMA object implementations:
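To make these read patterns concrete, here is a minimal Python sketch that opens an experiment, reads part of its `obs` dataframe into pandas, and streams an `X` matrix in chunks rather than loading it whole. The helper names `read_obs` and `iter_x_tables`, the measurement name `"RNA"`, and the layer name `"data"` are assumptions for illustration; the experiment URI and column names depend entirely on your own data.

```python
def read_obs(experiment_uri, column_names=None, value_filter=None):
    """Read (optionally filtered) cell annotations into a pandas DataFrame."""
    import tiledbsoma  # lazy import: only needed when the function is called

    with tiledbsoma.Experiment.open(experiment_uri) as exp:
        # read() returns an iterator of Arrow tables; concat() joins them.
        table = exp.obs.read(
            column_names=column_names,
            value_filter=value_filter,
        ).concat()
        return table.to_pandas()


def iter_x_tables(experiment_uri, measurement_name="RNA", layer="data"):
    """Yield the X matrix as a sequence of Arrow tables in coordinate form.

    Iterating chunk by chunk keeps memory use bounded, which is what makes
    larger-than-memory matrices workable.
    """
    import tiledbsoma

    with tiledbsoma.Experiment.open(experiment_uri) as exp:
        x = exp.ms[measurement_name].X[layer]
        for table in x.read().tables():
            yield table
```

For example, `read_obs(uri, column_names=["cell_type"], value_filter="n_genes > 500")` would return only the matching rows and the requested column, assuming the experiment's `obs` has columns with those (hypothetical) names.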

Who Is Using SOMA?

If you are interested in listing a project here, please contact us at soma@chanzuckerberg.com.

Issues and Contacts

Branches

This branch, main, implements the updated specification. Please also see the main-old branch, which implements the original specification.

Developer Information

Code of Conduct

All participants in TileDB spaces are expected to adhere to high standards of professionalism in all interactions. This repository is governed by the specific standards and reporting procedures detailed in depth in the TileDB core repository Code Of Conduct.