single-cell-data / SOMA

A flexible and extensible API for annotated 2D matrix data stored in multiple underlying formats.

Clarify/implement forbiddenness of negative `soma_joinid` in the spec #121

Closed: johnkerl closed this issue 1 year ago

johnkerl commented 1 year ago

Tracking issue for https://github.com/single-cell-data/SOMA/pull/119#discussion_r1102882763 in the event that feedback needs addressing in a separate PR.

See also https://github.com/single-cell-data/TileDB-SOMA/issues/519.


Analysis results in support of https://github.com/single-cell-data/SOMA/pull/124 and https://github.com/single-cell-data/TileDB-SOMA/pull/932:

thetorpedodog commented 1 year ago

For future consideration:

Since we allow floating-point and string dimensions, it would make sense to me that we should allow users to store data with a domain that goes into the negative numbers in a DataFrame.

To be clear, I do not think this is a change we would need to make right now or even necessarily soon, just a thought for future development. I am fine with saying “no negative numbers” in the current state.

bkmartinjr commented 1 year ago

The "no negative numbers" reading is a misunderstanding: the rule was "no negative `soma_joinid`", a statement that was necessary because `soma_joinid` is always int64.

So this affects only:

I don't recall that we ever discussed (much less concluded) that negative values could not be indexed for user-defined columns. I feel pretty strongly that we need the ability to index the full domain of any user-defined column.
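
As a rough illustration of that distinction (a sketch only, not taken from the SOMA spec or from TileDB-SOMA), a check along these lines would reject negative `soma_joinid` values while leaving user-defined columns unconstrained. The helper name `check_soma_joinids` and the sample data are hypothetical.

```python
# Illustrative only: not the SOMA spec text and not TileDB-SOMA code.
INT64_MAX = 2**63 - 1  # soma_joinid is always int64


def check_soma_joinids(values):
    """Hypothetical helper: return the values as a list if every one is a
    valid soma_joinid (an integer in [0, INT64_MAX]); raise otherwise."""
    checked = list(values)
    for v in checked:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(v, bool) or not isinstance(v, int):
            raise TypeError(f"soma_joinid must be an integer, got {v!r}")
        if v < 0:
            raise ValueError(f"soma_joinid must be non-negative, got {v}")
        if v > INT64_MAX:
            raise ValueError(f"soma_joinid exceeds the int64 range: {v}")
    return checked


# A user-defined index column is deliberately NOT passed through this check,
# so its values may be negative.
user_defined_column = [-3.5, -1.0, 2.25]
joinids = check_soma_joinids([0, 1, 2])
```

The point of the sketch is only that the non-negativity rule is scoped to `soma_joinid`; nothing in it restricts the domain of other indexed columns.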

johnkerl commented 1 year ago

Thanks for the clarification, @bkmartinjr!!