chanzuckerberg / cellxgene-census

CZ CELLxGENE Discover Census
https://chanzuckerberg.github.io/cellxgene-census/
MIT License

Improve documentation for supported parameters in get_anndata() method #1032

Open hthomas-czi opened 7 months ago

hthomas-czi commented 7 months ago

Without being familiar with the schema, it's not clear which obs columns users can select using the obs_value_filter argument.

For example, users can use sex_ontology_term_id or human-readable sex, but that's not clear in the method signature or documentation.
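To make the point concrete, here is a minimal sketch of the two equivalent filters and of one way to discover the available obs columns. The helper names (`eq_filter`, `list_obs_columns`, `fetch`) are hypothetical and exist only for this illustration. `PATO:0000383` is used as the ontology term ID for female and should be confirmed against the schema. The census calls are wrapped in functions with a lazy import, so nothing is downloaded unless they are called.

```python
def eq_filter(column: str, value: str) -> str:
    """Build a simple equality expression for obs_value_filter (hypothetical helper)."""
    return f"{column} == {value!r}"


# Two ways to express the same selection, per the issue text:
by_label = eq_filter("sex", "female")
by_term_id = eq_filter("sex_ontology_term_id", "PATO:0000383")  # assumed term ID


def list_obs_columns(organism_key: str = "homo_sapiens") -> list[str]:
    """Return the obs column names that can appear in obs_value_filter."""
    import cellxgene_census  # lazy import: only needed when actually querying

    with cellxgene_census.open_soma() as census:
        return list(census["census_data"][organism_key].obs.keys())


def fetch(obs_value_filter: str):
    """Fetch an AnnData for the given filter. A broad filter returns a lot of data."""
    import cellxgene_census

    with cellxgene_census.open_soma() as census:
        return cellxgene_census.get_anndata(
            census,
            organism="Homo sapiens",
            obs_value_filter=obs_value_filter,
        )


print(by_label)
print(by_term_id)
```

Listing the column names this way is a workaround; the request in this issue is for the documentation to state them directly.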

Proposal

Questions

obs_coords – Coordinates for the obs axis, which is indexed by the soma_joinid value. May be an int, a list of int, or a slice. The default, None, selects all.

However, the method signature lists many more possible types. In addition to being inconsistent with the documentation, the method signature is hard to interpret.

obs_coords: None | bytes | Slice[bytes] | Sequence[bytes] | float | Slice[float] | Sequence[float] | int | Slice[int] | Sequence[int] | slice | Slice[slice] | Sequence[slice] | str | Slice[str] | Sequence[str] | datetime64 | Slice[datetime64] | Sequence[datetime64] | TimestampType | Slice[TimestampType] | Sequence[TimestampType] | Array | ChunkedArray | ndarray[Any, dtype[integer]] | ndarray[Any, dtype[datetime64]] = None,

Which is correct, and can we simplify the type definitions in the method signature to make them easier to understand?
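As a rough sketch of what a simplified annotation could look like if the documented description is the intended contract, the alias below covers only the three documented forms (an int, a list of int, or a slice) plus None. `DocumentedObsCoords`, `describe_coords`, and `fetch_by_coords` are hypothetical names for illustration and are not part of the library. Whether the wider set of types in the real signature must stay supported is exactly the open question.

```python
from typing import Optional, Sequence, Union

# Hypothetical alias covering only the forms named in the docstring.
DocumentedObsCoords = Optional[Union[int, Sequence[int], slice]]


def describe_coords(coords: DocumentedObsCoords) -> str:
    """Classify a value into one of the documented obs_coords forms."""
    if coords is None:
        return "all rows"
    if isinstance(coords, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("bool is not a valid soma_joinid")
    if isinstance(coords, int):
        return "single soma_joinid"
    if isinstance(coords, slice):
        return "slice of soma_joinids"
    if isinstance(coords, (list, tuple)) and all(
        isinstance(c, int) and not isinstance(c, bool) for c in coords
    ):
        return "list of soma_joinids"
    raise TypeError(f"unsupported obs_coords value: {coords!r}")


def fetch_by_coords(coords: DocumentedObsCoords):
    """Pass documented-style coordinates to get_anndata (requires network access)."""
    import cellxgene_census  # lazy import: only needed when actually querying

    with cellxgene_census.open_soma() as census:
        return cellxgene_census.get_anndata(
            census,
            organism="Homo sapiens",
            obs_coords=coords,
        )


for example in (None, 42, [0, 1, 2], slice(0, 100)):
    print(repr(example), "->", describe_coords(example))
```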

pablo-gar commented 7 months ago

@hthomas-czi I would follow your lead with recommendations.

hthomas-czi commented 6 months ago

I would follow your lead with recommendations.

I added some proposals with a couple of questions for @ebezzi