chanzuckerberg / cellxgene-census

CZ CELLxGENE Discover Census
https://chanzuckerberg.github.io/cellxgene-census/
MIT License

get_census_version_directory() has `do_not_delete` key, which isn't documented or in the type definitions #1204

Closed ivirshup closed 2 days ago

ivirshup commented 1 week ago

Describe the bug

get_census_version_directory() returns a list of entries that look like:

(
    '2024-06-10',
    {
        'release_date': None,
        'release_build': '2024-06-10',
        'soma': {
            'uri': 's3://cellxgene-census-public-us-west-2/cell-census/2024-06-10/soma/',
            'relative_uri': '/cell-census/2024-06-10/soma/',
            's3_region': 'us-west-2',
        },
        'h5ads': {
            'uri': 's3://cellxgene-census-public-us-west-2/cell-census/2024-06-10/h5ads/',
            'relative_uri': '/cell-census/2024-06-10/h5ads/',
            's3_region': 'us-west-2',
        },
        'do_not_delete': False,
    },
)

This matches the documentation, except that there is an additional do_not_delete key.

In addition, the CensusVersionDescription type (which the dicts here are supposed to be) doesn't define this key.

Aside: I'm not too familiar with mypy and TypedDicts, but I'm a little surprised this isn't raising a type error.
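On the aside about mypy: one plausible explanation is that a TypedDict is only checked statically, and only where the checker can see the literal being built. The sketch below assumes the directory is parsed from JSON and then cast to the typed form; that loading path is an assumption for illustration, not something confirmed in this thread. The class names `LocatorSketch` and `VersionDescriptionSketch` and the helper `load_directory` are hypothetical stand-ins, not the library's definitions.

```python
import json
from typing import Dict, Optional, TypedDict, cast


class LocatorSketch(TypedDict):
    # Hypothetical stand-in for the locator dicts shown in the report.
    uri: str
    relative_uri: str
    s3_region: Optional[str]


class VersionDescriptionSketch(TypedDict):
    # Hypothetical stand-in: only the documented keys, no do_not_delete.
    release_date: Optional[str]
    release_build: str
    soma: LocatorSketch
    h5ads: LocatorSketch


# Payload copied from the entry in the report, serialized as JSON.
RAW_DIRECTORY = json.dumps(
    {
        "2024-06-10": {
            "release_date": None,
            "release_build": "2024-06-10",
            "soma": {
                "uri": "s3://cellxgene-census-public-us-west-2/cell-census/2024-06-10/soma/",
                "relative_uri": "/cell-census/2024-06-10/soma/",
                "s3_region": "us-west-2",
            },
            "h5ads": {
                "uri": "s3://cellxgene-census-public-us-west-2/cell-census/2024-06-10/h5ads/",
                "relative_uri": "/cell-census/2024-06-10/h5ads/",
                "s3_region": "us-west-2",
            },
            "do_not_delete": False,
        }
    }
)


def load_directory(raw: str) -> Dict[str, VersionDescriptionSketch]:
    # json.loads returns Any, and cast() is a no-op at runtime, so neither
    # mypy nor the interpreter ever compares the payload's keys to the type.
    return cast(Dict[str, VersionDescriptionSketch], json.loads(raw))


directory = load_directory(RAW_DIRECTORY)
entry = directory["2024-06-10"]

# Compare the keys actually present with the keys the type declares.
declared_keys = set(VersionDescriptionSketch.__annotations__)
extra_keys = set(entry) - declared_keys
print(extra_keys)  # {'do_not_delete'}
```

If the same dict were written as a literal and annotated with the TypedDict, mypy would report the extra key; it is the `Any` coming out of the JSON parser that hides it here.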

Suggested action

To me, this all suggests that we are unintentionally showing this key to the user. If it's intentional, then I think we should update the docs and type.

Environment


```
-----
IPython 8.24.0
cellxgene_census 1.14.2.dev4+gcfca649.d20240612
pandas 2.2.2
requests 2.32.2
session_info 1.0.0
-----
aiobotocore 2.13.0
aiohttp 3.9.5
aioitertools 0.11.0
aiosignal 1.3.1
anndata 0.10.7
asttokens NA
attr 23.2.0
attrs 23.2.0
botocore 1.34.106
certifi 2024.02.02
charset_normalizer 3.3.2
cloudpickle 2.2.1
cython_runtime NA
dask 2024.5.2
dateutil 2.9.0.post0
decorator 5.1.1
dill 0.3.8
executing 2.0.1
frozenlist 1.4.1
fsspec 2024.3.1
h5py 3.11.0
idna 3.7
importlib_metadata NA
jedi 0.19.1
jinja2 3.1.4
jmespath 1.0.1
llvmlite 0.42.0
markupsafe 2.1.5
multidict 6.0.5
natsort 8.4.0
numba 0.59.1
numpy 1.26.4
packaging 24.0
parso 0.8.4
prompt_toolkit 3.0.45
psutil 5.9.8
pure_eval 0.2.2
pyarrow 15.0.2
pyarrow_hotfix NA
pygments 2.18.0
pytz 2024.1
s3fs 2024.3.1
scipy 1.13.1
six 1.16.0
somacore 1.0.11
sparse 0.15.4
stack_data 0.6.3
tblib 1.7.0
tiledb 0.30.0
tiledbsoma 1.12.0
tlz 0.12.1
toolz 0.12.1
torch 2.2.2+cu121
torchgen NA
tqdm 4.66.4
traitlets 5.14.3
typing_extensions NA
urllib3 2.2.1
wcwidth 0.2.13
wrapt 1.16.0
xxhash NA
yaml 6.0.1
yarl 1.9.4
zipp NA
-----
Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0]
Linux-6.8.0-1009-aws-x86_64-with-glibc2.39
-----
Session information updated at 2024-06-24 23:05
```

ebezzi commented 1 week ago

Not intentional - the key is only used by the builder and has no relevance to the user. It should not be returned by the API.
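
For illustration only, here is a minimal sketch of one way to keep builder-only keys out of what the API returns: filter each entry down to the documented keys before handing it back. The `DOCUMENTED_KEYS` set and the `drop_undocumented_keys` helper are hypothetical names, and the key list is taken only from the entry shown in the report; this is not the library's actual fix.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical allow-list, based only on the keys visible in the report.
DOCUMENTED_KEYS: FrozenSet[str] = frozenset(
    {"release_date", "release_build", "soma", "h5ads"}
)


def drop_undocumented_keys(
    entry: Dict[str, Any], allowed: FrozenSet[str] = DOCUMENTED_KEYS
) -> Dict[str, Any]:
    # Build a new dict so the original (builder-facing) entry is left untouched.
    return {key: value for key, value in entry.items() if key in allowed}


raw_entry = {
    "release_date": None,
    "release_build": "2024-06-10",
    "soma": {
        "uri": "s3://cellxgene-census-public-us-west-2/cell-census/2024-06-10/soma/",
        "relative_uri": "/cell-census/2024-06-10/soma/",
        "s3_region": "us-west-2",
    },
    "h5ads": {
        "uri": "s3://cellxgene-census-public-us-west-2/cell-census/2024-06-10/h5ads/",
        "relative_uri": "/cell-census/2024-06-10/h5ads/",
        "s3_region": "us-west-2",
    },
    "do_not_delete": False,
}

public_entry = drop_undocumented_keys(raw_entry)
print(sorted(public_entry))
```

An allow-list rather than a single `pop("do_not_delete")` means any future builder-only key would also stay hidden by default, at the cost of having to update the list whenever a user-facing key is added.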