cellannotation / cas-tools

Cell Annotation Schema Tools

Flatten operation throws an exception but completes #88

Closed by hkir-dev 2 days ago

hkir-dev commented 2 weeks ago

While flattening the Siletti non-neuronal CAS, cas-tools throws an exception, but it still generates an h5ad file and the output looks OK.

(venv) hk9@mib118717s notebooks % cas flatten --json /Users/hk9/workspaces/workspace3/tdt_repos/human-brain-cell-atlas_v1_non-neuronal/CS202210140.json --anndata /Users/hk9/tdt_datasets/b165f033-9dec-468a-9248-802fc6902a74.h5ad --output /Users/hk9/tdt_datasets/CS202210140_flattened.h5ad
Traceback (most recent call last):
  File "/Users/hk9/workspaces/workspace3/cas-tools/notebooks/venv/bin/cas", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/hk9/workspaces/workspace3/cas-tools/notebooks/venv/lib/python3.11/site-packages/cas/__main__.py", line 56, in main
    flatten(json_file_path, anndata_file_path, output_file_path)
  File "/Users/hk9/workspaces/workspace3/cas-tools/notebooks/venv/lib/python3.11/site-packages/cas/flatten_data_to_anndata.py", line 62, in flatten
    flatten_cas_object(input_json, anndata_file_path, output_file_path)
  File "/Users/hk9/workspaces/workspace3/cas-tools/notebooks/venv/lib/python3.11/site-packages/cas/flatten_data_to_anndata.py", line 94, in flatten_cas_object
    write_json_to_hdf5(uns_dataset, uns_json)
  File "/Users/hk9/workspaces/workspace3/cas-tools/notebooks/venv/lib/python3.11/site-packages/cas/file_utils.py", line 265, in write_json_to_hdf5
    group.create_dataset(key, data=value)
  File "/Users/hk9/workspaces/workspace3/cas-tools/notebooks/venv/lib/python3.11/site-packages/h5py/_hl/group.py", line 183, in create_dataset
    dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hk9/workspaces/workspace3/cas-tools/notebooks/venv/lib/python3.11/site-packages/h5py/_hl/dataset.py", line 163, in make_new_dset
    dset_id = h5d.create(parent.id, name, tid, sid, dcpl=dcpl, dapl=dapl)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 137, in h5py.h5d.create
ValueError: Unable to create dataset (name already exists)

dosumis commented 1 week ago

Possibly because general dataset fields are now overwriting core CxG fields? We need to be careful here, as we need to support round-tripping.
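
The traceback in this issue ends in `create_dataset` failing with "name already exists", which fits a key being written into a group that already holds that name. A minimal sketch of a pre-write check is below. It is not the cas-tools implementation: the function name `find_key_collisions` and the example key names are hypothetical, and it assumes the existing names can be listed up front (with h5py, for instance, via the group's `keys()`).

```python
from typing import Any, Iterable, Mapping


def find_key_collisions(
    existing_keys: Iterable[str], new_fields: Mapping[str, Any]
) -> list[str]:
    """Return the names in new_fields that already exist in the target, sorted."""
    existing = set(existing_keys)
    return sorted(key for key in new_fields if key in existing)


# Hypothetical example values, not taken from the actual Siletti dataset.
existing_uns_keys = ["title", "schema_version", "default_embedding"]
cas_fields = {"title": "Example taxonomy", "author_name": "Jane Doe"}

collisions = find_key_collisions(existing_uns_keys, cas_fields)
print(collisions)  # ['title']
```

Reporting the colliding names before writing would let the tool decide explicitly whether to skip, rename, or overwrite each field, rather than failing partway through the write.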

hkir-dev commented 2 days ago

CAS JSON updated at https://github.com/brain-bican/human-brain-cell-atlas_v1_non-neuronal/blob/main/CS202210140.json and the author annotations bug is fixed.