phbradley / conga

Clonotype Neighbor Graph Analysis
MIT License

Error writing adata to file #43

Closed (kjkrishnan closed this issue 2 years ago)

kjkrishnan commented 2 years ago

Hello,

I've been trying to run conga analyses from the command line with the --all flag. The run produces results up until line 1070 of run_conga.py, where the anndata object is saved. The error I get is:

error writing adata to file, dropping the conga_results dict
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/utils.py", line 214, in func_wrapper
    return func(elem, key, val, *args, **kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/specs/registry.py", line 175, in write_elem
    _REGISTRY.get_writer(dest_type, t, modifiers)(f, k, elem, *args, **kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/specs/registry.py", line 64, in get_writer
    raise TypeError(
TypeError: No method has been defined for writing <class 'collections.OrderedDict'> elements to <class 'h5py._hl.group.Group'>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/my_software/conga/scripts/run_conga.py", line 1070, in <module>
    adata.write_h5ad(args.outfile_prefix+'_final.h5ad')
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_core/anndata.py", line 1918, in write_h5ad
    _write_h5ad(
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/h5ad.py", line 105, in write_h5ad
    write_elem(f, "uns", dict(adata.uns), dataset_kwargs=dataset_kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/utils.py", line 214, in func_wrapper
    return func(elem, key, val, *args, **kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/specs/registry.py", line 175, in write_elem
    _REGISTRY.get_writer(dest_type, t, modifiers)(f, k, elem, *args, **kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/specs/registry.py", line 24, in wrapper
    result = func(g, k, *args, **kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/specs/methods.py", line 281, in write_mapping
    write_elem(g, sub_k, sub_v, dataset_kwargs=dataset_kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/utils.py", line 220, in func_wrapper
    raise type(e)(
TypeError: No method has been defined for writing <class 'collections.OrderedDict'> elements to <class 'h5py._hl.group.Group'>

Above error raised while writing key 'conga_stats' of <class 'h5py._hl.group.Group'> to /

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/utils.py", line 214, in func_wrapper
    return func(elem, key, val, *args, **kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/specs/registry.py", line 175, in write_elem
    _REGISTRY.get_writer(dest_type, t, modifiers)(f, k, elem, *args, **kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/specs/registry.py", line 64, in get_writer
    raise TypeError(
TypeError: No method has been defined for writing <class 'collections.OrderedDict'> elements to <class 'h5py._hl.group.Group'>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/my_software/conga/scripts/run_conga.py", line 1075, in <module>
    adata.write_h5ad(args.outfile_prefix+'_final.h5ad')
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_core/anndata.py", line 1918, in write_h5ad
    _write_h5ad(
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/h5ad.py", line 105, in write_h5ad
    write_elem(f, "uns", dict(adata.uns), dataset_kwargs=dataset_kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/utils.py", line 214, in func_wrapper
    return func(elem, key, val, *args, **kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/specs/registry.py", line 175, in write_elem
    _REGISTRY.get_writer(dest_type, t, modifiers)(f, k, elem, *args, **kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/specs/registry.py", line 24, in wrapper
    result = func(g, k, *args, **kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/specs/methods.py", line 281, in write_mapping
    write_elem(g, sub_k, sub_v, dataset_kwargs=dataset_kwargs)
  File "/home/user/miniconda3/envs/conga/lib/python3.9/site-packages/anndata/_io/utils.py", line 220, in func_wrapper
    raise type(e)(
TypeError: No method has been defined for writing <class 'collections.OrderedDict'> elements to <class 'h5py._hl.group.Group'>

Above error raised while writing key 'conga_stats' of <class 'h5py._hl.group.Group'> to /

I'm not super familiar with python and I am really struggling to work out what the issue is. Any thoughts?

Thank you!

phbradley commented 2 years ago

Hi there,

Thanks for giving conga a try! It looks like the scanpy AnnData saving machinery is choking on something in the conga_results dictionary, which is stored in adata.uns. Huh!

Q1. Do you get the _final_obs.tsv file? And/or the _results_summary.html file? Since that adata.write_h5ad function call is wrapped in a "try" block, the error message doesn't necessarily mean that the code stopped at that point.

Q2. Can you tell us a bit more about how you got to this point? Command line arguments? Input file formats? This is 10x data, I guess? Is this from the conga examples or your own data? Any nonstandard preprocessing (or batch labels) that might be relevant?

Thanks for the info! Take care, Phil

kjkrishnan commented 2 years ago

Q1 - No, it doesn't look like I'm getting the _final_obs.tsv file or the _results_summary.html file.

Q2 - This is (my own) 10x data that I analyzed with Seurat and used the DropletUtils package to write to 10x_mtx format. No batch labels used at this point, and I don't think anything nonstandard was done for preprocessing. This is the command I used: /home/user/my_software/conga/scripts/run_conga.py --gex_data ../data_gex.h5ad --gex_data_type h5ad --clones_file ../data_clones.tsv --organism human --all --outfile_prefix data_conga.

Please let me know if there is any more information that would be helpful!

phbradley commented 2 years ago

OK, thanks for that info! This is really curious. For some reason the AnnData code doesn't seem to be able to save an OrderedDict (or one of its elements) to h5ad. It looks like you are using python version 3.9, is that right? I don't think we've tested in a python 3.9 env yet... I will see if I can replicate this error.

In the meantime, would you be willing to try making a change to your conga code? On this line:

https://github.com/phbradley/conga/blob/master/conga/util.py#L161

could you try changing

    adata.uns['conga_stats'] = OrderedDict()

to

    adata.uns['conga_stats'] = {}

i.e., just replacing the OrderedDict with a plain old Python dictionary (preserving the indentation in the code).
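
If editing the conga source is not convenient, a similar effect can probably be had by converting any OrderedDict stored under adata.uns to a plain dict just before saving. The following is only a minimal sketch of that idea, not code from conga: the helper name to_plain_dict and the example keys are hypothetical, and a plain dictionary stands in for adata.uns so the snippet runs without AnnData installed.

```python
from collections import OrderedDict


def to_plain_dict(obj):
    """Recursively rebuild dict subclasses (e.g. OrderedDict) as plain dicts.

    Hypothetical helper for illustration; it is not part of conga or anndata.
    Non-dict values are returned unchanged.
    """
    if isinstance(obj, dict):
        return {key: to_plain_dict(value) for key, value in obj.items()}
    return obj


# Stand-in for adata.uns; the keys and values here are made up.
uns = {"conga_stats": OrderedDict(num_cells=100, sizes=OrderedDict(a=1))}

cleaned = to_plain_dict(uns)
print(type(cleaned["conga_stats"]).__name__)  # prints "dict"
```

With a real AnnData object, the equivalent step would be something like adata.uns['conga_stats'] = to_plain_dict(adata.uns['conga_stats']) before calling adata.write_h5ad. Plain dicts preserve insertion order in Python 3.7 and later, so the ordering of the stats is not lost by the conversion.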

sschattgen commented 2 years ago

I built a fresh environment using Python 3.9 and was able to replicate the error. Changing from OrderedDict() to a standard dictionary fixed the issue, as you suggested, Phil. @kjkrishnan I've pushed this fix. Please reclone the repo and try running it again.

kjkrishnan commented 2 years ago

That seems to have fixed it. Thank you very much for your help!