kmayerb / tcrdist3

flexible CDR based distance metrics
MIT License

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject #66

Closed swapnil-dk closed 2 years ago

swapnil-dk commented 2 years ago

I am using tcrdist3==0.2.0 with numpy==1.20.0. Because of changes in the numpy C API, I am getting ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject.

I tried uninstalling and reinstalling numpy versions from 1.19.0 to 1.20.3, but I still get the same error. Here's the error log for reference:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/tcrdist/rep_funcs.py", line 11, in <module>
    from tcrdist import memory
  File "/usr/local/lib/python3.8/dist-packages/tcrdist/memory.py", line 8, in <module>
    from hierdiff.tally import neighborhood_tally
  File "/usr/local/lib/python3.8/dist-packages/hierdiff/__init__.py", line 3, in <module>
    from .association_testing import cluster_association_test
  File "/usr/local/lib/python3.8/dist-packages/hierdiff/association_testing.py", line 15, in <module>
    from fishersapi import fishers_vec, fishers_frame, adjustnonnan
  File "/usr/local/lib/python3.8/dist-packages/fishersapi/__init__.py", line 3, in <module>
    from .fishersapi import *
  File "/usr/local/lib/python3.8/dist-packages/fishersapi/fishersapi.py", line 66, in <module>
    import fisher
  File "/usr/local/lib/python3.8/dist-packages/fisher/__init__.py", line 3, in <module>
    from .cfisher import *
  File "__init__.pxd", line 242, in init cfisher
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

kmayerb commented 2 years ago

Thanks for bringing this to our attention. What command did you run that gave the error? (It looks like from tcrdist import memory, perhaps via from fishersapi import fishers_vec.) We will investigate this and try to get back to you.

I just tested a setup that runs fine: pip installing tcrdist3 into a fresh Python env created with conda, with numpy==1.20.3. The environment:

Python 3.8.12 (default, Oct 12 2021, 06:23:56)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
appnope==0.1.2
backcall==0.2.0
certifi==2021.10.8
cycler==0.11.0
decorator==5.1.0
dill==0.3.4
feather-format==0.4.1
fisher==0.1.9
fishersapi==0.3
fonttools==4.28.3
hierdiff==0.8
ipython==7.30.1
jedi==0.18.1
Jinja2==3.0.3
joblib==1.1.0
kiwisolver==1.3.2
llvmlite==0.37.0
MarkupSafe==2.0.1
matplotlib==3.5.1
matplotlib-inline==0.1.3
numba==0.54.1
numpy==1.20.3
olga==1.2.4
packaging==21.3
palmotif==0.4
pandas==1.3.5
parasail==1.2.4
parmap==1.5.3
parso==0.8.3
patsy==0.5.2
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.4.0
progress==1.6
prompt-toolkit==3.0.24
ptyprocess==0.7.0
pwseqdist==0.6
pyarrow==6.0.1
Pygments==2.10.0
pynndescent==0.5.5
pyparsing==3.0.6
python-dateutil==2.8.2
pytz==2021.3
scikit-learn==1.0.1
scipy==1.7.3
seaborn==0.11.2
six==1.16.0
sklearn==0.0
statsmodels==0.13.1
svgwrite==1.4.1
tcrdist3 @ git+https://github.com/kmayerb/tcrdist3.git@04b0b8c2573d04a9d2cb77f7a3aeeed3a0eab167
tcrsampler==0.1.9
threadpoolctl==3.0.0
tqdm==4.62.3
traitlets==5.1.1
umap-learn==0.5.2
wcwidth==0.2.5
zipdist==0.1.5

Until we can replicate this issue, you might want to create a fresh environment or use the Docker image.
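For anyone comparing their own environment against the list above, here is a minimal sketch (not part of tcrdist3) that prints the installed versions of the packages appearing in the traceback. This error usually points to a compiled extension (here fisher's cfisher module) built against a different numpy than the one imported at runtime, so the numpy and fisher versions are the ones to look at first. The helper name installed_versions and the package list are illustrative choices, not anything tcrdist3 provides.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Hypothetical selection: the packages named in the traceback, plus numpy.
report = installed_versions(["numpy", "fisher", "fishersapi", "hierdiff", "tcrdist3"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

If the versions differ from a known-good environment, rebuilding the environment from scratch (as suggested above) is likely simpler than swapping numpy versions in place.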

swapnil-dk commented 2 years ago

Thanks for your reply. When we built the docker version 0.2.2 on a local system we had the same error after running from tcrdist.rep_funcs import compute_pw_sparse_out_of_memory. I'm trying to install with your env file given above. I'll update once I try it.

kmayerb commented 2 years ago

Great.

Note that from tcrdist.rep_funcs import compute_pw_sparse_out_of_memory was developed before we had a more powerful way to do the same:

import pandas as pd
from tcrdist.repertoire import TCRrep
from tcrdist.sparse import add_sparse_pwd

# df is assumed to be a pandas DataFrame of paired alpha/beta clones, loaded elsewhere
tr = TCRrep(cell_df = df,
    organism = 'human',
    chains = ['alpha','beta'],
    deduplicate = True,
    compute_distances = False)

# APPENDIX A: NOTE THAT WE CAN COMPUTE ALL PAIRWISE DISTANCES IN SPARSE FORMAT
tr.compute_sparse_rect_distances(df = tr.clone_df, df2 = tr.clone_df, radius = 100)

# If you want to add sparse matrices with different entries
tr.rw_alpha_beta = add_sparse_pwd(tr.rw_beta, tr.rw_alpha)

This has the benefit that you can use the radius argument to discard all pairwise distances above some threshold. E.g., if radius is 100, only pairwise distances <= 100 will be retained.
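To illustrate the idea behind the radius cutoff, here is a toy sketch in plain Python. It is not how tcrdist3 computes or stores distances: the hamming function is a stand-in metric, the dict of index pairs is a stand-in for a sparse matrix, and the sequences are made-up examples.

```python
from itertools import combinations


def hamming(a, b):
    """Toy distance: mismatched positions plus any length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def sparse_pairwise(seqs, radius, dist=hamming):
    """Keep only pairs with distance <= radius, stored as {(i, j): distance}."""
    kept = {}
    for i, j in combinations(range(len(seqs)), 2):
        d = dist(seqs[i], seqs[j])
        if d <= radius:
            kept[(i, j)] = d
    return kept


# Made-up CDR3-like strings, for illustration only.
seqs = ["CASSLGQAYEQYF", "CASSLGQGYEQYF", "CASSPTGTGELFF"]
near = sparse_pairwise(seqs, radius=1)
print(near)
```

With a small radius only the near-identical pair is stored, while a large radius keeps every pair. That is why a tight radius keeps memory use low on large repertoires.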

swapnil-dk commented 2 years ago

I was able to get tcrdist running again with the env file above. Thank you for your help. Thanks for the tip on add_sparse_pwd.