atarashansky / SAMap

SAMap: Mapping single-cell RNA sequencing datasets from evolutionarily distant organisms.
MIT License

Numba Dispatcher Issue with sm.run() #108

Open mikepassal opened 1 year ago

mikepassal commented 1 year ago

Hi, I'm attempting to integrate two datasets and I'm running into two issues. I don't think they are related, but I wanted to include my efforts to solve the first in case it is relevant to the second.

First - when initializing the SAMAP object with

sm = SAMAP(
        filenames,
        f_maps ='  maps/',
    )

I run into the following error: TypeError: can only concatenate str (not "int") to str

To resolve it, I modified the _calculate_blast_graph function, adding two lines that cast the index to strings:

def _calculate_blast_graph(ids, f_maps="maps/", eval_thr=1e-6, reciprocate=False):
    gns = []
    Xs=[]
    Ys=[]
    Vs=[]

    for i in range(len(ids)):
        id1=ids[i]
        for j in range(i,len(ids)):
            id2=ids[j]
            if i!=j:
                if os.path.exists(f_maps + "{}{}".format(id1, id2)):
                    fA = f_maps + "{}{}/{}_to_{}.txt".format(id1, id2, id1, id2)
                    fB = f_maps + "{}{}/{}_to_{}.txt".format(id1, id2, id2, id1)
                elif os.path.exists(f_maps + "{}{}".format(id2, id1)):
                    fA = f_maps + "{}{}/{}_to_{}.txt".format(id2, id1, id1, id2)
                    fB = f_maps + "{}{}/{}_to_{}.txt".format(id2, id1, id2, id1)
                else:
                    raise FileExistsError(
                        "BLAST mapping tables with the input IDs ({} and {}) not found in the specified path.".format(
                            id1, id2
                        )
                    )

                A = pd.read_csv(fA, sep="\t", header=None, index_col=0)
                B = pd.read_csv(fB, sep="\t", header=None, index_col=0)

                A.columns = A.columns.astype("<U100")
                B.columns = B.columns.astype("<U100")

                A = A[A.index.astype("str") != "nan"]
                A = A[A.iloc[:, 0].astype("str") != "nan"]
                B = B[B.index.astype("str") != "nan"]
                B = B[B.iloc[:, 0].astype("str") != "nan"]

                A.index = A.index.astype('str')#### THESE TWO LINES are the modification
                B.index = B.index.astype('str')

This fixes the error.
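
For context, the sketch below reproduces one likely cause, under an assumption: if the gene IDs in a BLAST table are purely numeric, pd.read_csv infers an integer index, and prefixing such an ID with a string raises a TypeError. The two-row table and the "sp_" prefix are made up for illustration and are not taken from SAMap.

```python
import io
import pandas as pd

# Hypothetical two-row BLAST table whose query IDs are purely numeric.
blast_txt = "101\tgeneA\t98.5\n102\tgeneB\t91.0\n"

A = pd.read_csv(io.StringIO(blast_txt), sep="\t", header=None, index_col=0)
print(A.index.dtype)  # an integer dtype, inferred from the numeric IDs

raised = False
try:
    # "sp_" is an illustrative species prefix, not SAMap's actual naming scheme.
    prefixed = ["sp_" + gene_id for gene_id in A.index]
except TypeError as err:
    raised = True
    print("TypeError:", err)

# Casting the index to str, as in the two added lines, makes prefixing work.
A.index = A.index.astype("str")
prefixed = ["sp_" + gene_id for gene_id in A.index]
print(prefixed)
```

If this is what happens in your data, the cast only needs to run once, right after the tables are read.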

When I attempt sm.run(), I run into a new error, and I don't know enough about Numba to debug it.


TypingError                               Traceback (most recent call last)
/tmp/ipykernel_877879/2571156453.py in <module>
----> 1 sm.run()
      2 samap = sm.samap # SAM object with three species stitched together

~/miniconda3/envs/SAMap/lib/python3.7/site-packages/samap/mapping.py in run(self, NUMITERS, NHS, crossK, N_GENE_CHUNKS, umap, ncpus, hom_edge_thr, hom_edge_mode, scale_edges_by_corr, neigh_from_keys, pairwise)
    306             scale_edges_by_corr = scale_edges_by_corr,
    307             neigh_from_keys=neigh_from_keys,
--> 308             pairwise=pairwise
    309         )
    310         samap = smap.final_sam

~/miniconda3/envs/SAMap/lib/python3.7/site-packages/samap/mapping.py in run(self, NUMITERS, NHS, K, corr_mode, NCLUSTERS, scale_edges_by_corr, THR, neigh_from_keys, pairwise, ncpus)
    726                 print("Calculating gene-gene correlations in the homology graph...")
    727                 self.samap = sam4
--> 728                 gnnmu = self.refine_homology_graph(ncpus = ncpus,  NCLUSTERS = NCLUSTERS,  THR=THR, corr_mode=corr_mode)
    729 
    730                 self.GNNMS_corr.append(gnnmu)

~/miniconda3/envs/SAMap/lib/python3.7/site-packages/samap/mapping.py in refine_homology_graph(self, NCLUSTERS, ncpus, THR, corr_mode, wscale)
    664             ncpus=ncpus,
    665             corr_mode=corr_mode,
--> 666             wscale=wscale
    667         )
    668         return gnnmu

~/miniconda3/envs/SAMap/lib/python3.7/site-packages/samap/mapping.py in _refine_corr(sams, st, gnnm, gns_dict, corr_mode, THR, use_seq, T1, NCLUSTERS, ncpus, wscale)
   1014             T1=T1,
   1015             ncpus=ncpus,
-> 1016             wscale=wscale
   1017         )
   1018         GNNMSUBS.append(gnnm2_sub)

~/miniconda3/envs/SAMap/lib/python3.7/site-packages/samap/mapping.py in _refine_corr_parallel(sams, st, gnnm, gns_dict, corr_mode, THR, use_seq, T1, ncpus, wscale)
   1562         sixs.append(np.where(species==sid)[0])
   1563 
-> 1564     vals = _refine_corr_kernel(p,ps,sidss,sixs,Xavg.indptr,Xavg.indices,Xavg.data,Xavg.shape[0], corr_mode)
   1565     vals[np.isnan(vals)]=0
   1566 

~/miniconda3/envs/SAMap/lib/python3.7/site-packages/numba/core/dispatcher.py in _compile_for_args(self, *args, **kws)
    466                 e.patch_message(msg)
    467 
--> 468             error_rewrite(e, 'typing')
    469         except errors.UnsupportedError as e:
    470             # Something unsupported is present in the user code, add help info

~/miniconda3/envs/SAMap/lib/python3.7/site-packages/numba/core/dispatcher.py in error_rewrite(e, issue_type)
    407                 raise e
    408             else:
--> 409                 raise e.with_traceback(None)
    410 
    411         argtypes = []
...

        a1, a2 = ps1[j], ps2[j]
        ix1 = d[a1]
        ^

Do you have any insight into what might be the issue? Sorry for the wall of text! And thank you for any help you might be able to provide! (Happy to provide any info you might need. This is the first time I've submitted a GitHub issue, so sorry if anything is formatted wrong 😅 )

atarashansky commented 1 year ago

Thanks for catching this! Looks like a pandas update broke everything :'D

Can you output the result of pip list and paste it here?
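
If the full pip list is too long to paste, a shorter report of only the packages most likely to matter can be produced from Python. This is a minimal sketch: the choice of package names is my own guess at what is relevant, and importlib.metadata requires Python 3.8+ (on 3.7 the importlib-metadata backport offers the same functions).

```python
from importlib import metadata


def report_versions(packages):
    """Return {distribution name: installed version, or 'not installed'}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# Illustrative selection of packages; extend as needed.
relevant = ["samap", "sam-algorithm", "numba", "llvmlite", "numpy", "pandas", "scanpy"]
for name, version in report_versions(relevant).items():
    print(f"{name:15s} {version}")
```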

mikepassal commented 1 year ago

Of course! It's a fresh environment, so it should be pretty minimal. My condolences about the pandas update, sounds like it's a pain!

(SAMap) pip list

Package                       Version
----------------------------- ---------
anndata                       0.8.0
backcall                      0.2.0
backports.functools-lru-cache 1.6.4
bleach                        6.0.0
bokeh                         2.4.3
Bottleneck                    1.3.5
certifi                       2022.12.7
charset-normalizer            3.0.1
colorama                      0.4.6
colorcet                      3.0.1
cycler                        0.11.0
debugpy                       1.5.1
decorator                     5.1.1
dill                          0.3.6
dunamai                       1.16.0
entrypoints                   0.4
fast-histogram                0.11
fonttools                     4.38.0
get_version                   3.5.1
h5py                          3.8.0
harmonypy                     0.0.9
hnswlib                       0.7.0
holoviews-samap               1.0.1
idna                          3.4
igraph                        0.10.1
importlib-metadata            6.0.0
ipykernel                     6.15.0
ipython                       7.33.0
jedi                          0.18.2
Jinja2                        3.1.2
joblib                        1.2.0
jupyter-client                7.0.6
jupyter_core                  4.11.1
kiwisolver                    1.4.4
legacy-api-wrap               0.0.0
leidenalg                     0.9.0
llvmlite                      0.39.1
Markdown                      3.4.1
MarkupSafe                    2.1.2
matplotlib                    3.5.3
matplotlib-inline             0.1.6
mock                          5.0.1
munkres                       1.1.4
natsort                       8.3.0
nest-asyncio                  1.5.6
networkx                      2.7
numba                         0.56.4
numexpr                       2.8.4
numpy                         1.21.6
packaging                     23.0
pandas                        1.3.5
panel                         0.14.3
param                         1.12.3
parso                         0.8.3
patsy                         0.5.3
pexpect                       4.8.0
pickleshare                   0.7.5
Pillow                        9.4.0
pip                           23.0.1
prompt-toolkit                3.0.36
psutil                        5.9.0
ptyprocess                    0.7.0
pybind11                      2.10.0
pybind11-global               2.10.0
pyct                          0.5.0
Pygments                      2.14.0
pynndescent                   0.5.8
pyparsing                     3.0.9
python-dateutil               2.8.2
pytz                          2022.7.1
pyviz-comms                   2.2.1
PyYAML                        6.0
pyzmq                         19.0.2
requests                      2.28.2
sam-algorithm                 1.0.2
samap                         1.0.14
scanpy                        1.8.2
scikit-learn                  1.0.2
scipy                         1.7.3
seaborn                       0.12.2
session-info                  1.0.0
setuptools                    67.4.0
sinfo                         0.3.1
six                           1.16.0
statsmodels                   0.13.5
stdlib-list                   0.8.0
tables                        3.7.0
texttable                     1.6.7
threadpoolctl                 3.1.0
tornado                       6.2
tqdm                          4.64.1
traitlets                     5.9.0
typing_extensions             4.5.0
umap-learn                    0.5.3
unicodedata2                  14.0.0
urllib3                       1.26.14
wcwidth                       0.2.6
webencodings                  0.5.1
wheel                         0.38.4
zipp                          3.15.0

xgrau commented 1 year ago

I think I'm encountering one (or maybe two?) issues related to the one mentioned by the OP.

Specifically, when running sm.run(), I get this:

Calculating gene-gene correlations in the homology graph...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 47, in <module>
  File "/home/xavi/miniconda3/envs/samap/lib/python3.7/site-packages/samap/mapping.py", line 308, in run
    pairwise=pairwise
  File "/home/xavi/miniconda3/envs/samap/lib/python3.7/site-packages/samap/mapping.py", line 728, in run
    gnnmu = self.refine_homology_graph(ncpus = ncpus,  NCLUSTERS = NCLUSTERS,  THR=THR, corr_mode=corr_mode)
  File "/home/xavi/miniconda3/envs/samap/lib/python3.7/site-packages/samap/mapping.py", line 666, in refine_homology_graph
    wscale=wscale
  File "/home/xavi/miniconda3/envs/samap/lib/python3.7/site-packages/samap/mapping.py", line 1019, in _refine_corr
    wscale=wscale
  File "/home/xavi/miniconda3/envs/samap/lib/python3.7/site-packages/samap/mapping.py", line 1563, in _refine_corr_parallel
    vals = _refine_corr_kernel(p,ps,sidss,sixs,Xavg.indptr,Xavg.indices,Xavg.data,Xavg.shape[0], corr_mode)
  File "/home/xavi/miniconda3/envs/samap/lib/python3.7/site-packages/numba/core/dispatcher.py", line 414, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/home/xavi/miniconda3/envs/samap/lib/python3.7/site-packages/numba/core/dispatcher.py", line 357, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Internal error at <numba.core.typeinfer.IntrinsicCallConstraint object at 0x7fcfefb4eb50>.
Failed in nopython mode pipeline (step: nopython mode backend)
Cannot cast float64 to [unichr x 4]: %".21" = load double, double* %"key"

File "../../../miniconda3/envs/samap/lib/python3.7/site-packages/numba/typed/dictobject.py", line 732:
    def impl(d, key):
        castedkey = _cast(key, keyty)
        ^

During: lowering "$8call_function.3 = call $2load_global.0(key, $6load_deref.2, func=$2load_global.0, args=[Var(key, dictobject.py:732), Var($6load_deref.2, dictobject.py:732)], kws=(), vararg=None)" at /home/xavi/miniconda3/envs/samap/lib/python3.7/site-packages/numba/typed/dictobject.py (732)
During: typing of intrinsic-call at /home/xavi/miniconda3/envs/samap/lib/python3.7/site-packages/samap/mapping.py (1476)
Enable logging at debug level for details.

File "../../../miniconda3/envs/samap/lib/python3.7/site-packages/samap/mapping.py", line 1476:
def _refine_corr_kernel(p, ps, sids, sixs, indptr,indices,data, n, corr_mode):
    <source elided>
        a1, a2 = ps1[j], ps2[j]
        ix1 = d[a1]
        ^

Relevant library versions (from conda, version 23.1.0):

python                    3.7.10          hffdb5ce_100_cpython    conda-forge
sam-algorithm             0.8.5                    pypi_0    pypi
samap                     1.0.13                   pypi_0    pypi
scanpy                    1.7.2                    pypi_0    pypi
numba                     0.52.0                   pypi_0    pypi

Happy to provide more info if needed.

Cheers
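
A note on reading the message: "[unichr x 4]" is how Numba names a fixed-width Unicode string of length 4 (NumPy dtype <U4). So the dictionary inside the kernel has string keys, but it is being indexed with a float64 value. One way a float64 array can appear where gene IDs are expected is when the array is built from an empty list, because NumPy falls back to float64 when it has nothing to infer a dtype from. This is only a guess about what happens here. The arrays below are illustrative and not taken from SAMap.

```python
import numpy as np

gene_ids = np.array(["abcd", "ef"])  # strings give a fixed-width Unicode dtype
empty_ids = np.array([])             # nothing to infer from, so NumPy defaults to float64

print(gene_ids.dtype)   # <U4
print(empty_ids.dtype)  # float64

# Stating the dtype explicitly keeps an empty result string-typed.
empty_ids_typed = np.array([], dtype="<U4")
print(empty_ids_typed.dtype)  # <U4
```

If that guess is right, checking whether the arrays of gene pairs passed to the kernel are empty would be a quick first diagnostic.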

xgrau commented 1 year ago

For what it's worth, I'm finding the same error with more recent numba and Python versions (from conda, version 23.1.0):

python                    3.9.12               h12debd9_0  
sam-algorithm             1.0.2                    pypi_0    pypi
samap                     1.0.14                   pypi_0    pypi
scanpy                    1.9.3                    pypi_0    pypi
numba                     0.56.3           py39h417a72b_0  

Calculating gene-gene correlations in the homology graph...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 47, in <module>
  File "/home/xavi/miniconda3/lib/python3.9/site-packages/samap/mapping.py", line 298, in run
    smap.run(
  File "/home/xavi/miniconda3/lib/python3.9/site-packages/samap/mapping.py", line 728, in run
    gnnmu = self.refine_homology_graph(ncpus = ncpus,  NCLUSTERS = NCLUSTERS,  THR=THR, corr_mode=corr_mode)
  File "/home/xavi/miniconda3/lib/python3.9/site-packages/samap/mapping.py", line 655, in refine_homology_graph
    gnnmu = _refine_corr(
  File "/home/xavi/miniconda3/lib/python3.9/site-packages/samap/mapping.py", line 1006, in _refine_corr
    gnnm2_sub = _refine_corr_parallel(
  File "/home/xavi/miniconda3/lib/python3.9/site-packages/samap/mapping.py", line 1560, in _refine_corr_parallel
    vals = _refine_corr_kernel(p,ps,sidss,sixs,Xavg.indptr,Xavg.indices,Xavg.data,Xavg.shape[0], corr_mode)
  File "/home/xavi/miniconda3/lib/python3.9/site-packages/numba/core/dispatcher.py", line 468, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/home/xavi/miniconda3/lib/python3.9/site-packages/numba/core/dispatcher.py", line 409, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: native lowering)
Cannot cast float64 to [unichr x 4]: double %"arg.key"
During: lowering "castedkey = call $2load_global.0(key, $6load_deref.2, func=$2load_global.0, args=[Var(key, dictobject.py:757), Var($6load_deref.2, dictobject.py:757)], kws=(), vararg=None, varkwarg=None, target=None)" at /home/xavi/miniconda3/lib/python3.9/site-packages/numba/typed/dictobject.py (757)
During: typing of intrinsic-call at /home/xavi/miniconda3/lib/python3.9/site-packages/samap/mapping.py (1473)

File "../../../miniconda3/lib/python3.9/site-packages/samap/mapping.py", line 1473:
def _refine_corr_kernel(p, ps, sids, sixs, indptr,indices,data, n, corr_mode):
    <source elided>
        a1, a2 = ps1[j], ps2[j]
        ix1 = d[a1]
        ^

xgrau commented 1 year ago

I've encountered the same error with these library versions, installed in an older conda environment (4.12.0):

python                    3.7.12          hb7a2778_100_cpython    conda-forge
sam-algorithm             1.0.2                    pypi_0    pypi
samap                     1.0.3                    pypi_0    pypi
scanpy                    1.8.2                    pypi_0    pypi
numba                     0.52.0                   pypi_0    pypi

xgrau commented 1 year ago

I think I've found a workaround: removing (or commenting out) the @njit(parallel=True) line in the _refine_corr_kernel of the mapping.py script avoids the use of no-python code (via numba), which circumvents the error.

python                    3.9.12               h12debd9_0  
sam-algorithm             1.0.2                    pypi_0    pypi
samap                     1.0.14                   pypi_0    pypi
scanpy                    1.9.3                    pypi_0    pypi
numba                     0.56.3           py39h417a72b_0  

atarashansky commented 1 year ago

Just to confirm @xgrau and @mikepassal - you are encountering the same error when running through the tutorial notebook?

atarashansky commented 1 year ago

> I think I've found a workaround: removing (or commenting out) the @njit(parallel=True) line in the _refine_corr_kernel of the mapping.py script avoids the use of no-python code (via numba), which circumvents the error.
>
> python                    3.9.12               h12debd9_0  
> sam-algorithm             1.0.2                    pypi_0    pypi
> samap                     1.0.14                   pypi_0    pypi
> scanpy                    1.9.3                    pypi_0    pypi
> numba                     0.56.3           py39h417a72b_0  

This will circumvent any numba-related errors at the (typically enormous) cost of speed.
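
For debugging only, a similar effect can probably be had without editing mapping.py by setting Numba's NUMBA_DISABLE_JIT environment variable before Numba is first imported. The sketch below is untested against SAMap's kernel, and the helper name disable_numba_jit is made up for illustration. The same speed penalty applies.

```python
import os
import sys


def disable_numba_jit():
    """Ask Numba to skip compilation so jitted functions run as plain Python."""
    if "numba" in sys.modules:
        # Numba reads its configuration when it is first imported.
        raise RuntimeError(
            "numba is already imported; restart the interpreter and call this first."
        )
    os.environ["NUMBA_DISABLE_JIT"] = "1"


disable_numba_jit()
# Import SAMap only after this point, for example:
# from samap.mapping import SAMAP
```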

In a fresh conda environment, with pip install samap jupyter, I'm able to run through the notebook with no problems. My package versions are listed below. I'm on an M1 MacBook.

What OS are you guys running?

Package                  Version
------------------------ ---------
anndata                  0.8.0
anyio                    3.6.2
appnope                  0.1.3
argon2-cffi              21.3.0
argon2-cffi-bindings     21.2.0
arrow                    1.2.3
asttokens                2.2.1
attrs                    22.2.0
backcall                 0.2.0
beautifulsoup4           4.12.0
bleach                   6.0.0
bokeh                    2.4.3
certifi                  2022.12.7
cffi                     1.15.1
charset-normalizer       3.1.0
colorcet                 3.0.1
comm                     0.1.3
contourpy                1.0.7
cycler                   0.11.0
debugpy                  1.6.6
decorator                5.1.1
defusedxml               0.7.1
dill                     0.3.6
executing                1.2.0
fast-histogram           0.11
fastjsonschema           2.16.3
fonttools                4.39.2
fqdn                     1.5.1
h5py                     3.8.0
harmonypy                0.0.9
hnswlib                  0.7.0
holoviews-samap          1.0.1
idna                     3.4
igraph                   0.10.4
importlib-metadata       6.1.0
importlib-resources      5.12.0
ipykernel                6.22.0
ipython                  8.11.0
ipython-genutils         0.2.0
ipywidgets               8.0.5
isoduration              20.11.0
jedi                     0.18.2
Jinja2                   3.1.2
joblib                   1.2.0
jsonpointer              2.3
jsonschema               4.17.3
jupyter                  1.0.0
jupyter_client           8.1.0
jupyter-console          6.6.3
jupyter_core             5.3.0
jupyter-events           0.6.3
jupyter_server           2.5.0
jupyter_server_terminals 0.4.4
jupyterlab-pygments      0.2.2
jupyterlab-widgets       3.0.6
kiwisolver               1.4.4
leidenalg                0.9.1
llvmlite                 0.39.1
Markdown                 3.4.1
MarkupSafe               2.1.2
matplotlib               3.7.1
matplotlib-inline        0.1.6
mistune                  2.0.5
natsort                  8.3.1
nbclassic                0.5.3
nbclient                 0.7.2
nbconvert                7.2.10
nbformat                 5.8.0
nest-asyncio             1.5.6
networkx                 3.0
notebook                 6.5.3
notebook_shim            0.2.2
numba                    0.56.4
numpy                    1.23.5
packaging                23.0
pandas                   1.5.3
pandocfilters            1.5.0
panel                    0.14.4
param                    1.13.0
parso                    0.8.3
patsy                    0.5.3
pexpect                  4.8.0
pickleshare              0.7.5
Pillow                   9.4.0
pip                      23.0.1
platformdirs             3.1.1
prometheus-client        0.16.0
prompt-toolkit           3.0.38
psutil                   5.9.4
ptyprocess               0.7.0
pure-eval                0.2.2
pycparser                2.21
pyct                     0.5.0
Pygments                 2.14.0
pynndescent              0.5.8
pyparsing                3.0.9
pyrsistent               0.19.3
python-dateutil          2.8.2
python-json-logger       2.0.7
pytz                     2022.7.1
pyviz-comms              2.2.1
PyYAML                   6.0
pyzmq                    25.0.2
qtconsole                5.4.1
QtPy                     2.3.0
requests                 2.28.2
rfc3339-validator        0.1.4
rfc3986-validator        0.1.1
sam-algorithm            1.0.2
samap                    1.0.14
scanpy                   1.9.3
scikit-learn             1.2.2
scipy                    1.10.1
seaborn                  0.12.2
Send2Trash               1.8.0
session-info             1.0.0
setuptools               65.6.3
six                      1.16.0
sniffio                  1.3.0
soupsieve                2.4
stack-data               0.6.2
statsmodels              0.13.5
stdlib-list              0.8.0
terminado                0.17.1
texttable                1.6.7
threadpoolctl            3.1.0
tinycss2                 1.2.1
tornado                  6.2
tqdm                     4.65.0
traitlets                5.9.0
typing_extensions        4.5.0
umap-learn               0.5.3
uri-template             1.2.0
urllib3                  1.26.15
wcwidth                  0.2.6
webcolors                1.12
webencodings             0.5.1
websocket-client         1.5.1
wheel                    0.38.4
widgetsnbextension       4.0.6
zipp                     3.15.0

atarashansky commented 1 year ago

Can you try installing from source with the most recent commit?

I pinned the package dependencies and confirmed that it works on both Linux and an M1 MacBook.

If this solves your numba problem (and hopefully the other issue @mikepassal has), then I'll go ahead and update PyPI!

mikepassal commented 1 year ago

@atarashansky Sorry for the delayed response! I will give this a shot this week, hopefully on Wednesday, and let you know if it's working.

I'm running on CentOS.

mikepassal commented 1 year ago

Hi @atarashansky! Thanks for your patience and the time you put into maintaining this! I installed the new version in a clean conda environment with Python 3.10 and still hit an error in sm.run(). The error is different this time. I've copied and pasted it below, and added a screenshot. Let me know if I can provide any other information.

Cell In[9], line 1
----> 1 sm.run()
      2 samap = sm.samap # SAM object with three species stitched together

File ~/miniconda3/envs/SAMap_fix_attempt/lib/python3.10/site-packages/samap/mapping.py:298, in SAMAP.run(self, NUMITERS, NHS, crossK, N_GENE_CHUNKS, umap, ncpus, hom_edge_thr, hom_edge_mode, scale_edges_by_corr, neigh_from_keys, pairwise)
    294 neigh_from_keys[sid] = False
    296 start_time = time.time()
--> 298 smap.run(
    299     NUMITERS=NUMITERS,
    300     NHS=NHS,
    301     K=crossK,
    302     NCLUSTERS=N_GENE_CHUNKS,
    303     ncpus=ncpus,
    304     THR=hom_edge_thr,
    305     corr_mode=hom_edge_mode,
    306     scale_edges_by_corr = scale_edges_by_corr,
    307     neigh_from_keys=neigh_from_keys,
    308     pairwise=pairwise
    309 )
    310 samap = smap.final_sam
    311 self.samap = samap

File ~/miniconda3/envs/SAMap_fix_attempt/lib/python3.10/site-packages/samap/mapping.py:728, in _Samap_Iter.run(self, NUMITERS, NHS, K, corr_mode, NCLUSTERS, scale_edges_by_corr, THR, neigh_from_keys, pairwise, ncpus)
    726 print("Calculating gene-gene correlations in the homology graph...")
    727 self.samap = sam4
--> 728 gnnmu = self.refine_homology_graph(ncpus = ncpus, NCLUSTERS = NCLUSTERS, THR=THR, corr_mode=corr_mode)
    730 self.GNNMS_corr.append(gnnmu)
    731 self.gnnmu = gnnmu

File ~/miniconda3/envs/SAMap_fix_attempt/lib/python3.10/site-packages/samap/mapping.py:655, in _Samap_Iter.refine_homology_graph(self, NCLUSTERS, ncpus, THR, corr_mode, wscale)
    652 keys = self.keys
    653 sam4 = self.samap
--> 655 gnnmu = _refine_corr(
    656     sams,
    657     sam4,
    658     gnnm,
    659     gns_dict,
    660     THR=THR,
    661     use_seq=False,
    662     T1=0,
    663     NCLUSTERS=NCLUSTERS,
    664     ncpus=ncpus,
    665     corr_mode=corr_mode,
    666     wscale=wscale
    667 )
    668 return gnnmu

File ~/miniconda3/envs/SAMap_fix_attempt/lib/python3.10/site-packages/samap/mapping.py:1006, in _refine_corr(sams, st, gnnm, gns_dict, corr_mode, THR, use_seq, T1, NCLUSTERS, ncpus, wscale)
   1003 gn = gns_dict[sid]
   1004 gns_dict_sub[sid] = gn[np.in1d(gn,gnsub)]
-> 1006 gnnm2_sub = _refine_corr_parallel(
   1007     sams,
   1008     st,
...

    a1, a2 = ps1[j], ps2[j]
    ix1 = d[a1]
    ^

image

atarashansky commented 1 year ago

Alright, I'm going to have to try reproducing this error in a centOS instance... I'll keep you posted.

mikepassal commented 1 year ago

Thanks, I appreciate the help. Hope you have a good week :)

jordan841220 commented 2 months ago

I encountered a similar issue and managed to solve it.

For some reason, the error occurred when, for one of the species, most of the gene IDs in the FASTA file and the h5ad file were not identical (i.e., the number of gene symbols shared between the dataset and the BLAST graph was small). So you might want to check your gene IDs.
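
A quick way to run that check before calling sm.run() is to count how many gene IDs the expression data and the BLAST table have in common. This is a minimal sketch: the function name id_overlap and the example IDs are hypothetical, and in practice the two lists would come from your own h5ad gene names and the ID column of your BLAST tables.

```python
def id_overlap(expression_ids, blast_ids):
    """Return (number of shared IDs, fraction of expression IDs found in the BLAST table)."""
    expression_ids = {str(g) for g in expression_ids}
    blast_ids = {str(g) for g in blast_ids}
    shared = expression_ids & blast_ids
    fraction = len(shared) / len(expression_ids) if expression_ids else 0.0
    return len(shared), fraction


# Hypothetical IDs: the h5ad uses gene IDs, the FASTA/BLAST table mostly uses transcript IDs.
h5ad_genes = ["Os01g0100100", "Os01g0100200", "Os01g0100300"]
blast_genes = ["Os01t0100100-01", "Os01t0100200-01", "Os01g0100300"]

n_shared, frac = id_overlap(h5ad_genes, blast_genes)
print(f"{n_shared} shared IDs ({frac:.0%} of the expression genes)")
```

A very low fraction for one species would suggest that the IDs need to be harmonized before mapping.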