theislab / scib

Benchmarking analysis of data integration tools
MIT License

FileNotFoundError: LISI graph fails at compute_simpson_index_graph #308

Closed mbuttner closed 2 years ago

mbuttner commented 2 years ago

Hi there, I installed scib (`scib.__version__` is `'1.0.2'`) in a Docker container and tested the different metrics calls on the human_pancreas dataset (downloaded from figshare). Running `scib.metrics.lisi_graph(adata = adata, batch_key= 'tech', label_key= 'celltype')` gives:

```
/opt/python/lib/python3.8/site-packages/scib/knn_graph/knn_graph.o: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /opt/python/lib/python3.8/site-packages/scib/knn_graph/knn_graph.o)
/opt/python/lib/python3.8/site-packages/scib/knn_graph/knn_graph.o: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /opt/python/lib/python3.8/site-packages/scib/knn_graph/knn_graph.o)
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/tmp/ipykernel_2892/3558402724.py in <module>
----> 1 scib.metrics.lisi_graph(adata = adata, batch_key= 'tech', label_key= 'celltype')

/opt/python/lib/python3.8/site-packages/scib/metrics/lisi.py in lisi_graph(adata, batch_key, label_key, **kwargs)
     44     :return: Overall cLISI and iLISI scores
     45     """
---> 46     ilisi = ilisi_graph(adata, batch_key=batch_key, **kwargs)
     47     clisi = clisi_graph(adata, batch_key=batch_key, label_key=label_key, **kwargs)
     48     return ilisi, clisi

/opt/python/lib/python3.8/site-packages/scib/metrics/lisi.py in ilisi_graph(adata, batch_key, k0, type_, subsample, scale, n_cores, verbose)
     84 
     85     adata_tmp = recompute_knn(adata, type_)
---> 86     ilisi_score = lisi_graph_py(
     87         adata=adata_tmp,
     88         batch_key=batch_key,

/opt/python/lib/python3.8/site-packages/scib/metrics/lisi.py in lisi_graph_py(adata, batch_key, n_neighbors, perplexity, subsample, n_cores, verbose)
    294 
    295     else:
--> 296         simpson_estimate_batch = compute_simpson_index_graph(
    297             file_prefix=prefix,
    298             batch_labels=batch,

/opt/python/lib/python3.8/site-packages/scib/metrics/lisi.py in compute_simpson_index_graph(file_prefix, batch_labels, n_batches, n_neighbors, perplexity, chunk_no, tol)
    412 
    413     # check if the target file is not empty
--> 414     if os.stat(index_file).st_size == 0:
    415         print("File has no entries. Doing nothing.")
    416         lists = np.zeros(0)

FileNotFoundError: [Errno 2] No such file or directory: '/tmp/lisi_dgim5fsm/graph_lisi_indices_0.txt'
```

The file in my `tmp` folder is called `/tmp/lisi_dgim5fsm/graph_lisi_input.mtx`, without the `chunk_id` 0 suffix that the parallelized version of LISI requires.

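
The mismatch between the file that exists and the file the traceback asks for can be checked directly. The following is a minimal sketch, not scib's own code: the helper `describe_lisi_tmpdir` is hypothetical, and the `graph_lisi_indices_<chunk>.txt` naming is inferred only from the path in the `FileNotFoundError`. It lists what a LISI temp directory contains and which per-chunk index files are absent.

```python
import os
import tempfile


def describe_lisi_tmpdir(tmpdir, n_chunks=1, prefix="graph_lisi"):
    """Report present files and missing per-chunk index files.

    Hypothetical helper: the '<prefix>_indices_<chunk>.txt' pattern is
    inferred from the traceback, not taken from scib's documentation.
    """
    present = sorted(os.listdir(tmpdir))
    expected = [f"{prefix}_indices_{chunk}.txt" for chunk in range(n_chunks)]
    missing = [name for name in expected if name not in present]
    return present, missing


# Reproduce the reported situation in a scratch directory:
# the input matrix exists, but no index file was written for chunk 0.
with tempfile.TemporaryDirectory() as scratch:
    open(os.path.join(scratch, "graph_lisi_input.mtx"), "w").close()
    present, missing = describe_lisi_tmpdir(scratch)

print("present:", present)
print("missing:", missing)
```

When the input matrix is present but the index file is missing, that suggests the step between the two produced no output, which would be consistent with the two loader messages printed for `knn_graph.o` before the traceback.
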
mumichae commented 2 years ago

Hi @mbuttner, the error is likely caused by the knn graph `.o` file having been compiled with a newer version of GCC than the Docker container supports. If you recompile the package on your system, e.g. by installing through git, the file should be rebuilt against your local libraries, which should solve the issue.

Newer scib versions will be built in GitHub Actions on the latest Ubuntu version (currently 20.04), so this error is less likely to occur.
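
To see how far apart the binary's requirements and the container's libraries are, the loader messages can be parsed and compared with the C library the interpreter runs on. This is a rough sketch under stated assumptions: `required_versions` is a hypothetical helper, the log lines are abbreviated from the report above, and the comparison covers only glibc because `platform.libc_ver()` does not report the libstdc++ version.

```python
import platform
import re

# Matches loader messages such as: version `GLIBC_2.34' not found
_REQUIRED = re.compile(r"version `(GLIBC(?:XX)?)_([0-9.]+)' not found")


def required_versions(loader_output):
    """Return the highest required version per library found in the text."""
    found = {}
    for name, version in _REQUIRED.findall(loader_output):
        parsed = tuple(int(part) for part in version.split("."))
        found[name] = max(found.get(name, parsed), parsed)
    return found


# Abbreviated copies of the two loader lines from the report.
log = (
    "knn_graph.o: /lib/x86_64-linux-gnu/libc.so.6: "
    "version `GLIBC_2.34' not found (required by knn_graph.o)\n"
    "knn_graph.o: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: "
    "version `GLIBCXX_3.4.29' not found (required by knn_graph.o)\n"
)
needed = required_versions(log)
print("required:", needed)

# platform.libc_ver() returns empty strings when it cannot detect the
# C library (for example on non-glibc systems), so guard before comparing.
lib, version = platform.libc_ver()
if lib == "glibc" and re.fullmatch(r"\d+(\.\d+)*", version):
    have = tuple(int(part) for part in version.split("."))
    print("glibc", version, "new enough:", have >= needed["GLIBC"])
```

If the reported glibc is older than the required one, the prebuilt binary cannot load in that environment, and rebuilding it locally is the straightforward way out.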

poseidonchan commented 2 years ago

Thanks for the discussion. I fixed this problem by installing through git and compiling `knn_graph.cpp` manually.
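
For readers who want to try the same fix, a manual compile might look roughly like the sketch below. It only builds the command and prints it. Everything specific here is an assumption rather than something confirmed in this thread: the helper `knn_graph_compile_command` is hypothetical, the source file is assumed to sit next to the `.o` file seen in the traceback, and the compiler flags are illustrative.

```python
import os
import shlex


def knn_graph_compile_command(package_dir, compiler="g++"):
    """Build (but do not run) a compile command for knn_graph.cpp.

    Assumptions: the source sits in <package_dir>/knn_graph/ next to the
    .o file from the traceback, and the flags are illustrative only.
    """
    src = os.path.join(package_dir, "knn_graph", "knn_graph.cpp")
    out = os.path.join(package_dir, "knn_graph", "knn_graph.o")
    return [compiler, "-std=c++11", "-O3", src, "-o", out]


# Package path taken from the traceback in the original report.
cmd = knn_graph_compile_command("/opt/python/lib/python3.8/site-packages/scib")
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` inside the target container would build the binary against that container's own glibc and libstdc++, which is the point of recompiling locally.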