theislab / scib

Benchmarking analysis of data integration tools

Error reading LISI index file #376

Closed: lazappi closed this issue 1 year ago

lazappi commented 1 year ago

I'm getting the following error when running LISI metrics:

Error message

```python
Command error:
Traceback (most recent call last):
  File "/lustre/groups/ml01/code/luke.zappia/atlas-feature-selection-benchmark/bin/metric-iLISI.py", line 79, in <module>
    main()
  File "/lustre/groups/ml01/code/luke.zappia/atlas-feature-selection-benchmark/bin/metric-iLISI.py", line 68, in main
    score = calculate_iLISI(input)
  File "/lustre/groups/ml01/code/luke.zappia/atlas-feature-selection-benchmark/bin/metric-iLISI.py", line 34, in calculate_iLISI
    score = ilisi_graph(
  File "/home/icb/luke.zappia/nf-conda-envs/atlas-feature-selection/atlas-feature-selection-scib-9e16a1f74890703dcd60e998efef51b0/lib/python3.9/site-packages/scib/metrics/lisi.py", line 100, in ilisi_graph
    ilisi_score = lisi_graph_py(
  File "/home/icb/luke.zappia/nf-conda-envs/atlas-feature-selection/atlas-feature-selection-scib-9e16a1f74890703dcd60e998efef51b0/lib/python3.9/site-packages/scib/metrics/lisi.py", line 331, in lisi_graph_py
    simpson_estimate_batch = compute_simpson_index_graph(
  File "/home/icb/luke.zappia/nf-conda-envs/atlas-feature-selection/atlas-feature-selection-scib-9e16a1f74890703dcd60e998efef51b0/lib/python3.9/site-packages/scib/metrics/lisi.py", line 457, in compute_simpson_index_graph
    indices = pd.read_table(index_file, index_col=0, header=None, sep=",")
  File "/home/icb/luke.zappia/nf-conda-envs/atlas-feature-selection/atlas-feature-selection-scib-9e16a1f74890703dcd60e998efef51b0/lib/python3.9/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/home/icb/luke.zappia/nf-conda-envs/atlas-feature-selection/atlas-feature-selection-scib-9e16a1f74890703dcd60e998efef51b0/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 779, in read_table
    return _read(filepath_or_buffer, kwds)
  File "/home/icb/luke.zappia/nf-conda-envs/atlas-feature-selection/atlas-feature-selection-scib-9e16a1f74890703dcd60e998efef51b0/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 581, in _read
    return parser.read(nrows)
  File "/home/icb/luke.zappia/nf-conda-envs/atlas-feature-selection/atlas-feature-selection-scib-9e16a1f74890703dcd60e998efef51b0/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1255, in read
    index, columns, col_dict = self._engine.read(nrows)
  File "/home/icb/luke.zappia/nf-conda-envs/atlas-feature-selection/atlas-feature-selection-scib-9e16a1f74890703dcd60e998efef51b0/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 225, in read
    chunks = self._reader.read_low_memory(nrows)
  File "pandas/_libs/parsers.pyx", line 805, in pandas._libs.parsers.TextReader.read_low_memory
  File "pandas/_libs/parsers.pyx", line 861, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 847, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 1960, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 69 fields in line 70, saw 74

/home/icb/luke.zappia/nf-conda-envs/atlas-feature-selection/atlas-feature-selection-scib-9e16a1f74890703dcd60e998efef51b0/lib/python3.9/tempfile.py:821: ResourceWarning: Implicitly cleaning up
  _warnings.warn(warn_message, ResourceWarning)
```

The interesting part is `pandas.errors.ParserError: Error tokenizing data. C error: Expected 69 fields in line 70, saw 74`, which is raised when `compute_simpson_index_graph()` tries to read the index file. I have had a look at one of the index files (here index_file.txt), and at some point the rows start having additional columns, which I think is the cause of the error. It's possible that it has something to do with the specific dataset I'm using (it's a weird test simulation), but it would be good to find a fix for this.
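
For anyone wanting to check their own index file, here is a minimal sketch of how the ragged rows can be reproduced and located. It assumes the file is plain comma-separated text, as the `sep=","` in the traceback suggests. The sample data and the `field_counts` helper are hypothetical and not part of scib.

```python
import csv
import io

import pandas as pd

# Hypothetical stand-in for an index file: two rows with 3 fields,
# followed by one row with 5 fields.
ragged = "0,1,2\n1,0,2\n2,0,1,3,4\n"


def field_counts(handle, sep=","):
    """Map each observed field count to the 1-based line numbers that have it."""
    counts = {}
    for line_no, row in enumerate(csv.reader(handle, delimiter=sep), start=1):
        counts.setdefault(len(row), []).append(line_no)
    return counts


# Same call shape as in the traceback. With header=None the parser takes
# the expected width from the first row and fails on a wider later row.
try:
    pd.read_table(io.StringIO(ragged), index_col=0, header=None, sep=",")
    error_message = None
except pd.errors.ParserError as err:
    error_message = str(err)

# Group line numbers by field count to see where the width changes.
counts = field_counts(io.StringIO(ragged))

print(error_message)
print(counts)
```

On a real file, passing an open file handle to `field_counts` instead of the `StringIO` object should show at which line the number of columns changes.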

lazappi commented 1 year ago

Actually, this is the same error as #374. I'll leave this open for now, but feel free to close it if you like.

mumichae commented 1 year ago

Yes, I'll close this issue and continue the conversation in the other issue.