limix / bgen-reader-py

A BGEN file format reader.
MIT License

sporadic error when reading in genotypes #13

Closed. arminschoech closed this issue 5 years ago

arminschoech commented 5 years ago

Hi Danilo,

Thanks for the great package! I have run into one problem, though: when reading in genotype probabilities with "x = bgen['genotype'][:100].compute()", the call sometimes works perfectly and sometimes fails with an error (see below). This happens even when I run the command several times in a row on the exact same set of SNPs. Can you think of any explanation for this behavior? Does the function use any stochastic methods that could make the exact same command fail on some runs and succeed on others?

Best wishes, Armin

x1 = bgen['genotype'][:100].compute()
x1 = bgen['genotype'][:100].compute()
Traceback (most recent call last):
  File "", line 1, in
  File "/.../python2.7/site-packages/dask/base.py", line 156, in compute
    (result,) = compute(self, traverse=False, kwargs)
  File "/.../python2.7/site-packages/dask/base.py", line 395, in compute
    results = schedule(dsk, keys, kwargs)
  File "/.../python2.7/site-packages/dask/threaded.py", line 75, in get
    pack_exception=pack_exception, *kwargs)
  File "/.../python2.7/site-packages/dask/local.py", line 501, in get_async
    raise_exception(exc, tb)
  File "/.../python2.7/site-packages/dask/local.py", line 272, in execute_task
    result = _execute_task(task, data)
  File "/.../python2.7/site-packages/dask/local.py", line 252, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/.../python2.7/site-packages/dask/local.py", line 253, in _execute_task
    return func(args2)
  File "/.../python2.7/site-packages/bgen_reader/_reader.py", line 192, in
    Gcall = delayed(lambda args: _genotype_block(args)[0], *kws)
  File "/.../python2.7/site-packages/bgen_reader/_pylru.py", line 569, in wrapper
    self.cache[key] = value
  File "/.../python2.7/site-packages/bgen_reader/_pylru.py", line 141, in setitem
    del self.table[node.key]
KeyError: ((<cdata 'struct bgen_vi *' owning 8 bytes>, 487409, 26, 13), ())
x1 = bgen['genotype'][:100].compute()
x1 = bgen['genotype'][:100].compute()

horta commented 5 years ago

Hi @arminschoech

Thanks for the precise feedback =)

Could you post the output of print(bgen['genotype'].chunks) here?

horta commented 5 years ago

I'm releasing version 2.0.8. Please give it a try. I think I solved the problem.

arminschoech commented 5 years ago

Yes, the problem seems to have gone away with 2.0.8. Thank you very much!
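
Note for later readers: the thread does not say what changed in 2.0.8. The traceback passes through dask/threaded.py and fails while the LRU cache is updating its table, which is at least consistent with several worker threads touching a cache that is not thread-safe. That is an assumption, not a confirmed diagnosis. Under that assumption, a minimal sketch of one common remedy is to guard every cache read and update with a lock. The names LockedLRU, get_or_compute and load_block are hypothetical and are not part of bgen-reader, pylru or dask.

```python
import threading
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor


class LockedLRU(object):
    """Tiny LRU cache; all access to the table happens under one lock.

    Hypothetical helper for illustration only, not the bgen-reader code.
    """

    def __init__(self, maxsize):
        self._maxsize = maxsize
        self._data = OrderedDict()
        self._lock = threading.Lock()

    def get_or_compute(self, key, func):
        with self._lock:
            if key in self._data:
                value = self._data.pop(key)
                self._data[key] = value  # re-insert to mark as most recently used
                return value
        # Compute outside the lock so worker threads can still run in parallel.
        value = func(key)
        with self._lock:
            self._data[key] = value
            while len(self._data) > self._maxsize:
                self._data.popitem(last=False)  # evict the least recently used entry
        return value


def load_block(i):
    """Stand-in for an expensive read such as decoding a genotype block."""
    return i * i


cache = LockedLRU(maxsize=8)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(
        pool.map(lambda i: cache.get_or_compute(i % 20, load_block), range(2000))
    )
```

Two threads may occasionally compute the same key at the same time. The sketch accepts that duplicated work in exchange for keeping the lock short, and the second write simply overwrites the first with an equal value.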