Open patrick-nicodemus opened 1 month ago
Ran into this issue as well. The scanpy function just builds a np.random.RandomState and passes that to igraph.set_random_number_generator, so I think the issue is in igraph (more on where I think the issue is at the bottom). Also, the algorithm does converge if you wait long enough for all the messages to print to the output stream, which could take a long time if you don't have a ton of compute resources.
That said, I did the following to get around it:
import numpy as np

class RandomState(np.random.RandomState):
    def randint(self, *args, **kwargs):
        # Replace the upper bound (second positional argument) with 2**31.
        args = list(args)
        args[1] = 2**(32-1)
        return super().randint(*args, **kwargs)

rs = RandomState(np.random.MT19937(np.random.SeedSequence(0)))
Then I passed rs into the random_state argument of scanpy, which is passed on to igraph.set_random_number_generator.
Basically, this changes the max argument for the random number generator to the max signed int. I think numpy gets the default int bit length from the OS C implementation of long, which I also found is 32 bits on Windows and 64 bits on Linux. I think a newer version of numpy resolves this, but it does not appear to fix the problem here, at least according to another comment on the related issue opened in scanpy.
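For anyone who wants to check the C long width on their own machine, here is a minimal stdlib-only sketch. It does not touch numpy or igraph; the names LONG_BITS, LONG_MAX and REQUESTED_MAX are just illustrative, and the 2**32 - 1 bound is taken from the RNG_BITS comment discussed in this thread, not verified against the igraph source.

```python
import ctypes

# Width of the platform's C `long` in bits (32 on 64-bit Windows, 64 on 64-bit Linux).
LONG_BITS = ctypes.sizeof(ctypes.c_long) * 8

# Largest value a signed integer of that width can hold.
LONG_MAX = 2 ** (LONG_BITS - 1) - 1

# Upper bound that igraph is reported to request (assumption taken from this thread).
REQUESTED_MAX = 2 ** 32 - 1

print(f"C long is {LONG_BITS} bits; max signed value is {LONG_MAX}")
print(f"2**32 - 1 fits in a C long: {REQUESTED_MAX <= LONG_MAX}")
```

If the second line prints False, a bound of 2**32 - 1 cannot be represented in an integer type that follows the C long width on that platform.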
I noticed a few other things on the way to this which may help the developers. First, RNG_BITS is defined as 32 here, and in this line the comment indicates that they are passing randint(0, 2 ^ RNG_BITS-1). I am wondering if this should be randint(0, 2 ^ (RNG_BITS-1)), since int is signed 32-bit in numpy on Windows? I don't know C, so I can't tell whether just the comment is misleading or not. That said, this would also explain why it works on other OSs, since the random generator's default data type there is int64 rather than int32.
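To make the two readings of that comment concrete, here is a small arithmetic sketch. It treats ^ as exponentiation (as the comment seems to intend; in C and Python the ^ operator is actually XOR) and assumes RNG_BITS is 32 as stated above. It only compares numbers and does not call igraph or numpy.

```python
RNG_BITS = 32            # value reported above for igraph's Python RNG glue
INT32_MAX = 2 ** 31 - 1  # largest signed 32-bit integer

# Reading 1: (2 ^ RNG_BITS) - 1
bound_a = 2 ** RNG_BITS - 1
# Reading 2: 2 ^ (RNG_BITS - 1)
bound_b = 2 ** (RNG_BITS - 1)

for label, bound in (("2**RNG_BITS - 1", bound_a), ("2**(RNG_BITS - 1)", bound_b)):
    fits_inclusive = bound <= INT32_MAX      # bound itself may be returned
    fits_exclusive = bound - 1 <= INT32_MAX  # bound is never returned
    print(label, bound, "inclusive ok:", fits_inclusive, "exclusive ok:", fits_exclusive)
```

Under these assumptions the first reading overflows a signed 32-bit integer either way, while the second reading fits only when the upper bound is exclusive, which is how numpy's randint treats its high argument.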
Describe the bug
This is a cross-reference of an existing bug already filed with scanpy developers, https://github.com/scverse/scanpy/issues/2969.
When I run scanpy on Windows 11 with the Leiden clustering algorithm, it freezes with the following error message:
The exception is raised by the C core function GraphBase.community_leiden, but it is not clear to me whether the bug is actually in the C core, or whether scanpy or the Python igraph layer is feeding it incorrect arguments or parameters. I posted it here as I guessed that the igraph devs would be able to identify whether the bug is in igraph or whether scanpy is passing inappropriate arguments to the igraph core routine or layer.
To reproduce
Install scanpy on Windows 11 and run the following.
Version information
Which version of python-igraph are you using and where did you obtain it?
I am using version 0.11.6; it was installed via pip install igraph. I checked using a Windows docker image to make it as reproducible as possible.
These last five lines repeat in a loop until the user terminates the shell with Ctrl-C.
I notice that the igraph wheel downloaded with pip has "cp39" in the filename, which is surprising as this is Python 3.12.
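One possible explanation, offered as a guess rather than a confirmed fact about this install: wheels built against CPython's stable ABI carry an abi3 tag, and a cp39-abi3 wheel is installable on CPython 3.9 and later, including 3.12. The sketch below splits a wheel filename into its tags to check for that. The filename used here is hypothetical; substitute the one pip actually downloaded.

```python
def parse_wheel_tags(filename: str) -> dict:
    """Return the python, abi and platform tags from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename")
    parts = filename[: -len(".whl")].split("-")
    # The last three dash-separated fields are always python tag, abi tag, platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {"python": python_tag, "abi": abi_tag, "platform": platform_tag}

# Hypothetical filename for illustration only.
example = "igraph-0.11.6-cp39-abi3-win_amd64.whl"
tags = parse_wheel_tags(example)
is_stable_abi = tags["abi"] == "abi3"
print(tags, "stable ABI wheel:", is_stable_abi)
```

If the real filename has abi3 in the second-to-last position, the cp39 tag only marks the minimum supported CPython version and is not a sign that the wrong wheel was installed.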