vtraag / leidenalg

Implementation of the Leiden algorithm for various quality functions to be used with igraph in Python.
GNU General Public License v3.0

vector exception in MutableVertexPartition.cpp? #62

Closed brgew closed 3 years ago

brgew commented 3 years ago

Hi,

I corresponded with you several years ago when I was writing an R interface (leidenbase) to your excellent Leiden community detection algorithms and functions. Several users recently reported crashes on CentOS 8 systems. This is similar to issue #12 on GitHub

https://github.com/vtraag/leidenalg/issues/12

that is, execution is interrupted and the process crashes with the message

/usr/include/c++/9/bits/stl_vector.h:1042: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = double; _Alloc = std::allocator<double>; std::vector<_Tp, _Alloc>::reference = double&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.

I tend to write narratives, but I'll try to keep to the essentials for your sake, so I omit the multitude of dead-end paths that I explored until I became more confident that my code is not the immediate problem.

I reproduced the problem with R version 4.0.2 on CentOS Linux release 8.2.2004 and discovered that the file

/usr/lib64/R/etc/Makeconf

has C*FLAGS compiler options that include '-Wp,-D_GLIBCXX_ASSERTIONS -fexceptions'. R on a Debian buster system initially runs without error but adding these options to the /usr/lib/R/etc/Makeconf C*FLAGS variables (and reinstalling leidenbase) causes the same crash as on CentOS 8.
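
To check whether a given R installation builds packages with these options, a short script can scan Makeconf for them. The following is a minimal sketch, assuming the usual `NAME = value` layout of Makeconf; the function name `flags_with_assertions` is made up for illustration and is not part of R, leidenbase, or leidenalg.

```python
import re

# Matches Makeconf-style assignments such as "CXXFLAGS = -O2 -g" for any
# variable whose name ends in FLAGS (CFLAGS, CXXFLAGS, CXX11FLAGS, ...).
_FLAGS_LINE = re.compile(r"^\s*([A-Za-z0-9_]*FLAGS)\s*=\s*(.*)$")


def flags_with_assertions(makeconf_text, needle="-D_GLIBCXX_ASSERTIONS"):
    """Return the names of *FLAGS variables whose value contains `needle`."""
    hits = []
    for line in makeconf_text.splitlines():
        match = _FLAGS_LINE.match(line)
        if match and needle in match.group(2):
            hits.append(match.group(1))
    return hits
```

On the CentOS 8 system described here, it could be called as `flags_with_assertions(open("/usr/lib64/R/etc/Makeconf").read())`; an empty list would mean the option is not set in any *FLAGS variable.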

I gathered just enough confidence to submit this issue when I found that I can get the same crash by adding resolution_parameter=0.5 to leidenalg.find_partition in the following leidenalg-based Python program (when run on CentOS 8, which I hope uses the '-Wp,-D_GLIBCXX_ASSERTIONS -fexceptions' compiler options), with the attached edgelist.edg.gz as input:

#!/usr/bin/env python3

import sys
import platform
import leidenalg
import igraph as ig

print('python version info: %s' % ( platform.python_version() ) )
print('leidenalg version: %s' % ( leidenalg.__version__ ) )

g = ig.read( filename='edgelist.edg', format='edgelist')

part = leidenalg.find_partition(g, partition_type=leidenalg.CPMVertexPartition, n_iterations=2, resolution_parameter=0.5)
print(part)

The output is

python version info: 3.6.8
leidenalg version: 0.8.3
/usr/include/c++/8/bits/stl_vector.h:932: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = double; _Alloc = std::allocator<double>; std::vector<_Tp, _Alloc>::reference = double&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
Aborted (core dumped)
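
The "Aborted (core dumped)" line means the interpreter itself was killed by SIGABRT (signal 6, as frame #0 of the stack dump below also shows), so the failure cannot be caught with try/except inside the same process. A minimal sketch of detecting it from a parent process on a POSIX system follows; the helper name `aborted` and the script name `reproduce.py` are placeholders, not part of leidenalg.

```python
import signal
import subprocess
import sys


def aborted(cmd):
    """Run `cmd` and report whether the child process was killed by SIGABRT.

    On POSIX, subprocess reports death by signal as a negative return code.
    """
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == -signal.SIGABRT
```

With the test program above saved as `reproduce.py`, `aborted([sys.executable, "reproduce.py"])` would be expected to return True on an affected build and False once the problem is fixed.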

I don't know how to set C*FLAGS for python package installation, which is why I ran this test program on the CentOS 8 system. (If you know how to set those flags and are willing to share the information, I would appreciate it greatly.) I attach the edgelist file, in case it may be of use to you.

Somewhere along the line of investigation, I returned to the Debian system with the modified C*FLAGS Makeconf variables (and removed the optimization flags) and ran R with the debugger. The stack dump after crashing is

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff793b535 in __GI_abort () at abort.c:79
#2  0x00007fffe78668e9 in std::__replacement_assert (__file=0x7fffe7c2d258 "/usr/include/c++/8/bits/stl_vector.h", __line=932,
    __function=0x7fffe7c2d500 <std::vector<double, std::allocator<double> >::operator[](unsigned long)::__PRETTY_FUNCTION__> "std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = double; _Alloc = std::allocator<double>; std::vector<_Tp, _Alloc>::reference ="..., 
    __condition=0x7fffe7c2d228 "__builtin_expect(__n < this->size(), true)") at /usr/include/x86_64-linux-gnu/c++/8/bits/c++config.h:447
#3  0x00007fffe79b1289 in std::vector<double, std::allocator<double> >::operator[] (this=0x555558a1f560, __n=190) at /usr/include/c++/8/bits/stl_vector.h:932
#4  0x00007fffe7bc99c5 in MutableVertexPartition::cache_neigh_communities (this=0x555558a1f410, v=452, mode=IGRAPH_ALL) at leidenalg/src/MutableVertexPartition.cpp:815
#5  0x00007fffe7bc9c91 in MutableVertexPartition::get_neigh_comms (this=0x555558a1f410, v=452, mode=IGRAPH_ALL) at leidenalg/src/MutableVertexPartition.cpp:880
#6  0x00007fffe7bd1a89 in Optimiser::move_nodes (this=0x7ffffffb88c0, partitions=std::vector of length 1, capacity 1 = {...},
    layer_weights=std::vector of length 1, capacity 1 = {...}, is_membership_fixed=std::vector<bool> of length 1500, capacity 1536 = {...}, consider_comms=2, 
    consider_empty_community=1, renumber_fixed_nodes=false, max_comm_size=0) at leidenalg/src/Optimiser.cpp:595
#7  0x00007fffe7bcf2cf in Optimiser::optimise_partition (this=0x7ffffffb88c0, partitions=std::vector of length 1, capacity 1 = {...},
    layer_weights=std::vector of length 1, capacity 1 = {...}, is_membership_fixed=std::vector<bool> of length 1500, capacity 1536 = {...}, max_comm_size=0)
    at leidenalg/src/Optimiser.cpp:159
#8  0x00007fffe7bcec25 in Optimiser::optimise_partition (this=0x7ffffffb88c0, partition=0x555558a1f410,
    is_membership_fixed=std::vector<bool> of length 1500, capacity 1536 = {...}, max_comm_size=0) at leidenalg/src/Optimiser.cpp:71
#9  0x00007fffe7bceafe in Optimiser::optimise_partition (this=0x7ffffffb88c0, partition=0x555558a1f410,
    is_membership_fixed=std::vector<bool> of length 1500, capacity 1536 = {...}) at leidenalg/src/Optimiser.cpp:63
#10 0x00007fffe7bcea61 in Optimiser::optimise_partition (this=0x7ffffffb88c0, partition=0x555558a1f410) at leidenalg/src/Optimiser.cpp:58
#11 0x00007fffe7bb8be2 in leidenFindPartition (pigraph=0x7ffffffb8b20, partitionType="CPMVertexPartition", pinitialMembership=0x0, pedgeWeights=0x0, pnodeSizes=0x0,
    seed=123456, resolutionParameter=0.5, numIter=2, pmembership=0x7ffffffb8aa0, pweightInCommunity=0x7ffffffb8ac0, pweightFromCommunity=0x7ffffffb8ae0,
    pweightToCommunity=0x7ffffffb8b00, pweightTotal=0x7ffffffb89f0, pquality=0x7ffffffb89f8, pmodularity=0x7ffffffb8a00, psignificance=0x7ffffffb8a08, pstatus=0x7ffffffb89e8)
    at leidenFindPartition.cpp:207
#12 0x00007fffe7bbc124 in _leiden_find_partition (igraph=0x55555b9a3998, partition_type=0x55555b5b3ad0, initial_membership=0x555555574590, edge_weights=0x555555574590,
    node_sizes=0x555555574590, seed=0x55555a244f98, resolution_parameter=0x55555a245008, num_iter=0x55555ae0f1e0) at leidenFindPartitionR2C.cpp:184
#13 0x00007ffff7c26262 in ?? () from /usr/lib/R/lib/libR.so
#14 0x00007ffff7c26815 in ?? () from /usr/lib/R/lib/libR.so
#15 0x00007ffff7c718d8 in Rf_eval () from /usr/lib/R/lib/libR.so
#16 0x00007ffff7c76479 in ?? () from /usr/lib/R/lib/libR.so
.
.
.

It looks like the problem is detected somewhere in the system libraries after executing

#4 0x00007fffe7bc99c5 in MutableVertexPartition::cache_neigh_communities (this=0x555558a1f410, v=452, mode=IGRAPH_ALL) at leidenalg/src/MutableVertexPartition.cpp:815

  // Reset cached communities
  for (size_t c : *_cached_neighs_comms)
       (*_cached_weight_tofrom_community)[c] = 0;

I hope that I am not misleading you and myself when I submit this issue.

As an aside, I notice that you contributed leiden-related functions to igraph and rigraph. Do you consider those to be ready for 'production' use? If so, I may consider using those rather than leidenbase.

Ever grateful, Brent

vtraag commented 3 years ago

Thanks for the report! I am able to reproduce the problem locally, so I will investigate further. You can compile the Python package with these options by setting the C(PP)FLAGS environment variables, as is normally done.
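
As a rough illustration of that suggestion, the sketch below prepares an environment with the flags from the report and a pip command that forces a source build. It assumes that "C(PP)FLAGS" means CFLAGS and CPPFLAGS and that the package's build honours them; the helper names `build_env`, `pip_command` and `install` are hypothetical.

```python
import os
import subprocess
import sys

# The compiler options quoted in the report above.
HARDENING_FLAGS = "-Wp,-D_GLIBCXX_ASSERTIONS -fexceptions"


def build_env(extra_flags=HARDENING_FLAGS, base_env=None):
    """Copy the environment and append `extra_flags` to CFLAGS and CPPFLAGS."""
    env = dict(os.environ if base_env is None else base_env)
    for name in ("CFLAGS", "CPPFLAGS"):
        env[name] = (env.get(name, "") + " " + extra_flags).strip()
    return env


def pip_command(package="leidenalg"):
    """pip invocation that compiles `package` from source instead of using a wheel."""
    return [
        sys.executable, "-m", "pip", "install",
        "--force-reinstall", "--no-cache-dir",
        "--no-binary", package, package,
    ]


def install(package="leidenalg"):
    """Rebuild and reinstall `package` with the extra flags (needs a C++ toolchain)."""
    subprocess.run(pip_command(package), env=build_env(), check=True)
```

Calling `install()` rebuilds leidenalg with the extra flags; `--no-binary` is what keeps pip from installing a prebuilt wheel, which would ignore them.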

> As an aside, I notice that you contributed leiden-related functions to igraph and rigraph. Do you consider those to be ready for 'production' use? If so, I may consider using those rather than leidenbase.

Yes, they are ready for use, but the implementation is much less flexible than what is provided in this package. Nonetheless, it might just cover 80% of the use cases of people.

I must admit that I forgot about leidenbase. @evanbiederstedt also developed an R package to interface with leidenalg, and @TomKellyGenetics developed an R package based on reticulate; see https://github.com/vtraag/leidenalg/issues/59 for some discussion. Perhaps this would be an opportunity to join forces and make a common, supported implementation available that exposes all the functionality of the leidenalg package in a native R package, including the multiplex support?

vtraag commented 3 years ago

This is now solved in https://github.com/vtraag/leidenalg/commit/c6ea5e44fe11ae6f18434abbc84cd2056f383034. Feel free to pull in the latest version from the master branch and retry. I will fix some of the other open issues and then make a new release.

brgew commented 3 years ago

Hi,

I appreciate your responses: the confirmation that you can reproduce the problem, your reassurance, and the compiler information. I offer a large basket of thanks!

Ever grateful, Brent

vtraag commented 3 years ago

It seems there was still a problem with the way the cache was cleared; this should now be fixed in https://github.com/vtraag/leidenalg/commit/0947e632724f089f1f6d0ca8b6aaa5e7d3766ec5.