linbox-team / linbox

LinBox - C++ library for exact, high-performance linear algebra
https://linbox-team.github.io/linbox
GNU Lesser General Public License v2.1
83 stars 28 forks source link

MatrixHom::map incorrectly copies sparse ZeroOne<GF2> matrices whose first row is empty #280

Closed collares closed 2 years ago

collares commented 2 years ago

I am getting the following stack trace intermittently (on 1.6.3) when running test-local-smith-form-sparseelim:

(gdb) bt
#0  0x00000000004b20cb in LinBox::GaussDomain<LinBox::GF2>::InPlaceLinearPivoting<LinBox::ZeroOne<LinBox::GF2>, LinBox::Permutation<LinBox::GF2, LinBox::BlasMatrix<LinBox::GF2, LinBox::BitVector> > > (this=0x7fffffff6c27, 
    Rank=@0x7fffffff6de0: 7301224, determinant=@0x7fffffff6b6c: true, LigneA=..., P=..., Ni=35, Nj=38) at ../linbox/algorithms/gauss/gauss-gf2.inl:79
#1  0x00000000004987c2 in LinBox::GaussDomain<LinBox::GF2>::rankInPlace<LinBox::ZeroOne<LinBox::GF2> > (this=0x7fffffff6c27, Rank=@0x7fffffff6de0: 7301224, A=..., Ni=35, Nj=38, reord=LinBox::PivotStrategy::Linear)
    at ../linbox/algorithms/gauss/gauss-rank-gf2.inl:50
#2  0x000000000048a789 in LinBox::GaussDomain<LinBox::GF2>::rankInPlace<LinBox::ZeroOne<LinBox::GF2> > (this=0x7fffffff6c27, Rank=@0x7fffffff6de0: 7301224, A=..., reord=LinBox::PivotStrategy::Linear)
    at ../linbox/algorithms/gauss/gauss-rank-gf2.inl:59
#3  0x000000000047fdfd in LinBox::rankInPlace (r=@0x7fffffff6de0: 7301224, A=...) at ../linbox/solutions/rank.inl:593
#4  0x000000000047fe66 in LinBox::rankInPlace (r=@0x7fffffff6de0: 7301224, A=..., M=...) at ../linbox/solutions/rank.inl:604
#5  0x00000000004dae95 in LinBox::rank<LinBox::ZeroOne<LinBox::GF2>, LinBox::RingCategories::ModularTag> (r=@0x7fffffff6de0: 7301224, A=..., tag=..., M=...) at ../linbox/solutions/rank.inl:408
#6  0x00000000004b851d in LinBox::rank<LinBox::ZeroOne<LinBox::GF2>, LinBox::Method::SparseElimination> (r=@0x7fffffff6de0: 7301224, A=..., M=...) at ../linbox/solutions/rank.h:112
#7  0x000000000049ef8e in sparse_local_smith_poweroftwo<unsigned long, LinBox::SparseMatrix<Givaro::ZRing<unsigned long>, LinBox::SparseMatrixFormat::SparseSeq> > (B=..., R=23, M=35, N=38, p=@0x7fffffff6f18: 2, exp=12, map_values=...)
    at test-local-smith-form-sparseelim.C:157
#8  0x000000000048dcb0 in test_sparse_local_smith<unsigned long, unsigned long> (seed=1636900582, R=23, M=35, N=38, p=@0x7fffffff8640: 2, exp=12, density=0.29999999999999999) at test-local-smith-form-sparseelim.C:246
#9  0x000000000046f016 in main (argc=2, argv=0x7fffffff9b18) at test-local-smith-form-sparseelim.C:304
(gdb) info locals
k = 0
jj = 0
col_density = {<std::_Vector_base<unsigned long, std::allocator<unsigned long> >> = {
    _M_impl = {<std::allocator<unsigned long>> = {<__gnu_cxx::new_allocator<unsigned long>> = {<No data fields>}, <No data fields>}, <std::_Vector_base<unsigned long, std::allocator<unsigned long> >::_Vector_impl_data> = {
        _M_start = 0x6fcf90, _M_finish = 0x6fd0c0, _M_end_of_storage = 0x6fd0c0}, <No data fields>}}, <No data fields>}
last = 37
c = 140737320809216
sstep = 140737488317304
LigneA_k = 0x7fffffff6b78
card = {static zero = {static zero = <same as static member of an already seen type>, static one = {static zero = <same as static member of an already seen type>, static one = <same as static member of an already seen type>, 
      static mOne = {static zero = <same as static member of an already seen type>, static one = <same as static member of an already seen type>, static mOne = <same as static member of an already seen type>, gmp_rep = {_mp_alloc = 1, 
          _mp_size = -1, _mp_d = 0x6b9580}}, gmp_rep = {_mp_alloc = 1, _mp_size = 1, _mp_d = 0x6b9560}}, static mOne = <same as static member of an already seen type>, gmp_rep = {_mp_alloc = 1, _mp_size = 0, _mp_d = 0x6b9540}}, 
  static one = <same as static member of an already seen type>, static mOne = <same as static member of an already seen type>, gmp_rep = {_mp_alloc = 38, _mp_size = 38, _mp_d = 0x7fffffff6b70}}
(gdb) info args
this = 0x7fffffff6c27
Rank = @0x7fffffff6de0: 7301224
determinant = @0x7fffffff6b6c: true
LigneA = @0x7fffffff6c90: {<LinBox::LightContainer<LinBox::LightContainer<unsigned long> >> = {allocated = 35, _container = 0x70def8, _finish = 0x70e240}, _field = 0x7fffffff6d00, _rowdim = 35, _coldim = 38, _nnz = 320}
P = @0x7fffffff6b70: {<LinBox::FIBB<LinBox::GF2>> = {<LinBox::BB<LinBox::GF2>> = {_vptr.BB = 0x69f430 <vtable for LinBox::Permutation<LinBox::GF2, LinBox::BlasMatrix<LinBox::GF2, LinBox::BitVector> >+16>}, <No data fields>}, _indices = {
    allocated = 57, _container = 0x6f4a10, _finish = 0x6f4b40}, _field = @0x7fffffff6b6d}
Ni = 35
Nj = 38
(gdb) p LigneA[jj][(size_t)k]
$4 = (unsigned long &) @0x70e4e0: 7298067

Full logs (compiled with --enable-debug): test.log, stdout.log.

I don't know anything about this code, but it looks like MatrixHom::map is copying the sparse ZeroOne<GF2> matrix incorrectly, at least when the first row is all zeros. The gdb session below is from a different run:

(gdb) f 5
#5  0x0000000000670d10 in LinBox::rank<LinBox::ZeroOne<LinBox::GF2>, LinBox::RingCategories::ModularTag> (r=@0x7fffffff6450: 106790066977112, A=..., tag=..., M=...) at ../linbox/solutions/rank.inl:408
408         return rankInPlace(r, copyA, tag, M);
(gdb) p A
$0 = (const LinBox::ZeroOne<LinBox::GF2> &) @0x7fffffff65c0: {<LinBox::LightContainer<LinBox::LightContainer<unsigned long> >> = {allocated = 35, _container = 0x618000029888, _finish = 0x618000029bd0}, _field = 0x61200001f7f0, 
  _rowdim = 35, _coldim = 38, _nnz = 357}
(gdb) p copyA
$1 = {<LinBox::LightContainer<LinBox::LightContainer<unsigned long> >> = {allocated = 35, _container = 0x618000029c88, _finish = 0x618000029fd0}, _field = 0x618000029b28, _rowdim = 35, _coldim = 38, _nnz = 357}
(gdb) p A[0].size()
$2 = 0
(gdb) p copyA[0].size()
$3 = 14
(gdb) p copyA[0][0]
$4 = (unsigned long &) @0x60e000003920: 13744632839234567870
collares commented 2 years ago

Yep, this seems like a probable cause of the bug. MatrixHom::map uses ZeroOne<GF2>::augment, which obtains an iterator by calling A.indexBegin(): https://github.com/linbox-team/linbox/blob/d0d4b9b1eb72a0b9d80af573888a72955670352c/linbox/blackbox/zo-gf2.inl#L401 However, indexBegin() calls IndexedIterator like this: https://github.com/linbox-team/linbox/blob/d0d4b9b1eb72a0b9d80af573888a72955670352c/linbox/blackbox/zo-gf2.inl#L378

This has two problems:

This code was introduced in dc14020d1ff2fc0606b09365ea2b1610fcd76080.

ClementPernet commented 2 years ago

Fixed by #286