mreineck / ducc

Fork of https://gitlab.mpcdf.mpg.de/mtr/ducc to simplify external contributions
GNU General Public License v2.0

sharp: error on small map transforms with many threads #2

Closed. samuelsimko closed this issue 3 years ago.

samuelsimko commented 3 years ago

The SHT transforms sometimes fail on small maps when they are run with a large number of threads.

Steps to reproduce

Execute the following code:

import ducc0
import numpy as np

nside = 4
mmax = nside
lmax = nside
nthreads = 10   # deliberately more threads than such a tiny map needs
nb_iter = 100

# random HEALPix map with 12 * nside**2 pixels
m = np.random.random(12 * nside ** 2)

job = ducc0.sht.sharpjob_d()
job.set_nthreads(nthreads)
job.set_healpix_geometry(nside=nside)
job.set_triangular_alm_info(lmax, mmax)
for _ in range(nb_iter):
    job.map2alm(m)

This code fails most of the time; on Fedora 33 it aborts with the following assertion error:

/usr/include/c++/10/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = std::complex<double>; _Alloc = std::allocator<std::complex<double> >; std::vector<_Tp, _Alloc>::reference = std::complex<double>&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
[1]    1044921 IOT instruction (core dumped)  python reproduce_ducc_err.py

On other operating systems, running the code can throw a segmentation fault instead.
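A possible interim workaround (an untested guess on my part: if the crash comes from threads that are left without work, capping the thread count by the number of HEALPix rings should keep every thread busy) would be:

import ducc0
import numpy as np

nside = 4
lmax = mmax = nside
nrings = 4 * nside - 1      # number of HEALPix iso-latitude rings
nthreads = min(10, nrings)  # heuristic cap: at most one thread per ring

m = np.random.random(12 * nside ** 2)

job = ducc0.sht.sharpjob_d()
job.set_nthreads(nthreads)
job.set_healpix_geometry(nside=nside)
job.set_triangular_alm_info(lmax, mmax)
job.map2alm(m)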

mreineck commented 3 years ago

Thanks a lot! I'll investigate this as soon as possible. Most likely this happens in threads that have nothing to do but nevertheless try to do it ...
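In pseudocode, the suspected failure mode would look like this (a hypothetical Python sketch of the bug class, not ducc's actual C++ code):

import math

def buggy_parallel_sum(items, nthreads):
    # Static chunking: each "thread" tid gets items[lo:hi].
    chunk = math.ceil(len(items) / nthreads)
    total = 0
    for tid in range(nthreads):  # stand-in for one loop body per thread
        lo = tid * chunk
        hi = min(lo + chunk, len(items))
        # BUG: a thread with an empty range (lo >= len(items)) should
        # skip its iteration, but it still reads items[lo].
        total += items[lo] + sum(items[lo + 1:hi])
    return total

buggy_parallel_sum(list(range(3)), nthreads=10)  # raises IndexError

The fix for this class of bug is an early exit for empty ranges (if lo >= len(items): continue). In C++, std::vector::operator[] performs no bounds check in release builds, which matches the segfaults seen on other systems; Fedora builds with _GLIBCXX_ASSERTIONS by default, which presumably turns the same out-of-range access into the assertion failure quoted above.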

mreineck commented 3 years ago

Which version of the code are you using exactly?

samuelsimko commented 3 years ago

0.9.0 (I installed it from the latest commit)

mreineck commented 3 years ago

Can you please try again with the latest commit? I fixed something that repairs the test case, at least on my computer.
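For anyone following along, re-running the original reproducer against a fresh build is enough to verify the fix; a minimal check (assuming the module exposes a __version__ attribute):

import ducc0
import numpy as np

print(ducc0.__version__)  # confirm the freshly installed build

nside = 4
m = np.random.random(12 * nside ** 2)
job = ducc0.sht.sharpjob_d()
job.set_nthreads(10)
job.set_healpix_geometry(nside=nside)
job.set_triangular_alm_info(nside, nside)
for _ in range(100):
    job.map2alm(m)  # previously crashed; should now complete
print("no crash")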

samuelsimko commented 3 years ago

The fix seems to work on my computer as well; I no longer get any errors. Thank you!

mreineck commented 3 years ago

Great! Please re-open if the problem turns up again!