KratosMultiphysics / Kratos

Kratos Multiphysics (a.k.a. Kratos) is a framework for building parallel multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos has a BSD license and is written in C++ with an extensive Python interface.
https://kratosmultiphysics.github.io/Kratos/

usage of c++11 thread_local and openmp #7928

Open RiccardoRossi opened 3 years ago

RiccardoRossi commented 3 years ago

Description

In principle there is no guarantee that C++11 thread_local is compatible with OpenMP; however, the following code seems to work reliably with recent gcc and clang.

Can anyone try this with MSVC or the Intel compiler?

#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

#include <omp.h>

class tls{
    public:
    std::vector<double> vec;
};

class A
{
    public:

    static tls& GetTLS()
    {
        thread_local static tls my_tls; //----> note that here it is defined as thread_local!
        return my_tls;
    }

    static std::size_t GetTlsSize()
    {
        return GetTLS().vec.size();
    }
};

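int main() {

    std::cout << " n threads = " << omp_get_max_threads() << std::endl;

    A a;
    std::size_t tot = 0;

    // Each thread resizes its own thread_local vector to its thread id.
    #pragma omp parallel
    {
        a.GetTLS().vec.resize(omp_get_thread_num());
    }

    // Only the main thread (id 0) runs here, so this prints 0.
    std::cout << a.GetTlsSize() << std::endl;

    // In a second parallel region each thread should still see the size it set before.
    #pragma omp parallel
    {
        assert(a.GetTlsSize() == static_cast<std::size_t>(omp_get_thread_num()));

        #pragma omp atomic
        tot += a.GetTLS().vec.size();
    }

    // Expected total: 0 + 1 + ... + (n threads - 1).
    std::size_t reference_tot = 0;
    for (int i = 0; i < omp_get_max_threads(); ++i)
        reference_tot += static_cast<std::size_t>(i);
    assert(tot == reference_tot);
    std::cout << tot << std::endl;

    return 0;
}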
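
For comparison, the same function-local thread_local pattern can be exercised with std::thread, where the C++11 standard itself guarantees one instance per thread; the open question in this issue is only whether OpenMP threads behave the same way. The following is a minimal sketch under that assumption. LocalBuffer and ObservedSizes are hypothetical names chosen for illustration and are not part of Kratos or of the test above.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not from the issue): one vector per thread, created on first use.
std::vector<double>& LocalBuffer()
{
    thread_local std::vector<double> buffer;
    return buffer;
}

// Starts n_threads std::thread workers. Worker i resizes its own buffer to i and
// records the size it then observes. Returns the observed sizes, indexed by worker id.
std::vector<std::size_t> ObservedSizes(std::size_t n_threads)
{
    std::vector<std::size_t> sizes(n_threads, 0);
    std::vector<std::thread> workers;
    workers.reserve(n_threads);
    for (std::size_t i = 0; i < n_threads; ++i) {
        workers.emplace_back([i, &sizes]() {
            assert(LocalBuffer().empty());   // a new thread starts with a fresh, empty buffer
            LocalBuffer().resize(i);
            sizes[i] = LocalBuffer().size(); // each worker writes only its own slot
        });
    }
    for (auto& w : workers) w.join();
    return sizes;
}
```

Calling ObservedSizes(8) from main should return the sizes 0 through 7, while the calling thread's own buffer stays empty. Building it may need a threading flag such as -pthread, depending on the toolchain.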
philbucher commented 3 years ago

With the Intel compiler it seems to work; I tested 1-20 threads. Compiled with: icpc threadlocal_test.cpp -fopenmp. Version: icpc (ICC) 19.0.5.281 20190815

philbucher commented 3 years ago

@pooyan-dadvand can you also post the configuration that you tested this with?

pooyan-dadvand commented 3 years ago

I have tested it with MSVC 2017 x64 (cl test_thread_local.cpp /openmp), and running it several times with 8 threads works fine:


>test_thread_local.exe
 n threads = 8
0
28
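
The two numbers in that output follow from the test itself: the first is the main thread's vector size (thread id 0), and the second is the sum of the thread ids, 0 + 1 + ... + 7 = 28 for 8 threads. A minimal sketch of that arithmetic, assuming the parallel region really runs with the reported number of threads; ExpectedTotal is a hypothetical helper, not something from the issue:

```cpp
#include <cstddef>

// Hypothetical helper (not in the issue): the total the test should print when
// each of n_threads threads contributes its own id, i.e. 0 + 1 + ... + (n_threads - 1).
constexpr std::size_t ExpectedTotal(std::size_t n_threads)
{
    return n_threads * (n_threads - 1) / 2;
}

// Checked at compile time.
static_assert(ExpectedTotal(8) == 28, "8 threads: 0 + 1 + ... + 7 = 28");
static_assert(ExpectedTotal(1) == 0, "a single thread contributes only its id, 0");
```

If the runtime hands out fewer threads than omp_get_max_threads() reports, the printed total would be smaller than this value and the final assert in the test would fail.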