thliebig / openEMS

openEMS is a free and open-source electromagnetic field solver using the EC-FDTD method.
http://openEMS.de
GNU General Public License v3.0

Unable to get the number of threads within an operator extension #126

Open gadiLhv opened 10 months ago

gadiLhv commented 10 months ago

I am building a new operator and engine extension right now (again) that handles large groups of mesh cells. While trying to speed up the run, I attempted to utilize multiple threads.

However, I couldn't find a way to query the number of threads in the simulation from within the operator/engine. I couldn't befriend the Operator_Multithread class, as there was a conflict in the update_openEMS.sh script: the two are "unaware" of each other during the linking stage.

Eventually, I resorted to adding another constructor that accepts the number of threads in openEMS::SetupFDTD.

I would appreciate any suggestion for a more elegant solution.


PS: I did see a ~25% speedup with the multi-threaded version I ran, so there is motivation for this.

biergaizi commented 2 months ago

> However, I couldn't find a way to query the number of threads in the simulation from within the operator/engine.

This is not how the multithreading logic works in openEMS. In the current design, an extension should not launch any threads itself; launching threads is the engine's responsibility. Instead, an extension should implement SetNumberOfThreads(). Before a simulation starts, the main engine calls this function to tell the extension how many threads there are, and the extension splits its workload internally at that point. Each extension decides exactly how the workload is split, but usually it's an even split. Then, instead of implementing the default APIs (e.g. DoPreVoltageUpdates()), the extension implements the multi-threaded versions, e.g. DoPreVoltageUpdates(int threadID). The main engine calls these functions once per thread, and the extension decides which part to update according to the given threadID.

For example, if an extension has 100 work items in single-thread mode:

void DoPreVoltageUpdates()
{
    for (int i = 0; i < 100; i++)
    {
        doWork(i);  // placeholder for the per-item update
    }
}

To add multithreading support to this extension, first implement SetNumberOfThreads(int nrThread) to divide the workload according to the number of threads, then save the thread-to-workload mapping in a lookup table or array (it's up to you). Then implement DoPreVoltageUpdates(int threadID) like this:

void DoPreVoltageUpdates(int threadID)
{
    // each thread only touches its own [start, stop) range
    for (int i = startPerThread[threadID]; i < stopPerThread[threadID]; i++)
    {
        doWork(i);
    }
}
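
To make the two steps above concrete, here is a minimal standalone sketch. Everything in it is made up for illustration: ToyExtension and runToyEngine are hypothetical stand-ins and not the real openEMS classes, and the toy engine simply loops over the thread IDs where a real engine would make each call from its own worker thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an extension; NOT the real openEMS classes.
class ToyExtension
{
public:
    explicit ToyExtension(int numItems) : m_numItems(numItems), m_hits(numItems, 0) {}

    // Called once before the run: precompute each thread's [start, stop) range.
    void SetNumberOfThreads(int nrThread)
    {
        assert(nrThread > 0);
        m_start.assign(nrThread, 0);
        m_stop.assign(nrThread, 0);
        const int base = m_numItems / nrThread;
        const int extra = m_numItems % nrThread; // the first `extra` threads get one more item
        int next = 0;
        for (int t = 0; t < nrThread; ++t)
        {
            m_start[t] = next;
            next += base + (t < extra ? 1 : 0);
            m_stop[t] = next;
        }
    }

    // Called once per thread: touch only this thread's range.
    void DoPreVoltageUpdates(int threadID)
    {
        assert(threadID >= 0 && threadID < static_cast<int>(m_start.size()));
        for (int i = m_start[threadID]; i < m_stop[threadID]; ++i)
            ++m_hits[i]; // placeholder for the real per-item update
    }

    const std::vector<int>& hits() const { return m_hits; }

private:
    int m_numItems;
    std::vector<int> m_hits;          // how often each item was updated
    std::vector<int> m_start, m_stop; // per-thread lookup tables
};

// Toy "engine": announces the thread count, then calls the extension once per
// thread ID. Sequential here; a real engine would do this from worker threads.
std::vector<int> runToyEngine(int numItems, int nrThread)
{
    ToyExtension ext(numItems);
    ext.SetNumberOfThreads(nrThread);
    for (int t = 0; t < nrThread; ++t)
        ext.DoPreVoltageUpdates(t);
    return ext.hits();
}
```

Because the ranges are disjoint and together cover all items, every item is updated exactly once per step, no matter which thread handles it. If there are more threads than items, the surplus threads just get an empty range.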

See engine_ext_upml.cpp for how it's done in practice.

But my suggestion is not to worry too much about the threading issue. I'm (still) in the process of completely rewriting the multithreading logic in my new openEMS Tiling engine, which would make everything I've said obsolete ;-)

gadiLhv commented 2 months ago

Hey @biergaizi,

Thank you for taking the time to respond. I'll try to elaborate my problem further.

If you look at the example you sent me, the method

void Engine_Ext_UPML::SetNumberOfThreads(int nrThread)
{
    Engine_Extension::SetNumberOfThreads(nrThread);

    m_numX = AssignJobs2Threads(m_Op_UPML->m_numLines[0],m_NrThreads,false);
    m_start.resize(m_NrThreads,0);
    m_start.at(0)=0;
    for (size_t n=1; n<m_numX.size(); ++n)
        m_start.at(n) = m_start.at(n-1) + m_numX.at(n-1);
}

is only called in one place:

https://github.com/thliebig/openEMS/blob/1ccf0942477e9178b27f5e00dddd4d62bff78d29/FDTD/extensions/engine_ext_upml.cpp#L35

The next logical step, naturally, was to find which class/function/member contains the data I needed. The answer was pretty straightforward: https://github.com/thliebig/openEMS/blob/1ccf0942477e9178b27f5e00dddd4d62bff78d29/openems.cpp#L254 which later invokes (I think) this: https://github.com/thliebig/openEMS/blob/1ccf0942477e9178b27f5e00dddd4d62bff78d29/openems.cpp#L632

As I wrote earlier, I later tried to access the Operator_Multithread::getNumThreads method. That ran into a problem with the linking hierarchy: as I mentioned before, my new engine/operator could not access Operator_Multithread.

I didn't want to fiddle with that too much at this point, so I did a very dirty bypass. The rest, I did as you suggested:

void Engine_Ext_Absorbing_BC::SetNumberOfThreads(int nrThread)
{
    Engine_Extension::SetNumberOfThreads(nrThread);

    // This command assigns the number of jobs (primitives) handled by each thread
    v_primsPerThread = AssignJobs2Threads(m_numPrims,m_NrThreads,false);

    // Basically cumsum. Starting point of each thread.
    v_threadStartPrim.resize(m_NrThreads,0);
    v_threadStartPrim.at(0) = 0;
    for (size_t threadIdx = 1; threadIdx < v_threadStartPrim.size(); threadIdx++)
        v_threadStartPrim.at(threadIdx) = v_threadStartPrim.at(threadIdx - 1) + v_primsPerThread.at(threadIdx - 1);
}

void Engine_Ext_Absorbing_BC::DoPreVoltageUpdates(int threadID)
{
    // if (IsActive()==false) return;

    if (m_Eng==NULL) return;

    if (threadID >= m_NrThreads)
        return;

    uint pos[] = {0,0,0};
    uint pos0[] = {0,0,0};
    uint pos1[] = {0,0,0};
    uint pos_shift[] = {0,0,0};

    uint dir[] = {0,0,0};
    uint primIdx = 0;

    uint cellCtr;

    for (uint primCtr = 0 ; primCtr < v_primsPerThread.at(threadID) ; primCtr++)
    {
        // current primitive index for this thread
        primIdx = primCtr + v_threadStartPrim.at(threadID);
(etc, etc).
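
For reference, the bookkeeping above (a per-thread count plus a cumulative starting offset) can be sketched standalone as follows. Note that splitJobs is a hypothetical stand-in for AssignJobs2Threads, written only for this illustration; the real openEMS helper may distribute the remainder differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for AssignJobs2Threads: spread `jobs` over `nrThreads`
// as evenly as possible, giving the leftover jobs to the first threads.
std::vector<unsigned int> splitJobs(unsigned int jobs, unsigned int nrThreads)
{
    assert(nrThreads > 0);
    std::vector<unsigned int> counts(nrThreads, jobs / nrThreads);
    for (unsigned int t = 0; t < jobs % nrThreads; ++t)
        ++counts[t];
    return counts;
}

// Exclusive prefix sum ("cumsum"): starts[t] = counts[0] + ... + counts[t-1].
std::vector<unsigned int> startOffsets(const std::vector<unsigned int>& counts)
{
    std::vector<unsigned int> starts(counts.size(), 0);
    for (std::size_t t = 1; t < counts.size(); ++t)
        starts[t] = starts[t - 1] + counts[t - 1];
    return starts;
}
```

With 10 primitives and 4 threads, this gives counts {3, 3, 2, 2} and starting offsets {0, 3, 6, 8}, so thread 2 handles primitives 6 and 7.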

I don't know if this is an actual issue or just me misusing the multi-threading structure.

Cheers