Once the GPU simulation is working, we can implement concurrency.
Essentially, offload Cerenkov photons to the GPU while the CPU continues the Geant4 simulation, then join the two. Photons can be offloaded to the GPU in SteppingAction once a critical number has accumulated.
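To make the buffer/offload/join idea concrete, here is a minimal sketch using only the C++ standard library. Everything in it is an assumption for illustration: `Photon`, `simulate_on_gpu` and `PhotonOffloader` are hypothetical names, the "GPU" call is a stand-in that just counts photons, and `std::async` is used in place of Geant4 tasking. It also assumes one offloader per worker thread, so the buffer itself is not shared.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <utility>
#include <vector>

// Hypothetical photon record; a real one would carry position, direction,
// polarization, time, etc.
struct Photon {
    double wavelength_nm;
};

// Stand-in for the GPU propagation step (assumption, not a real API).
// It only reports how many photons it was handed.
inline std::size_t simulate_on_gpu(std::vector<Photon> batch) {
    return batch.size();
}

// Buffers photons on the CPU side and ships them off in batches.
class PhotonOffloader {
public:
    explicit PhotonOffloader(std::size_t threshold) : threshold_(threshold) {
        assert(threshold_ > 0);
        buffer_.reserve(threshold_);
    }

    // Would be called from SteppingAction for every generated photon.
    void add(const Photon& p) {
        buffer_.push_back(p);
        if (buffer_.size() >= threshold_) flush();
    }

    // Hand the buffered photons to an asynchronous task and return at once,
    // so the CPU simulation can continue while the batch is processed.
    void flush() {
        if (buffer_.empty()) return;
        std::vector<Photon> batch;
        batch.swap(buffer_);
        buffer_.reserve(threshold_);
        pending_.push_back(
            std::async(std::launch::async, simulate_on_gpu, std::move(batch)));
    }

    // Join: send any remainder, wait for all batches, sum their results.
    std::size_t join() {
        flush();
        std::size_t total = 0;
        for (auto& f : pending_) total += f.get();
        pending_.clear();
        return total;
    }

    std::size_t batches_in_flight() const { return pending_.size(); }

private:
    std::size_t threshold_;
    std::vector<Photon> buffer_;
    std::vector<std::future<std::size_t>> pending_;
};
```

In a Geant4 application, `add()` would presumably be called from SteppingAction and `join()` at the end of the event or run; the `std::async` call could later be swapped for a task submitted to the Geant4 thread pool.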
The extended/parallel/ThreadsafeScorers example in geant4-v11.2.2 shows how tasks are submitted to the thread pool and joined:
    auto tm = dynamic_cast<G4TaskRunManager*>(G4RunManager::GetRunManager());
    // Get the thread-pool if available
    auto tp = (tm) ? tm->GetThreadPool() : nullptr;
    .................
    if(tp)
    {
      // create a task group (nested inside the 'report_type_comparison' task)
      G4TaskGroup<std::string> tg(join_output, tp);
      // create the tasks in the task-group
      for(auto titr = comp.begin(); titr != comp.end(); ++titr)
        tg.exec(report_subtype_comparison, titr->first, titr->second);
      // wait on the tasks to finish and execute the join function
      // this will block the outer task from completing until all the inner
      // tasks have been completed
      streamout << tg.join();
    }
In my (vague) understanding, G4TaskManager is the class we need. There are not many resources explaining how it works; the one I found: https://indico.cern.ch/event/809383/contributions/3371931/attachments/1822464/2981532/G4Tasking.pdf