Closed mattldawson closed 4 years ago
@cguzman95 - I can work on this. Some of the things you're trying to sum with these variables are already available in solver_stats_t. I'll add the rest and set up the mock_monarch.F90 program to sum them. However, it's important that we don't introduce global variables into the code, because it should be possible to use multiple cores simultaneously.
As you wish. I'm fine with that as long as the counters stay inside DEBUG flags (or flags with another name) and the function that prints solver_stats is easy to find, so it is clear what I am printing.
Yes, variables like timeDerivGPU should be summed. If you can print the results in mock_monarch instead of the awkward place where they are printed now, that would be nice.
A note about global variables and cores: if you mean MPI, each process will have its own copy of the variable (e.g. a counter), so each process will print its own counter and the result will be correct (the global variable is not shared with the other cores). If you mean OpenMP, then yes, it is a problem. But I doubt you mean OpenMP, because there are many more things to take into account if we want to parallelize with OpenMP.
ok, great - I'll do the modification.
I was thinking about having multiple cores on the same CPU, but I hadn't thought about OpenMP. The scenario I was imagining is similar to the new unit tests where we have two cores running side-by-side to test single- and multiple-core solving, but instead someone might be running two different mechanisms at the same time to compare (or combine somehow) their results during the run; or be testing a base mechanism and a modified version. Something like this
Then all the variables will be independent. That seems good, since it avoids having to loop over repeated runs of a mechanism if the user wants to save some time.
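To illustrate the point about keeping counters out of global scope, here is a minimal sketch in C. The struct and function names (DemoSolverData, demo_solver_init, demo_solver_run) and the fields are hypothetical stand-ins, not the actual CAMP definitions. The only idea shown is that counters stored in a per-solver struct stay independent when two solvers live in the same process.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-solver state. The field names are
 * illustrative and are not taken from the CAMP source. */
typedef struct {
  unsigned long deriv_calls; /* number of derivative evaluations */
  unsigned long jac_calls;   /* number of Jacobian evaluations */
} DemoSolverData;

/* Zero the counters of one solver instance. */
void demo_solver_init(DemoSolverData *sd) {
  assert(sd != NULL);
  sd->deriv_calls = 0;
  sd->jac_calls = 0;
}

/* Pretend to take n_steps solver steps: one derivative evaluation per
 * step and one Jacobian evaluation on every even-numbered step. Only the
 * instance passed in is modified, so no state is shared between solvers. */
void demo_solver_run(DemoSolverData *sd, unsigned n_steps) {
  assert(sd != NULL);
  for (unsigned i = 0; i < n_steps; ++i) {
    sd->deriv_calls++;
    if (i % 2 == 0) sd->jac_calls++;
  }
}
```

With this layout, a base mechanism and a modified mechanism can each own a DemoSolverData and be run side by side without their counters mixing, which a single global counter would not allow.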
Hi @cguzman95 - I moved the timers and counters to SolverData. If you look in the test_cb05cl_ae5.F90 test and look for the comment with TIMERS in it, you can see how to access them in a test. They are reset for each call to solve(), so if you want cumulative values, you'll have to add variables to the test to hold them. Also, since either (but not both) the CPU or GPU functions will be used during a run, I only included one set of timers (one for f() and one for Jac()).
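As a rough sketch of the "reset on each solve, sum in the caller" pattern described above, the C snippet below uses a hypothetical DemoStats struct and a fake demo_solve() that resets its statistics at the start of every call. None of these names or fields come from CAMP, and the per-step timings are made-up constants chosen only to keep the arithmetic exact.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-call statistics (not the real solver_stats_t). */
typedef struct {
  double time_deriv;         /* time spent in f() during the last solve */
  double time_jac;           /* time spent in Jac() during the last solve */
  unsigned long deriv_calls; /* f() evaluations during the last solve */
} DemoStats;

/* Fake solve: resets the statistics, then records made-up values
 * (0.5 per step for f(), 0.25 per step for Jac()). */
void demo_solve(DemoStats *stats, unsigned n_steps) {
  assert(stats != NULL);
  stats->time_deriv = 0.0;
  stats->time_jac = 0.0;
  stats->deriv_calls = 0;
  for (unsigned i = 0; i < n_steps; ++i) {
    stats->time_deriv += 0.5;
    stats->time_jac += 0.25;
    stats->deriv_calls++;
  }
}

/* Caller-side accumulation: add the statistics of the last solve to a
 * running total owned by the caller (e.g. a test or driver program). */
void demo_accumulate(DemoStats *total, const DemoStats *last) {
  assert(total != NULL && last != NULL);
  total->time_deriv += last->time_deriv;
  total->time_jac += last->time_jac;
  total->deriv_calls += last->deriv_calls;
}
```

The caller zero-initializes a total, then calls demo_accumulate() after every demo_solve(). The running total lives in the test or driver program rather than in the solver, so nothing global is needed.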
The timers in camp_solver.c should be moved to the SolverData struct, and their results should be included in the solver_stats_t type. The counters are already available here. If total stats for multiple calls to camp_core_t%solve() are needed, they should be summed after the call to solve().