Reconfigure unit tests - Githubissues

mattldawson commented 5 years ago

There is currently a lot of redundancy in the reaction and sub model unit tests related to the initialization and running of the model. This makes it time consuming to add tests of, for example multi-cell solving. We should discuss a framework for these tests so that only the mechanism and evaluation of results are specific to each test and the rest is in one location for all tests.

cguzman95 commented 5 years ago

Pseudocode to discuss:

Declare common data variables (time_step, temp_array, pressure_array...)
Do N cases: {
- Define specific variables (k1,k2,species_array...)
- Initialize specific and common variables (temp_array, k1,k2...)
- call load(interface_config_file)
- phlex_core => phlex_core_t(input_file_path, n_cells)
- call phlex_core%initialize()
- Get species indices (map_id) ->Doubt: The one from monarch is okay and covers all the cases?
- call load(interface_config_file)
- call phlex_core%solver_initialize()
- Do time iterations: {
- Update phlex_state%env_state and phlex_state%state_var
- call phlex_core%solve(state, time_step, solver_stats)
- Update model_conc(:) = phlex_state%state_var -> Doubt: This is necessary? I understand that for mock_monarch we have different structure with the vertical cells and others, but for unit test could be possible to work only with the phlex_state%state_var? }
  - Check results
  - Save results
  - Deallocate arrays ->Doubt: I guess the number of species depends on the test, I don't know if we have more arrays like this }
Deallocate common variables (input_file,temp...)

mattldawson commented 5 years ago

I think it is important to separate the common code from the code that is specific to individual tests into individual modules. Otherwise, this test program is going to get way too big. (Just take a look at test_sub_model_UNIFAC.F90 to see how complex individual tests can be.) Some other points:

the species on the state array are different for every test
there's no need for mapping in the unit tests - that's only included in the MONARCH test because MONARCH has it's own state array (with species arranged differently)

Here's what I would suggest: Have a program file unit_test_driver.F90 something like this:

program unit_test_driver

  class(unit_test_data_t), pointer :: unit_test_data

  unit_test_data => COMPILER_FLAG_WITH_TEST_CLASS

  ... (declare variables) ...

  phlex_core => phlex_core_t(unit_test_data%input_file_name())

  call phlex_core%initialize()

#ifdef USE_MPI
  pass phlex_core to other processes
#endif 

  ! Set the initial states (unit_test_data_t classes should provide a certain number of
  ! unique initial states that can be analyzed during solving)
  do i_cell=1, N_CELLS
    grid_cell_state(i_cell) => phlex_core%new_state()
    grid_cell_state_id(i_cell) = rand() % unit_test_data%num_unique_states()

    ! Set the initial state for this grid cell to grid_cell_state_id
    call unit_test_data%initialize_state(phlex_core, i_cell, grid_cell_state(i_cell), grid_cell_state_id(i_cell)
  end do

  ! Loop over time
  ! Loop over cells
    call phlex_core%solve(grid_cell_state(i_cell), time_step)

    ! (or solve multi-cell)

    call unit_test_data_t%analyze(phlex_core, i_cell, grid_cell_state(i_cell), grid_cell_state_id(i_cell), time)

    ! output state to file

  deallocate pointers and arrays

end program unit_test_driver

Then, each individual unit test will extend an abstract unit_data_test_t type and provide the needed functions: input_file_name() num_unique_states() initialize_state() and analyze()

cguzman95 commented 5 years ago

Yes, the parts like "Initialize specific and common variables" will be calls to the specific case functions (arrhenius%get_init_variables, etc.).

I don't totally get the "rand() % unit_test_data%num_unique_states()" : It is for disorder the state array?

I guess %analyze is checking the results.

About abstract unid_data_test_t... I don't know the best way to organize the data on Fortran, but I propose has a module for each unit test, with functions like set_arrhenius or check_arrhenius.

Also, I miss the loop for N unit tests that we have. In this way, for example, the checksum function can be something like "checksum(index_unit_test)", calling a function that has a switch case with all the checksum functions (case 1 check_arrhenius, case 2 check_emission...)

mattldawson commented 5 years ago

Yeah, the rand() % unit_test_data%num_unique_states() (or something similar, it doesn't have to be exactly this) is just to randomly set all the cell initial states, but keeping track of what conditions each cell started with so the results can be analyzed by the specific unit test module after solving.

For the unit_data_test_t, yes I would suggest a module for each unit test, but one that includes a derived type that exposes its functionality, so that it can be passed to the driver using a single compiler variable. It might look something like:

module unit_test_arrhenius

  type, extends(unit_test_data_t) :: unit_test_arrhenius_t
  contains
    procedure :: input_file_name
    procedure :: num_unique_states
    procedure :: initialize_state
    procedure :: analyze
  end type

contains

  function input_file_name()
    !...(etc.)...
  end function input_file_name

  !...(the other functions here)...

end module unit_test_arrhenius

I'm not quite sure I follow the N loops, but if you want to include that, that's fine with me. But, we're not actually doing checksums to check the results - each unit test will check the results for each grid cell by whatever method it wants (they're all different) based on which set of initial conditions that cell started with. A strict checksum wouldn't work for these tests because there is some acceptable amount of variability in the results. The final states are evaluated using a relative and absolute tolerance based on the solver tolerances.

cguzman95 commented 5 years ago

So, this extends will inherit the components of the main class... yeah, seems nice to avoid declare again a common variable.

The N loops I only means instead of having

check_arrhenius(... check_emission(... check_troe(... ...

has

for N tests{ ... check_all(id) ... } check_all{ switch case 1 check_arrhenius... ... }

Yeah, I wrote checksum by I mean a checking (the almost_equal function or whatever it uses each test), my mistake.

Everything seems okay for me.

mattldawson commented 5 years ago

Ah, ok - perfect. I like it. I think we can make it even a little simpler and not need the switch statement. Let me start a branch and rough in my idea. Then you can check it out and see if it will work for you too.

mattldawson commented 4 years ago

I roughed in some code for the unit tests in develop_92_unit_tests. See what you think - it's still very rough, but the general structure is there.

cguzman95 commented 4 years ago

Ok, I will follow this structure and if I thought something should be different I will comment it with the reasons

cguzman95 commented 4 years ago

I just upload a version with the structure mostly done, you can check it and if something you disagree we can comment it tomorrow (most important changes is that I'm not using unit_test_data because at the moment I don't see common variables between the test files)

mattldawson commented 4 years ago

Sounds great - I'll check it out

cguzman95 commented 4 years ago

I think we should improve the line

grid_cell_state_id(i_cell) = pmc_rand_int(unit_test%num_unique_states())

by ensuring that we execute at least all the unique states

mattldawson commented 4 years ago

Hi @cguzman95 - as we discussed, it's fine to make this change if you want, but I think it might be easier to just increase the number of cells. For example, with 20 cells and 5 unique states, there is a 98.8% chance of any particular state being tested; increasing to 40 cells makes this 99.99%.

mattldawson commented 4 years ago

Hi @cguzman95 - I think I'm done with the Arrhenius test. It's merged in to chem_mod now, so whenever you want to work on the other tests is cool. I would suggest starting with the 2 CMAQ reactions and the Troe reaction, since they're the most similar to the Arrhenius one. Let me know if anything is unclear about the test setup.

cguzman95 commented 4 years ago

Yeah, after fix the gpu implementation I will finish the test and test all correctly. Another topic, have you managed to fix the multi-cell bug?

mattldawson commented 4 years ago

I think so - the results of the single cell and multi-cell solvers are the same now, and if they differ the new test will fail. But, I may come back to guess_helper() function to see if it can be made more efficient (maybe by treating cells individually in multi-cell solving)

cguzman95 commented 4 years ago

Hi @mattldawson - Most of the tests done, pushed in the last commit. RXN reactions passed to the new format. Before passing sub_models and/or others, some notes:

a) Pending implement the multi-cell update_data for the reactions that use it. As we accorded, I let this to you. The multi-cell check of unit_test_driver.f90 is commented on this commit to checking the other things works. b) The files of unit_rxn_data are not deleted for the moment, I let the last check before deleting them to you. c) mock_monarch reduced the number of cells to let test_monarch_2 finish d) Changed Cmake new unit test compile function "set_source_files_properties" per "target_compile_definitions". The previous function was rewriting the UNIT_TEST_MODULE and other variables producing error compilation when we compile more than one test. Now is working correctly e) Good amount of the tests has the same initial values of temp and pressure for all the unique states: This is because if not the tests (like troe test) fails, I guess because it needs to set to correct values that I unknow. I left to you this part (or I can do something like set temp1 to 300 and temp2 to 301, but I guess you will do better this part) f) MPI is not tested. When I try to compile with MPI it gives me a strange error in src/util.f90 on use mpi, as it can find the module (but I'm loading openmpi before so don't know what could be). I want to prioritize other things than trying to fix the MPI execution if you let me (issue #114 ). So I will appreciate it if you can check the MPI execution. g) A strange bug happens (as usual). aqueous_equilibrium_test is failing in execution with the error "ERROR (PartMC-883729396): Missing species F". The strange is that the previous species like "D" are found, and the JSON file is a copy-paste of the original. I inspected the file various times but couldn't find the error. Maybe you can find it faster since you know more about the functions that read and search the species. If it is also taking so much for you I can do more intense debug. Let me know if you can fix it, thanks. g2) A strange bug happens (x2). aqueous_equilibrium_test_2 is failing on compilation. I don't know if this is related to the previous error, but aqueous_equilibrium_test_1 compiles successfully, the other tests with _2 too, and CMAKE configuration seems correct, so no idea. h) condensed_arrhenius_test_2 is so slow, don't know why, so I reduce the number of time-steps (this is a minor bug so if it seems complicated I can write the issue and let it in the backlog). i) Only rxn reactions are passed to the new format (all working on make test except the aqueous). You mentioned that sub_model can be passed too, what about the aero_phase and aero_rep tests? Also, adapt the rxn tests takes so much, I think for the rest of the tests I can simplify some things to adapt them faster (like reducing the number of mentions to specific name functions (_1RA, _1RB, _1CB........) using a matrix of conc instead of an array or something like that). If you let me test these simplifications then nice, if you want to follow the structure meticulously then I can accept it (but I will translate the tests with more calm).

While you respond to this notes and deal with them, I will work on issue #114 . I have to prioritize now adapt the other reactions (the ones that are not test_cb05 reactions) or at least argue why are not adapted and let them be executed on the CPU meanwhile the others are executed on GPU

mattldawson commented 4 years ago

Hi @cguzman95 - sounds like you've been busy! I will try to go through the tests and address your notes. Once everything looks good, I'll do a pull request. I'm focused right now on getting things running in MONARCH, but if I'm waiting on an auto-submit run or something, I can start working on these.

cguzman95 commented 4 years ago

Nice. The only thing that can have pressure is to adapt the multi-cells update_rata, but it is secondary with respect to adapting CAMP in Monarch.

compdyn / partmc

Reconfigure unit tests #92