Bai-Li-NOAA / Age_Structured_Stock_Assessment_Model_Comparison

Other
0 stars 6 forks source link

feat(generate_case): Adds code for C1 and C2 #9

Closed kellijohnson-NOAA closed 2 years ago

kellijohnson-NOAA commented 2 years ago

Uses generate_case() in README.md instead of huge block of code.

Creates two TODO's for @Bai-Li-NOAA. One is a documentation and another is checking that generating 100 sims instead of 120 and throwing 20 away is okay to do in C1.

Failed to implement lexical scoping :( I am so sad about this. I tried to populate my workspace with all of the needed objects inside of generate_case() but when I would call save_initial_input() it would ALWAYS use the .GlobalEnv rather than the local environment of generate_case(). The way the list is created now does not facilitate checking that all of the needed list members are present. So, if save_initial_input() were to add a list item generate_case() would fail to ensure that the new item is present. I am happy to change things here if someone has a different way of coding this. I honestly think that a good option here would just be to populate the list in save_initial_input() with the values from case 1 or any other case and then always change those using base_case = FALSE. This option would clean up the code base because it is not the best practice to use global variables inside of a function.

Before this moves from draft to an actual pull request there are additional calls in example and inside some functions that might be able to use generate_case() instead of starting from scratch.

kellijohnson-NOAA commented 2 years ago

Also, we need to decide what to include as "data objects". For example would results from C1 with run_om() be read back into R and stored as an object so users do not have to run anything but have access to the results? We could do this but just store one iteration or maybe 10 to do the plotting? @Bai-Li-NOAA I am not sure how big these data objects would be or if you even want them stored.

Bai-Li-NOAA commented 2 years ago

I agree that it is not the best practice to use global variables inside of a function. The fleet_num bug is a good example. Populating the list in save_initial_input() with the values from a case and then always changing those using base_case = FALSE sounds good.

Generating 100 sims instead of 120 is okay to do in C1, but it is easier to start with a bigger number because the EM sometimes has some convergence issues for some of the iterations. This way, we can make sure we have 100 sets of converged results in the end.

Currently, the results from the last iteration of C1 are stored in R and results from all iterations are stored locally and it is about 50KB per iteration. I think we can use the last iteration of C1 for plotting and not read the rest of the results in the R environment. We can use run_em() to load OM results and run estimation models.

kellijohnson-NOAA commented 2 years ago

run_OM() is super fast. So, I think setting the default of om_sim_num = 120 would be fine.

Going forward I will

Questions for @Bai-Li-NOAA

  1. When saving the data object do you want more than om_input and om_output saved?
  2. Should I save them as om_input_C1_100 and om_output_C1_100 or would you prefer shorter names?
Bai-Li-NOAA commented 2 years ago
  1. I think we need om_input , om_output, and em_input.
  2. om_input_C1_100 and om_output_C1_100 works fine with me

I will work on