sizespectrum / mizerExperimental

Extends the mizer package with experimental features
https://sizespectrum.org/mizerExperimental/

Improving getYieldVsF #43

Closed baldrech closed 2 years ago

baldrech commented 2 years ago

This pull request fixes #32

It introduces three new functions:

Furthermore, getYieldVsF now tries, wherever possible, to start each new simulation from the closest available steady state, so that the runs are shorter.

I am still working on refining the different use cases, on proper documentation and testing, and on cleaning up the code, but I also have a few questions for you, Gustav, in the meantime:

gustavdelius commented 2 years ago

This'll be so useful! Thanks @baldrech

To answer your questions:

Species don't go extinct in the NS_params object even when being fished disproportionately. The current threshold in getMaxF is set to stop fishing when the biomass exposed to fishing mortality drops below 10% of its initial value. However, we can set up any kind of threshold or even make it user-defined. Should we keep this threshold or use something else?

That is a very good observation. It shows that the North Sea model is not set up well, which we have known for some time but we just haven't gotten around to replacing it with a better-calibrated model. Species that are fished at sizes smaller than their maturity size should crash if the fishing mortality is increased. However if the gear only starts selecting at sizes above maturity size, then the species will not go extinct. So perhaps we should also stop increasing F once the biomass does not change much any more.

I don't think we know what the best values for thresholds are, so for now at least they need to stay user-configurable.
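The two stopping rules discussed here (biomass falling below a fraction of its initial value, and biomass no longer changing much) can be sketched in base R. This is only an illustration under stated assumptions: `find_max_F`, `collapse_frac` and `plateau_tol` are hypothetical names, the biomass curves are made up, and none of this is the actual getMaxF code.

```r
# Hypothetical helper, not part of mizer or mizerExperimental.
# `F_values` is an increasing vector of fishing mortalities and `biomass`
# the steady-state biomass of the target species at each of those values.
find_max_F <- function(F_values, biomass,
                       collapse_frac = 0.1, plateau_tol = 0.01) {
  stopifnot(length(F_values) == length(biomass), length(F_values) >= 2)
  for (i in 2:length(F_values)) {
    # Rule 1: biomass has fallen below a fraction of its initial value.
    if (biomass[i] < collapse_frac * biomass[1]) {
      return(list(F_max = F_values[i], reason = "collapse"))
    }
    # Rule 2: biomass barely changes between consecutive values of F.
    rel_change <- abs(biomass[i] - biomass[i - 1]) / biomass[i - 1]
    if (rel_change < plateau_tol) {
      return(list(F_max = F_values[i], reason = "plateau"))
    }
  }
  list(F_max = F_values[length(F_values)], reason = "end of range")
}

F_grid <- seq(0, 2, by = 0.25)
# Made-up curve for a species fished below maturity size: keeps falling.
crashing <- 100 * exp(-2 * F_grid)
# Made-up curve for a species fished only above maturity size: levels off.
levelling <- 40 + 60 * exp(-5 * F_grid)

find_max_F(F_grid, crashing)$reason
find_max_F(F_grid, levelling)$reason
```

Keeping both `collapse_frac` and `plateau_tol` as arguments reflects the point that the best threshold values are not known and should stay user-configurable.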

distanceSSLogYield is based on the other "distance" functions, but I used a different criterion than the SSE: it detects when the proportional change is less than 1%. Happy with that or do you prefer SSE? I also hard-coded the tol argument that sets the "1%" within getYieldVsF.

If one looks for example at

plotYieldVsF(params, "Haddock", no_steps = 40)

[Image: Rplot, the plot produced by the call above]

one gets the impression that during the initial sweep the runs were stopped too early. So I think you need to experiment a bit with the stopping criterion. There is the problem that it takes time before the drop in recruitment created by fishing on the spawning stock makes itself felt in a further reduction of large fish, in particular for slow-growing species. So perhaps my idea to use the yield as a stopping criterion was not so clever after all.
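To make the difference between the two candidate stopping criteria concrete, here is a small base-R sketch. Both functions are illustrative stand-ins written for this comparison, with made-up yields; they are not the implementation of distanceSSLogYield or of any other distance function in the package.

```r
# Illustrative stand-ins only. `current` and `previous` are the yields of
# each species at two successive checks during a run (made-up numbers).

# SSE-style criterion: sum of squared differences of the log yields.
dist_sse_log <- function(current, previous) {
  sum((log(current) - log(previous))^2)
}

# Proportional-change criterion: largest relative change across species.
dist_max_rel_change <- function(current, previous) {
  max(abs(current - previous) / previous)
}

previous <- c(cod = 100, haddock = 50, sprat = 10)
current  <- c(cod = 100.5, haddock = 50.2, sprat = 10.3)

tol <- 0.01
dist_sse_log(current, previous) < tol         # would this run stop?
dist_max_rel_change(current, previous) < tol  # and under this criterion?
```

With the same numerical tolerance the two measures can disagree, because the SSE squares small differences while the proportional criterion reacts to the single largest change. Neither addresses the delayed effect of reduced recruitment on large fish described above, since both only compare two successive states.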

distanceSSLogYield currently calculates the yield of all species. Because of this, calculating the steady state with zero fishing effort takes the most time: releasing the fishing mortality causes the largest fluctuations in the ecosystem, so the yields of the other species take some time to settle. Should I leave it like this or just focus on the yield of the targeted species?

I would focus on the targeted species, but not necessarily on its yield, see previous comment. I do not understand why specifically fishing mortality of 0 takes the longest time.

The goal of the pull request was at first to make getYieldVsF faster, but now that it calculates the maximum effort and species won't die, it needs many more simulations. Fortunately, rearranging how the yields are calculated compensates for the additional runs, so it takes approximately the same time as before. Because of this I left a case where the user can set F_range (but not F_max, which is going to be supplied within F_range anyway). Happy with the arrangement?

I think it would be nice to keep F_max.

baldrech commented 2 years ago

I added a maxFthreshold function within getMaxF to be able to customize the maximum fisheries threshold for the function. Furthermore, the arguments distance_func, tol, max_func and threshold_var have been added to plotYieldVsF so that the user can customize the distance function and max threshold from there. I put F_max back too so getMaxF will be used only if both F_max and F_range are missing.
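The fallback described above, where the maximum is only searched for when neither F_max nor F_range is given, could look roughly like the following. This is a sketch under assumptions: `choose_F_values` and `estimate_max_F` are hypothetical names, `estimate_max_F` is a dummy standing in for getMaxF, F_range is assumed to be a vector of fishing mortalities, and letting F_range win when both arguments are supplied is my assumption rather than something stated in this thread.

```r
# Dummy stand-in for getMaxF; the real function runs simulations.
estimate_max_F <- function() 1.5

# Hypothetical helper showing the order in which the arguments are used.
choose_F_values <- function(no_steps = 10, F_max, F_range) {
  # Assumption: an explicit vector of F values is used as given.
  if (!missing(F_range)) {
    return(F_range)
  }
  # Only search for the maximum when the user supplied neither argument.
  if (missing(F_max)) {
    F_max <- estimate_max_F()
  }
  seq(0, F_max, length.out = no_steps)
}

choose_F_values(no_steps = 4)               # falls back to estimate_max_F()
choose_F_values(no_steps = 3, F_max = 1)    # uses the supplied F_max
choose_F_values(F_range = c(0.2, 0.4))      # uses the supplied values
```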

At the moment, the yields calculated by getMaxF and the ones added afterwards differ because they are projected to steady state from different starting points. Since each simulation stops before it is perfectly stable (because we want it to be fast), the differences accumulate, and they are largest at the highest effort. Using distancelogN and tol = 0.01 minimises the differences but takes longer because the simulations need more time to become stable.
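The effect can be reproduced with a toy model in base R. It is not a mizer simulation: a single quantity relaxes towards a fixed equilibrium, and the run stops once the relative change per step drops below `tol`. All names and numbers are made up for the illustration.

```r
# Toy model: `x` closes a fixed fraction `rate` of its gap to `target` at
# each step, and the run stops when the relative change falls below `tol`.
run_until_stable <- function(start, target = 10, rate = 0.1, tol = 0.01,
                             max_steps = 10000) {
  x <- start
  for (step in seq_len(max_steps)) {
    x_new <- x + rate * (target - x)
    converged <- abs(x_new - x) / abs(x) < tol
    x <- x_new
    if (converged) break
  }
  c(value = x, steps = step)
}

# Same equilibrium, two different starting points, loose tolerance.
loose_below <- run_until_stable(start = 5)
loose_above <- run_until_stable(start = 20)
# The same two runs with a much tighter tolerance.
tight_below <- run_until_stable(start = 5, tol = 1e-4)
tight_above <- run_until_stable(start = 20, tol = 1e-4)

# Disagreement between the two starting points for each tolerance.
loose_gap <- unname(abs(loose_above["value"] - loose_below["value"]))
tight_gap <- unname(abs(tight_above["value"] - tight_below["value"]))
c(loose_gap = loose_gap, tight_gap = tight_gap)
c(loose_steps = unname(loose_below["steps"]),
  tight_steps = unname(tight_below["steps"]))
```

Runs that stop early approach the equilibrium from opposite sides and so disagree. Tightening the tolerance shrinks the disagreement at the cost of more steps.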

So we have this trade-off of speed versus smoothness. To solve it we can: