gadget-framework / gadget3

TMB-based gadget implemtation
GNU General Public License v2.0
8 stars 6 forks source link

aggregate data before array-ifying in g3l_likelihood_data #120

Open lentinj opened 10 months ago

lentinj commented 10 months ago

If the observation table handed to g3l_likelihood_data has repeated rows (i.e. 2 rows with the same year/step/area/whatever combination), the end result is a complete mess.

Add an aggregation step before the merge to prevent this from happening. The aggregate_fn could be either the default of function (x) { if (length(x) > 1) stop("More than one value for year/step/wotsit combination") ; return(x) } or mean, sum if you know this is what you want to do.

bthe commented 10 months ago

I think it would be best to simply return an error as this typically is the result of an aggregation error.

lentinj commented 10 months ago

Yeah, that was my argument too, and what the default aggregate_fn above would do. This came up because @willbutler42 was suggesting that he wanted it to sum together the repeated entries.

I still feel like this is better done with an explicit step in your own code. However, if we want to detect repeated rows then it's trivial for this machinery to also sum them instead.

bthe commented 10 months ago

Yes, after having a chat with @willbutler42 I understood better where this issue came from. I think it would be good to catch duplicate time steps as I think it is a fairly common (user) error, but what should be the default behaviour I have no strong opinions.