OasisLMF / ktools

In-memory simulation kernel for loss modelling.
BSD 3-Clause "New" or "Revised" License

pltcalc `-nan` in standard deviation when sample size = 1 #366

sambles closed this issue 9 months ago

sambles commented 9 months ago

Issue Description

The value `-nan` appears in the `standard_deviation` column of the PiWind expected output file `all_outputs/output/gul_S1_pltcalc.csv`:

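| type | summary_id | period_no | event_id | mean | standard_deviation | exposure_value | occ_year | occ_month | occ_day |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 349520.00 | 0.00 | 3400000.00 | 1 | 1 | 1 |
| 1 | 1 | 2 | 2 | 1331440.00 | 0.00 | 3400000.00 | 2 | 1 | 1 |
| 2 | 1 | 2 | 2 | 1421750.12 | -nan | 3400000.00 | 2 | 1 | 1 |
| 1 | 1 | 2 | 3 | 3400000.00 | 0.00 | 3400000.00 | 2 | 1 | 1 |
| 2 | 1 | 2 | 3 | 3400000.00 | -nan | 3400000.00 | 2 | 1 | 1 |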

Hassan Chagani: The issue lies in the standard deviation calculation: if `samplesize_` = 1, the calculation divides by 0.

https://github.com/OasisLMF/ktools/blob/9b2ccef55a5297b94d653c7add1215657a977ef9/src/pltcalc/pltcalc.cpp#L547
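
For reference, the failure can be reproduced on paper. This assumes pltcalc uses the usual sample (n - 1) estimator built from running sums; that is an assumption here, and the linked line is the authoritative source. With $n$ = `samplesize_` and sampled losses $x_i$:

$$
\sigma = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2}{n - 1}}
$$

For $n = 1$ the numerator is $x_1^2 - x_1^2 = 0$ and the denominator is $1 - 1 = 0$, so the expression is $0/0$. IEEE 754 floating-point arithmetic evaluates that to NaN instead of raising an error, which would be consistent with the `-nan` entries in the type 2 rows above.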

hchagani-oasislmf commented 9 months ago

After speaking with @johcarter, we should handle this in the same way as we do in eltcalc: if `samplesize_` = 1, the standard deviation is set to 0. This makes sense because, with only one sample, the mean is the loss value of that sample and there is no spread around it, so the standard deviation is 0.
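
A minimal sketch of the proposed guard, for illustration only. The function name `sample_std_dev` and its parameters are hypothetical and are not taken from pltcalc.cpp or eltcalc; the sketch also assumes the standard deviation is derived from a running sum of losses and a running sum of squared losses.

```cpp
#include <cmath>

// Hypothetical helper, not the actual ktools code.
// sum_loss:    sum of the sampled losses
// sum_loss_sq: sum of the squared sampled losses
// sample_size: number of samples (samplesize_ in the issue)
double sample_std_dev(double sum_loss, double sum_loss_sq, int sample_size) {
    // With one sample (or none) the n - 1 denominator would be zero or
    // negative, so return 0 instead of dividing.
    if (sample_size <= 1) {
        return 0.0;
    }
    double variance =
        (sum_loss_sq - (sum_loss * sum_loss) / sample_size) / (sample_size - 1);
    // Rounding can push a near-zero variance slightly negative; clamp it
    // so that sqrt does not return NaN.
    if (variance < 0.0) {
        variance = 0.0;
    }
    return std::sqrt(variance);
}
```

With this guard, a call such as `sample_std_dev(1421750.12, 1421750.12 * 1421750.12, 1)` returns 0 instead of NaN, while inputs with two or more samples go through the usual calculation.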