printfn / fend

Arbitrary-precision unit-aware calculator
https://printfn.github.io/fend
MIT License
587 stars 50 forks source link

Allow obtaining the mean of a distribution #285

Closed haykam821 closed 3 months ago

haykam821 commented 3 months ago

Fixes #284

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 91.66667% with 3 lines in your changes are missing coverage. Please review.

Project coverage is 82.98%. Comparing base (5e04cb9) to head (4580c3a). Report is 7 commits behind head on main.

Files Patch % Lines
core/src/num/dist.rs 86.66% 2 Missing :warning:
core/src/error.rs 0.00% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #285 +/- ## ========================================== + Coverage 82.89% 82.98% +0.09% ========================================== Files 52 52 Lines 14947 15000 +53 ========================================== + Hits 12390 12448 +58 + Misses 2557 2552 -5 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

haykam821 commented 3 months ago

I'm not sure how the 'there must be at least one part in a dist' error can be produced; I assume it's a failsafe prevented by parsing (e.g. d0 is invalid). I assume I can ignore the missing coverage as a result.

bgkillas commented 3 months ago

probably too annoying to implement for the little performance difference in most scenarios(aspecially because of how poor the performance of dice is to begin with) but to get the mean for dice it can just be (dice.sum()+dice.len())/2 via some simple logic about how the distribution is in the center always, and wouldn't mode/median be the same as mean? unless there are non dice distributions then it makes sense

haykam821 commented 3 months ago

I realize that I implemented the mean calculation incorrectly; the current formula should work for all currently supported distributions, but a skewed distribution will have an incorrect mean due to the formula not accounting for probabilities.

The correct calculation is more like the following, with no division at the end:

result = Exact::new(k, true)
    .mul(&Exact::new(Complex::from(Real::from(v)), true), int)?
    .add(result, int)?;

The big rational to real to complex to exact conversion is a bit verbose, so if there's a more direct way I can do this, I can use that method instead.

printfn commented 3 months ago

Thanks for the PR! It is actually already possible to generate skewed distributions with fend, for example:

> d6 / d2
0.5:  8.33%  ###############
  1: 16.67%  ##############################
1.5:  8.33%  ###############
  2: 16.67%  ##############################
2.5:  8.33%  ###############
  3: 16.67%  ##############################
  4:  8.33%  ###############
  5:  8.33%  ###############
  6:  8.33%  ###############
> mean(d6 / d2)
approx. 2.8333333333
> 0.5*1/12 + 1*2/12 + 1.5*1/12 + 2*2/12 + 2.5*1/12 + 3*2/12 + 4/12 + 5/12 + 6/12
2.625

Would you be able to fix this bug and add a test case for it?

I'm not sure how the 'there must be at least one part in a dist' error can be produced; I assume it's a failsafe prevented by parsing (e.g. d0 is invalid). I assume I can ignore the missing coverage as a result.

The code coverage checks are just to remind people to write the occasional test. There's no need to worry about small areas of missed coverage like that.

The big rational to real to complex to exact conversion is a bit verbose, so if there's a more direct way I can do this, I can use that method instead.

You're right, there's no shorter way to do that conversion atm unfortunately.

haykam821 commented 3 months ago

Thank you! I had a feeling there was something I missed that would allow for skewed distributions. In that case, I've committed the fix along with test cases.