markbrockettrobson / python_dice

a statistical dice library for python

Any desire to use `dyce` under the hood? #53

Closed posita closed 3 years ago

posita commented 3 years ago

Howdy! I've been working on dyce, which is a pure Python library that started out as an AnyDice replacement (mainly focused on counting and enumeration). I have recently started thinking about objects and APIs that could handle roll generation, math, tracing, etc. so that library authors like yourself could focus on developing specialized grammars.

I think dyce's counting primitives could probably stand in for substantial portions of your probability_distribution subpackage. It might require some refactoring, but you might be able to lean on them quite effectively if you had any desire to do so (i.e., if you wanted to focus more on expanding the grammar and less on the math around finite discrete probabilities). I took a stab at seeing if they could be used to translate some of your examples.

If you're willing, I'd love feedback on why (or why not) dyce might be useful to you. I'd be happy to collaborate more closely, provide explanations, advice, or any other additional information if it would be helpful.

If this doesn't interest you, please feel free to close this without any judgment from me. In any event, it's nice to see what you're working on, and thanks for your time and attention!

markbrockettrobson commented 3 years ago

Thanks for the message. I will take a look at your suggestion over the weekend.

markbrockettrobson commented 3 years ago

Sorry, I did not get the time to look into this last weekend. I will book out some time in my calendar this weekend.

posita commented 3 years ago

Oh, gosh, no worries. I wouldn't describe this as urgent. I'd love any consideration or feedback you would be willing to provide, of course, but please only treat this as important if you want to.

markbrockettrobson commented 3 years ago

I had a look, and I think it will be hard to add the functionality of linked dice using your code. Thanks for pointing this out to me.

posita commented 3 years ago

Thanks for taking a look! What are "linked dice" in your world? I'm curious to see whether I could support that feature.

markbrockettrobson commented 2 years ago

Sorry about the delay.

Consider the program:

VAR a = 1d2
VAR b = (1d2 * a) + a

Note that "a" must have the same value in both places it appears in b. In this case I would say that the value of the first "a" is linked to the second "a".

This problem gets more complicated when the value of a can change. Think about this:

VAR a = 1d20
VAR b = a + 1d4
VAR a = a - 2d4
VAR c = b + a

In this case the value of c depends twice on the original value of a.

Correct me if I am mistaken, but I don't think dyce can hold a probability distribution with constraints like this across calls.
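To make the distinction concrete, here is a minimal brute-force sketch in plain Python (standard library only, not python_dice or dyce code; the function names are made up for illustration). It enumerates the first program twice: once with both uses of a tied to a single roll, and once with each use rolled separately.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

D2 = range(1, 3)  # faces of a d2


def normalize(counts):
    """Turn raw outcome counts into exact probabilities."""
    total = sum(counts.values())
    return {outcome: Fraction(n, total) for outcome, n in sorted(counts.items())}


def linked_b():
    """b = (1d2 * a) + a, with both uses of a tied to the same roll."""
    return normalize(Counter(d * a + a for a, d in product(D2, D2)))


def unlinked_b():
    """Same expression, but each use of a is rolled independently."""
    return normalize(Counter(d * a1 + a2 for a1, d, a2 in product(D2, D2, D2)))


print(linked_b())    # only 2, 3, 4 and 6 occur, each equally likely
print(unlinked_b())  # 5 becomes possible, which the linked version never produces
```

The two results differ, which is why a library that only combines independent distributions needs some extra mechanism to express the linked case.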

posita commented 2 years ago

Ah! I see what you're saying now. No, dyce does not yet have a(n easy) mechanism to define/describe dependent variables. (UPDATE: dyce does have a mechanism for dependent probabilities. See below.) A use case arose for me recently (independent of this discussion) that got me thinking about how I might add them.

Just so I understand details, can your second example be re-characterized as follows?

VAR a1 = 1d20
VAR b = a1 + 1d4
VAR a2 = a1 - 2d4
VAR c = b + a2

Or would it be interpreted in some other way?

markbrockettrobson commented 2 years ago

That is more or less what it will do for you. I did it that way to make room for loops later.
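Read that way, the reassignment is just a rename, and every later use of a1 still refers to the one d20 roll. A rough standard-library check of the renamed program (plain Python, not either library; the variable names simply mirror a1, b, a2 and c):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

d4 = range(1, 5)
d20 = range(1, 21)

counts = Counter()
# One d20 roll (a1) feeds both b and a2; the three d4 rolls are independent.
for a1, b_roll, x, y in product(d20, d4, d4, d4):
    b = a1 + b_roll
    a2 = a1 - (x + y)  # a1 - 2d4
    counts[b + a2] += 1

total = sum(counts.values())  # 20 * 4 * 4 * 4 equally likely combinations
c = {outcome: Fraction(n, total) for outcome, n in sorted(counts.items())}
print(min(c), max(c))
print(c[min(c)], c[max(c)])
```

Because a1 appears in both b and a2, c is effectively 2 * a1 plus independent d4 terms, which is what stretches its range well beyond that of a single d20.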

posita commented 2 years ago

While your first example can probably be reduced to VAR b = (1d2 + 1) * a, which dyce can represent as (H(2) + 1) * H(2), I take your point that dyce cannot (yet) easily represent something like VAR b = a * a + a + 1. (UPDATE: Nope. It can. I don't know why I didn't think of this. See below.)

FWIW, the use case I alluded to above was Ironsworn's core mechanic (a draft summary of which—including dyce's crude attempt at calculating probabilities around it—can be found here).

python_dice handles Ironsworn's basic case quite well:

>>> from python_dice import PythonDiceInterpreter
>>> PythonDiceInterpreter().get_probability_distributions_dict([
...   "VAR action_d = d6",
...   "VAR mod = 0",
...   "((action_d + mod) > 1d10) + ((action_d + mod) > 1d10)",
... ])["stdout"]
{0: 0.5916666666666667, 2: 0.09166666666666666, 1: 0.31666666666666665}

Very cool! 👏
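For a library-independent cross-check of that result, a small brute-force enumeration in plain Python works too. This is only a sketch with illustrative names (ironsworn_hits is not part of python_dice or dyce), with mod defaulting to 0 as above:

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def ironsworn_hits(mod=0):
    """Count how many of two d10 challenge dice a single d6 + mod action roll beats."""
    counts = Counter()
    for action, first, second in product(range(1, 7), range(1, 11), range(1, 11)):
        score = action + mod  # the same action roll is compared against both d10s
        counts[(score > first) + (score > second)] += 1
    total = sum(counts.values())
    return {hits: Fraction(n, total) for hits, n in sorted(counts.items())}


for hits, prob in ironsworn_hits().items():
    print(hits, prob, float(prob))
```

Keeping the probabilities as fractions makes it easy to compare against either a float-based or a count-based representation.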

markbrockettrobson commented 2 years ago

Nice to see we get the same answer.

markbrockettrobson commented 2 years ago

It could be worth adding some A/B tests using your system.
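One possible shape for such a cross-check, sketched in plain Python: normalize a count-based histogram (the kind dyce's H prints) and compare it with a python_dice-style probability dict within a floating point tolerance. The helper name and the tolerance are assumptions for illustration, not part of either library.

```python
import math


def same_distribution(probabilities, counts, rel_tol=1e-9):
    """Return True if a {outcome: probability} dict matches a {outcome: count} dict."""
    if set(probabilities) != set(counts):
        return False
    total = sum(counts.values())
    return all(
        math.isclose(probabilities[outcome], counts[outcome] / total, rel_tol=rel_tol)
        for outcome in counts
    )


# Values copied from the Ironsworn results quoted in this thread.
python_dice_result = {0: 0.5916666666666667, 2: 0.09166666666666666, 1: 0.31666666666666665}
dyce_counts = {0: 71, 1: 38, 2: 11}
print(same_distribution(python_dice_result, dyce_counts))
```

A test suite could feed the same program to both libraries and assert that this helper returns True for each variable of interest.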

posita commented 2 years ago

Ah! You know what? I lied. dyce does have a dependent probability mechanism. I don't know why I didn't think of it before. It covers the use cases mentioned here.

Your simple case:

>>> from python_dice import PythonDiceInterpreter
>>> PythonDiceInterpreter().get_probability_distributions_dict([
... "VAR a = 1d2",
... "VAR b = (1d2 * a) + a",
... ])["b"]
{2: 0.25, 4: 0.25, 3: 0.25, 6: 0.25}

>>> from dyce import H
>>> a = H(2)
>>> b = a.substitute(lambda __, a_outcome: H(2) * a_outcome + a_outcome) ; b
H({2: 1, 3: 1, 4: 1, 6: 1})
>>> print(b.format())
avg |    3.75
std |    1.48
var |    2.19
  2 |  25.00% |############
  3 |  25.00% |############
  4 |  25.00% |############
  6 |  25.00% |############

Your more complicated case:

>>> PythonDiceInterpreter().get_probability_distributions_dict([
... "VAR a = 1d20",
... "VAR b = a + 1d4",
... "VAR a = a - 2d4",
... "VAR c = b + a",
... ])["c"]
{1: 0.02265625,
 0: 0.01953125,
 -1: 0.01484375,
 -2: 0.01015625,
 -3: 0.00546875,
 -4: 0.00234375,
 -5: 0.00078125,
 2: 0.02421875,
 3: 0.025,
 4: 0.025,
…
 33: 0.025,
 34: 0.025,
 35: 0.02421875,
 36: 0.02265625,
 37: 0.01953125,
 38: 0.01484375,
 39: 0.01015625,
 40: 0.00546875,
 41: 0.00234375,
 42: 0.00078125}

>>> a = H(20)
>>> c = a.substitute(lambda __, a_outcome: (a_outcome + H(4)) + (a_outcome - 2@H(4))) ; c
H({-5: 1, -4: 3, -3: 7, -2: 13, -1: 19, 0: 25, 1: 29, 2: 31, 3: 32, 4: 32, …, 33: 32, 34: 32, 35: 31, 36: 29, 37: 25, 38: 19, 39: 13, 40: 7, 41: 3, 42: 1})
>>> print(c.format(scaled=True))
avg |   18.50
std |   11.69
var |  136.75
 -5 |   0.08% |#
 -4 |   0.23% |####
 -3 |   0.55% |##########
 -2 |   1.02% |####################
 -1 |   1.48% |#############################
  0 |   1.95% |#######################################
  1 |   2.27% |#############################################
  2 |   2.42% |################################################
  3 |   2.50% |##################################################
  4 |   2.50% |##################################################
 …
 33 |   2.50% |##################################################
 34 |   2.50% |##################################################
 35 |   2.42% |################################################
 36 |   2.27% |#############################################
 37 |   1.95% |#######################################
 38 |   1.48% |#############################
 39 |   1.02% |####################
 40 |   0.55% |##########
 41 |   0.23% |####
 42 |   0.08% |#

A version of my polynomial-like case:

>>> PythonDiceInterpreter().get_probability_distributions_dict([
... "VAR a = d6",
... "a * a * d6 + a * d6 + d6",
... ])["stdout"]
{3: 0.0007716049382716049,
 4: 0.0023148148148148147,
 5: 0.004629629629629629,
 6: 0.007716049382716049,
 7: 0.012345679012345678,
 8: 0.016975308641975308,
 9: 0.020833333333333332,
 10: 0.022376543209876542,
 11: 0.023919753086419752,
 12: 0.022376543209876542,
 13: 0.020833333333333332,
 14: 0.016203703703703703,
 15: 0.013888888888888888,
 16: 0.011574074074074073,
 17: 0.010030864197530864,
 18: 0.008487654320987654,
…
 255: 0.0007716049382716049,
 256: 0.0007716049382716049,
 257: 0.0007716049382716049,
 258: 0.0007716049382716049}

>>> a = H(6)
>>> a.substitute(lambda __, a_outcome: a_outcome * a_outcome * H(6) + a_outcome * H(6) + H(6))
H({3: 1, 4: 3, 5: 6, 6: 10, 7: 16, 8: 22, 9: 27, 10: 29, 11: 31, 12: 29, 13: 27, 14: 21, 15: 18, 16: 15, 17: 13, 18: 11, …, 255: 1, 256: 1, 257: 1, 258: 1})
>>> print(_.format(scaled=True))
avg |   68.83
std |   59.30
var | 3515.94
  3 |   0.08% |#
  4 |   0.23% |####
  5 |   0.46% |#########
  6 |   0.77% |################
  7 |   1.23% |#########################
  8 |   1.70% |###################################
  9 |   2.08% |###########################################
 10 |   2.24% |##############################################
 11 |   2.39% |##################################################
 12 |   2.24% |##############################################
 13 |   2.08% |###########################################
 14 |   1.62% |#################################
 15 |   1.39% |#############################
 16 |   1.16% |########################
 17 |   1.00% |####################
 18 |   0.85% |#################
…
 255 |   0.08% |#
 256 |   0.08% |#
 257 |   0.08% |#
 258 |   0.08% |#

The Ironsworn mechanic:

>>> PythonDiceInterpreter().get_probability_distributions_dict([
...   "VAR action_d = d6",
...   "VAR mod = 0",
...   "((action_d + mod) > 1d10) + ((action_d + mod) > 1d10)",
... ])["stdout"]
{0: 0.5916666666666667, 2: 0.09166666666666666, 1: 0.31666666666666665}

>>> action_d = H(6)
>>> mod = 0
>>> action_d.substitute(lambda __, action_d_outcome: 2@(H(10).lt(action_d_outcome + mod)))
H({0: 71, 1: 38, 2: 11})
>>> print(_.format())
avg |    0.50
std |    0.66
var |    0.43
  0 |  59.17% |#############################
  1 |  31.67% |###############
  2 |   9.17% |####

That being said, there very well may be some things python_dice can do with dependent probabilities that would be more cumbersome to represent in dyce, but the implementations may be closer than I originally remembered!

One thing that jumps out in the above examples is the readability of a dedicated grammar. python_dice's syntax hardly needs any explaining to see what's going on, even if you're not familiar with it. dyce's substitution syntax is not very intuitive, either to read or write. The Ironsworn case above is illustrative. It's definitely not as easy to see what's going on in the dyce version.

I'm pretty sure dyce is (currently) more computationally efficient, though:

In [1]: from python_dice import PythonDiceInterpreter

In [2]: from dyce import H

In [3]: interpreter = PythonDiceInterpreter() ; program = [
   ...: "VAR a = 1d20",
   ...: "VAR b = a + 1d4",
   ...: "VAR a = a - 2d4",
   ...: "VAR c = b + a",
   ...: ]

In [4]: %timeit interpreter.get_probability_distributions_dict(program)
2.34 s ± 320 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: a = H(20)

In [6]: def sub_c(__, a_outcome):
   ...:     b = a_outcome + H(4)
   ...:     a = a_outcome - 2@H(4)
   ...:     return b + a

In [7]: %timeit a.substitute(sub_c)
8.93 ms ± 404 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [8]: program = [
   ...: "VAR a = d6",
   ...: "VAR b = d6",
   ...: "((b + a) * d6) + ((b - a) * d6)",
   ...: ]

In [9]: %timeit interpreter.get_probability_distributions_dict(program)
5.5 s ± 181 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [10]: a = H(6) ; b = H(6)

In [11]: def sub_a(__, a_outcome):
    ...:     def sub_b(__, b_outcome):
    ...:         return (b_outcome + a_outcome) * H(6) + (b_outcome - a_outcome) * H(6)
    ...:     return b.substitute(sub_b)

In [12]: %timeit a.substitute(sub_a)
15 ms ± 458 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

posita commented 2 years ago

You have (perhaps inadvertently) inspired me to implement dyce's (currently experimental) resolve_dependent_probability function which (as of 0.4.4) is now dyce's third option for resolving dependent probabilities! 🎉 Here are translations of some examples from above:

VAR a = 1d2 ; VAR b = (1d2 * a) + a
(single independent term, single dependent term)
```python
>>> # Translation of
>>> #   VAR a = 1d2
>>> #   VAR b = (1d2 * a) + a

>>> from dyce import H
>>> from dyce.h import resolve_dependent_probability
>>> d2 = H(2)
>>> a = d2  # independent term
>>> def b(a_outcome):  # dependent term
...     return d2 * a_outcome + a_outcome

>>> # Option 1 - explicit
>>> def explicit():
...     for a_outcome, a_count in a.items():
...         term = b(a_outcome)
...         for term_outcome, term_count in term.items():
...             yield term_outcome, term_count * a_count
>>> H(explicit())
H({2: 1, 3: 1, 4: 1, 6: 1})

>>> # Option 2 - nested substitution
>>> def sub_term(__, a_outcome):
...     return b(a_outcome)
>>> a.substitute(sub_term)
H({2: 1, 3: 1, 4: 1, 6: 1})

>>> # Option 3 - resolve_dependent_probability
>>> resolve_dependent_probability(b, a_outcome=a)
H({2: 1, 3: 1, 4: 1, 6: 1})
```

VAR a = 1d20 ; VAR b = a + 1d4 ; VAR a = a - 2d4 ; VAR c = b + a
(single independent term, multiple dependent terms)
```python
>>> # Translation of
>>> #   VAR a = 1d20
>>> #   VAR b = a + 1d4
>>> #   VAR a = a - 2d4
>>> #   VAR c = b + a

>>> from dyce import H
>>> from dyce.h import resolve_dependent_probability
>>> d4 = H(4)
>>> d20 = H(20)
>>> a = d20  # independent term
>>> def b(a_outcome):  # dependent term
...     return a_outcome + d4
>>> def a1(a_outcome):  # dependent term
...     return a_outcome - 2 @ d4
>>> def c(a_outcome):  # dependent term
...     return b(a_outcome) + a1(a_outcome)

>>> # Option 1 - explicit
>>> def explicit():
...     for a_outcome, a_count in a.items():
...         term = c(a_outcome)
...         for term_outcome, term_count in term.items():
...             yield term_outcome, term_count * a_count
>>> option1 = H(explicit())
>>> option1
H({-5: 1, -4: 3, -3: 7, -2: 13, ..., 39: 13, 40: 7, 41: 3, 42: 1})

>>> # Option 2 - nested substitution
>>> def sub_term(__, a_outcome):
...     return c(a_outcome)
>>> option2 = a.substitute(sub_term)
>>> option2 == option1
True

>>> # Option 3 - resolve_dependent_probability
>>> option3 = resolve_dependent_probability(c, a_outcome=a)
>>> option3 == option2
True
```

Ironsworn solo mechanic translated to python_dice
(multiple independent terms, single dependent term)
```python
>>> # Translation of
>>> #   VAR action = d6
>>> #   VAR first_challenge = d10
>>> #   VAR second_challenge = d10
>>> #   VAR iron_solo = (action > first_challenge) + (action > second_challenge) + ((action > first_challenge) AND (first_challenge == second_challenge)) - ((action <= first_challenge) AND (first_challenge == second_challenge))
>>> # ... which takes about 35s on my machine to arrive at ...
>>> #   {-1: 0.075,
>>> #    0: 0.5166666666666667,
>>> #    3: 0.025,
>>> #    1: 0.31666666666666665,
>>> #    2: 0.06666666666666667}

>>> from dyce import H
>>> from dyce.h import resolve_dependent_probability
>>> d6 = H(6)
>>> d10 = H(10)
>>> action_d = d6  # independent term
>>> first_challenge_d = d10  # independent term
>>> second_challenge_d = d10  # independent term
>>> def iron_solo(action, first_challenge, second_challenge):  # dependent term
...     return (action > first_challenge) + (action > second_challenge) + ((action > first_challenge) & (first_challenge == second_challenge)) - ((action <= first_challenge) & (first_challenge == second_challenge))

>>> # Option 1 - explicit
>>> def explicit():
...     for action_outcome, action_count in action_d.items():
...         for first_challenge_outcome, first_challenge_count in first_challenge_d.items():
...             for second_challenge_outcome, second_challenge_count in second_challenge_d.items():
...                 yield (
...                     iron_solo(action_outcome, first_challenge_outcome, second_challenge_outcome),
...                     action_count * first_challenge_count * second_challenge_count,
...                 )
>>> option1 = H(explicit())
>>> option1.lowest_terms()
H({-1: 9, 0: 62, 1: 38, 2: 8, 3: 3})
>>> print(option1.format())
avg |    0.45
std |    0.83
var |    0.68
 -1 |   7.50% |###
  0 |  51.67% |#########################
  1 |  31.67% |###############
  2 |   6.67% |###
  3 |   2.50% |#

>>> # Option 2 - nested substitution
>>> def sub_action(__, action_outcome):
...     def sub_first_challege(__, first_challenge_outcome):
...         def sub_second_challege(__, second_challenge_outcome):
...             return iron_solo(action_outcome, first_challenge_outcome, second_challenge_outcome)
...         return second_challenge_d.substitute(sub_second_challege)
...     return first_challenge_d.substitute(sub_first_challege)
>>> option2 = action_d.substitute(sub_action)
>>> option2 == option1
True

>>> # Option 3 - resolve_dependent_probability
>>> option3 = resolve_dependent_probability(
...     iron_solo,
...     action=action_d,
...     first_challenge=first_challenge_d,
...     second_challenge=second_challenge_d,
... )
>>> option3 == option2
True
```

No worries if these aren't directly useful to you. I can imagine it might be some work to determine which VARs were dependent vs. independent in such a way that you could translate that to a call to resolve_dependent_probability. If you could, though, it looks like deferring the computations to dyce could save you a bundle on ~~car insurance~~ computation times.
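To illustrate the general idea behind that kind of call, here is a standard-library sketch. It is not dyce's implementation, and resolve_dependent and die are made-up names: enumerate every combination of the independent terms, evaluate the dependent term on the concrete outcomes, and weight each result by the product of the counts.

```python
from collections import Counter
from itertools import product


def resolve_dependent(dependent_term, **independent_terms):
    """Combine {outcome: count} histograms by calling dependent_term on each combination."""
    names = list(independent_terms)
    result = Counter()
    for combo in product(*(independent_terms[name].items() for name in names)):
        outcomes = {name: outcome for name, (outcome, _) in zip(names, combo)}
        weight = 1
        for _, count in combo:
            weight *= count
        result[dependent_term(**outcomes)] += weight
    return dict(sorted(result.items()))


def die(faces):
    """Histogram for a fair die numbered 1..faces."""
    return {face: 1 for face in range(1, faces + 1)}


def iron_solo(action, first_challenge, second_challenge):
    """Same expression as the dependent term in the Ironsworn solo example."""
    match = first_challenge == second_challenge
    return (
        (action > first_challenge)
        + (action > second_challenge)
        + ((action > first_challenge) & match)
        - ((action <= first_challenge) & match)
    )


print(resolve_dependent(iron_solo, action=die(6), first_challenge=die(10), second_challenge=die(10)))
```

The counts are left unreduced here, so they sum to 600 (6 * 10 * 10) and would need dividing by their greatest common divisor to compare with a lowest-terms histogram.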

In any event, thanks for the discussion and the inspiration! 🙇