The Trouble with Counting

Jerry-licious commented 3 years ago

While entering the statistics for many cards, a few large problems and a few edge cases stand out: they are hard to estimate or to represent properly. The following is a list of such problems:

General Issues

Should there be a Shiv count, keeping track of the number of shivs the player generates throughout a deck cycle? If so, will the Shiv cards still be counted in the physical damage counter?
Should there be Orb counts, keeping track of the number of each type of orb that the player generates?
Should Plasma orbs be incorporated into the energy (1 plasma = 2 energy) counter? Or should it stay independent?
Should Stances be counted, keeping track of the number of times the player enters each stance in the first cycle?
Should Calm and Divinity (blasphemy to be specific) be incorporated into the energy (calm = 2 energy, divinity = 3 energy) counter? Or should they stay independent?
Should energy gain only be counted in a net energy sense? For example, sneaky strike gives the player 2 energy but costs 2 energy itself; should it be counted as giving 2 energy or giving 0?
- If not, should the net energy of each card be used instead of their cost when calculating things like the average deck cost?
Should next turn effects be counted in conservative estimates? Or are they deemed unreliable and not included?

High Variance Cards

Generally, cards can be described relatively easily with an upper and lower bound. However, the performance of some cards depends heavily on the deck they are in, and estimating their effectiveness becomes extremely difficult because they have no meaningful upper or lower bound. They include:

Body Slam - theoretical limit of 0-999
Catalyst - unbounded potential
Choke - heavily dependent on the number of cards the player is able to play.
Finisher - heavily dependent on the number of cards the player is able to play.
Thunder Strike and Blizzard - heavily dependent on the number of orbs the player channels.
Tantrum, Flurry of Blows and Weave - though consistent by themselves, these cards can be played multiple times in one cycle as they can retrieve themselves. Should their potential simply be multiplied in the optimistic estimate? Or are there other tricks we can use?
and many more to be discovered...

The high variance of these cards makes it very difficult to place a good limit on the upper bound of their potential. We have two main ways to handle this:

Find a representative limit for cards with relatively lower variance (such as a 4-tick choke).
Not account for them at all.

Jerry-licious commented 3 years ago

After a bit of discussion, I've decided that tantrum is typically equivalent to around two wrath sources, so the lower bound would be 1 and the upper bound would be 3. However, it's still unclear whether this should be included.

Jerry-licious commented 3 years ago

After thinking a bit more, there's also an issue with counting next turn effects and counting net energy. Specifically, the card flying knee refunds itself, but only next turn. This just complicates things to a whole new level.

I believe net energy should work like this: if a card explicitly gives you energy, and it is possible to come out ahead on energy by playing it, then the card is counted as gaining energy. However, if the card's energy gain merely offsets its own cost, as with sneaky strike and eviscerate, where the gain makes the card cheaper to play rather than granting a separate benefit, then it is not counted as gaining energy.

As for next turn shenanigans, I believe that if a card gains energy next turn, that energy can be put into something completely different, so it counts as gaining energy despite the earlier investment.
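A minimal sketch of the rule above, assuming each card is described by its cost, the energy it hands back, and whether that energy arrives next turn. The function name and parameters are hypothetical, and "possible to gain energy" is read here as "the same-turn gain exceeds the cost", which is only one interpretation.

```python
def counts_as_energy_gain(cost: int, energy_gained: int, next_turn: bool = False) -> bool:
    """Return True if a card should be tallied as an energy source.

    Assumption: a same-turn gain that does not exceed the card's cost is
    treated as a discount, not as energy gain.
    """
    if energy_gained <= 0:
        return False
    if next_turn:
        # Next-turn energy can be spent on something unrelated, so it counts
        # even if the card cost as much as it gives back.
        return True
    # Same-turn energy counts only when the card can come out ahead.
    return energy_gained > cost


# Sneaky strike: costs 2, gives 2 back this turn -> treated as a discount.
print(counts_as_energy_gain(cost=2, energy_gained=2))
# Flying knee: refunds itself, but only next turn -> counted as energy gain.
print(counts_as_energy_gain(cost=1, energy_gained=1, next_turn=True))
```

Cost reductions like eviscerate never reach this function, since they lower the cost rather than grant energy.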

casey-c commented 3 years ago

lots of good points here: i think we should track the shiv count in the backend but probably don't need to distinguish it from damage on the frontend. (i imagine the distinction will not be that necessary to seasoned players who would be interested in this mod anyway; for them i think they can reason about the number of shiv cards they have against something like gremlin nob or time eater or whatever without needing us to spell it completely out. rolling it into total damage seems fine to me).

i think it's more than okay to have an "exception" category, where estimates are just too difficult to come up with (i.e. too incredibly deck dependent like catalyst or body slam), and maybe put them in a special notes area or something where we single them out for "manual review". alternatively (or in addition to that idea), we could come up with an arbitrary formula for quantifying the value of these cards, e.g. assume catalyst is played exactly halfway through the first cycle (so total non catalyst poison / 2 to simulate the expected "value" of getting it in the middle of your draw pile). of course this also has some incredibly tricky problems (is this a conservative or optimistic guess? how do we calculate the value of 2, 3, 4+ catalysts?).

for energy i think i lean towards net energy (which also has a can of worms due to snecko eye) - i would put sneaky strike in a gain zero energy situation.

other cards that are incredibly tricky to quantify are things like burst or powers that scale (e.g. a noxious fumes calculation performed by the mod would be fantastically useful, but i think potentially difficult to reason about and code up due to a similar issue as catalyst). so yeah, i think for now a decent idea is just to have an exceptions category and push off handling the difficult cards for later.

casey-c commented 3 years ago

i wonder if it's worth putting another category in for these special cases. we've been brainstorming a conservative/optimistic range estimate - but i wonder if we should have a secondary sort of user configurable flag for these particular "unbounded" cards like catalyst/choke/etc. something like: keep the conservative/optimistic ranges for the bulk of the cards for which that makes sense, but for cards where it makes less sense (e.g. catalyst) have a way for users to choose in the config menu what sort of algorithm should be used for a particular card.

e.g. we could set it up to have in our config menu a dropdown menu for catalyst, with options like ("use total poison / 2", "use total poison", "use total poison / 4") with different distinct methods of calculations for a particular card. for cards like choke, these categories could be something like ("use average # playable cards - 1", "use average # playable cards - 2", "assume fixed 4", etc.). we could potentially come up with an enum that is user configurable on the fly so users can tweak it directly to their current deck in game (e.g. something like a button on the deck screen that can popup a config menu, sorta like what infomod1 did with the customizable potion indicator screen).

something like this (making special choices for these difficult cards) should be flexible enough to handle various ideas of handling the problem - but it is a LOT more work than probably worth approaching for the first release.
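To make the dropdown idea a bit more concrete, here is a rough sketch of per-card, user-selectable rules. The enum and function names are made up for illustration, and the options simply mirror the ones listed above; nothing here reflects how the mod actually stores its config.

```python
from enum import Enum


class CatalystRule(Enum):
    # Value is the fraction of total poison credited to Catalyst.
    TOTAL_POISON = 1.0
    HALF_POISON = 0.5
    QUARTER_POISON = 0.25


class ChokeRule(Enum):
    PLAYABLE_MINUS_ONE = "use average # playable cards - 1"
    PLAYABLE_MINUS_TWO = "use average # playable cards - 2"
    FIXED_FOUR = "assume fixed 4"


def catalyst_value(total_poison: float, rule: CatalystRule) -> float:
    """Poison credited to Catalyst under the chosen rule."""
    return total_poison * rule.value


def choke_ticks(avg_playable_cards: float, rule: ChokeRule) -> float:
    """Estimated number of Choke triggers under the chosen rule."""
    if rule is ChokeRule.FIXED_FOUR:
        return 4.0
    offset = 1 if rule is ChokeRule.PLAYABLE_MINUS_ONE else 2
    # Never report a negative number of triggers.
    return max(avg_playable_cards - offset, 0.0)


print(catalyst_value(20, CatalystRule.HALF_POISON))
print(choke_ticks(5, ChokeRule.PLAYABLE_MINUS_ONE))
```

Keeping the rules as enums means the config menu could list the members directly and persist only the selected name.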

Jerry-licious commented 3 years ago

This is certainly a good idea. Special cards can be isolated with a mark/tag so that they are excluded from the normal calculation and set aside for special handling. In addition, they can be associated with special properties such as a card multiplier or a poison multiplier, so they can still be accounted for without being overly needy. However, I still believe it is very difficult to provide a good estimate: even a good, deck-dependent calculation would be relatively inaccurate, so I would personally prefer to leave these calculations fully up to the player.

However, there are still some clever ways to provide even more accurate predictions. Remember how Infomod has a stats tab? The card plays per turn can be calculated directly and then used as a dynamic value to estimate the player's card plays.
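As a sketch of that dynamic value, assuming the mod can record how many cards were played on each past turn (the list passed in is hypothetical input, and how it would be collected is left open):

```python
from typing import Sequence


def average_plays_per_turn(plays_by_turn: Sequence[int]) -> float:
    """Mean number of cards played per turn over the recorded turns."""
    if not plays_by_turn:
        # No history yet, so there is nothing to estimate from.
        return 0.0
    return sum(plays_by_turn) / len(plays_by_turn)


# Example: three recorded turns with 4, 5 and 6 card plays.
print(average_plays_per_turn([4, 5, 6]))
```

The result could then stand in for the "number of cards the player is able to play" that cards like Choke and Finisher depend on.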

Honestly, I think there's even a chance of vastly simplifying the process of entering values with building up abstract syntax trees.

Though this is still very complicated work, so I'd halt here.

Jerry-licious commented 3 years ago

Did a bit of math with Tantrum in my spare time. If the player does not have any additional card draw, so that all cards are drawn in hand-sized chunks (5 cards, or 7 with snecko), then the upper bound of tantrum is approximately equal to the number of hands in the deck (the number of turns it takes to draw through the whole deck) minus one.
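A quick sketch of that bound, assuming no extra card draw and rounding a partial last hand up to a full turn. The function names are made up for illustration, and the result is only the approximate upper bound described above.

```python
import math


def hands_in_deck(deck_size: int, hand_size: int = 5) -> int:
    """Number of turns needed to draw through the whole deck once."""
    if hand_size <= 0:
        raise ValueError("hand_size must be positive")
    return math.ceil(deck_size / hand_size)


def tantrum_upper_bound(deck_size: int, hand_size: int = 5) -> int:
    """Approximate upper bound on Tantrum plays in one cycle: hands - 1."""
    return max(hands_in_deck(deck_size, hand_size) - 1, 0)


print(tantrum_upper_bound(20))               # 20-card deck, 5-card hands
print(tantrum_upper_bound(20, hand_size=7))  # same deck with snecko
```

For a 20-card deck this gives 3 with 5-card hands and 2 with 7-card hands.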

Jerry-licious commented 3 years ago

Another anomaly arises! Forethought - put one/any number of cards from your hand to the bottom of your draw pile, they cost 0 until played. The unupgraded version will be relatively easy to model; I can use essentially the same stats as setup:

{
    "draw": -1,
    "energy": {
        "conservative": 0,
        "optimistic": 3
    }
}

Costing 1 draw to provide up to 3 energy (and no, we are not accounting for shard into meteor strike, not a chance). However, the upgraded version gets really funny: assuming that forethoughting more cards is good, the optimistic estimate of draw is -9 and the conservative estimate of draw is 0.

{
    "draw": {
        "conservative": 0,
        "optimistic": -9
    },
    "energy": {
        "conservative": 0,
        "optimistic": 27
    }
}

This turns out to be a case of very large variance, so my current proposal for a saner estimate is to cut the number of cards people would forethought down to 4.

{
    "draw": {
        "conservative": 0,
        "optimistic": -4
    },
    "energy": {
        "conservative": 0,
        "optimistic": 12
    }
}

Making the numbers look less crazy.

Though this, again, is a card that heavily depends on its context, so maybe it's a good idea to just not care about it.
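For reference, the upgraded numbers above all follow one pattern: each forethoughted card costs 1 draw and saves up to 3 energy. A small sketch that generates the same structure for any cap (the function name and the `energy_per_card` parameter are hypothetical; the keys match the JSON above):

```python
import json


def upgraded_forethought_stats(max_cards: int, energy_per_card: int = 3) -> dict:
    """Build the conservative/optimistic stats for upgraded Forethought.

    Assumption: the conservative case forethoughts nothing, and the
    optimistic case forethoughts `max_cards` cards at full value.
    """
    return {
        "draw": {"conservative": 0, "optimistic": -max_cards},
        "energy": {"conservative": 0, "optimistic": max_cards * energy_per_card},
    }


# Capped at 4 cards: draw -4, energy 12, as in the proposal above.
print(json.dumps(upgraded_forethought_stats(4), indent=4))
```

Changing the cap then only means changing one argument instead of editing two numbers by hand.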

Jerry-licious commented 3 years ago

Another problem card that came up is mind blast. The formula itself is relatively easy (master deck size - master hand size), but it's a calculation that's a bit hard to handle, so the card will be ignored for now.
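For whenever this gets picked up again, the formula could look roughly like the sketch below. It assumes "master hand size" means the number of cards drawn per turn (5 by default, 7 with snecko), and the function name is made up.

```python
def mind_blast_estimate(master_deck_size: int, hand_size: int = 5) -> int:
    """Rough estimate: master deck size minus master hand size, floored at 0."""
    return max(master_deck_size - hand_size, 0)


print(mind_blast_estimate(30))               # 30-card deck, 5-card hand
print(mind_blast_estimate(30, hand_size=7))  # same deck with snecko
```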

casey-c / DeckSummary

The Trouble with Counting #2
