Global Stats: Improving transparency and basic usage

ajrdesign commented 6 years ago

https://www.reddit.com/r/Guildwars2/comments/844a6f/sc_benchmarks_vs_gw2_raidar/

There's a lot of feedback here about how global stats can be misleading. Obviously a lot of it is user error, but my job is to make sure those errors are minimized. I think there are a couple of problems and some potential solutions:

Problem 1: All/All filter is too macro to be useful.

Raids are obviously our primary use case, with fractals being secondary and benchmarking with golems being tertiary. I think a default experience of simply All Raids/Current Era would help, as well as having "All Fractals" and "All Golems" options to see those at a high level.

Question: Should we exclude CM from "All Raids"? I've never done CMs, so I'm not sure how much they'd skew the data. Additionally, should certain encounters be excluded entirely? This is done a lot on Warcraft Logs when particular fights are being cheesed to drive numbers up or allow an insane amount of padding. The one case where this might be applicable is KC, because of the huge damage boost power gets there.

Problem 2: Comparing small sample sizes to large at the same level.

Popularity is a very rough gauge of "sample size", but it's not very effective and not very transparent. In addition to popularity, I'd like to work in a small 4-tier system that I can indicate on the UI (tbd).

Tier 1: Large sample size, can be trusted to be an accurate representation of the archetype's capabilities.
Tier 2: Medium sample size, trustworthy but maybe still slightly inaccurate.
Tier 3: Low sample size, maybe just has the minimum 10 logs required to be seen on Global Stats.
Tier 4: Too few logs to even be visible on Global Stats.
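As a rough sketch of how the four tiers could map onto log counts: only the minimum of 10 logs comes from this thread. The Tier 1 and Tier 2 cut-offs and the function name `trust_tier` are placeholders for illustration, since the real thresholds still need to be picked from actual upload data.

```python
# Minimum log count for an archetype to appear on Global Stats (from the thread).
MIN_VISIBLE_LOGS = 10
# Hypothetical cut-offs: the thread has not decided these values yet.
TIER_2_MIN_LOGS = 100
TIER_1_MIN_LOGS = 1000


def trust_tier(log_count: int) -> int:
    """Return the trust tier for a log count (1 = most trusted, 4 = hidden)."""
    if log_count < 0:
        raise ValueError("log_count must be non-negative")
    if log_count >= TIER_1_MIN_LOGS:
        return 1
    if log_count >= TIER_2_MIN_LOGS:
        return 2
    if log_count >= MIN_VISIBLE_LOGS:
        return 3
    return 4
```

Keeping the thresholds as named constants would make it easy to retune them once the average log counts per era are known.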

Question: As of now I don't know what kind of log counts current archetypes have beyond >10. It'd be nice to look at the average number of logs that are being uploaded in an era for some of the various archetypes to see where we want to place Tier 1 & 2.

merforga commented 6 years ago

RE: Problem 1

I think the high level we should go for is one ALL view per content type, i.e. ALL RAIDS | ALL FRACTALS | ALL GOLEM. This should alleviate most of the issues with the current ALL ALL Global Stats. While only slightly more useful, it will allow for a view of overall capability in raids. I don't believe we have ALL stats per wing, so that's not something we can include.

In terms of CM, they don't really differ much DPS-wise from the normal runs. The only one that is vastly different would be Dhuum, since the current Dhuum CM meta is to stack condi as opposed to power for normal. However, I don't think it's much of an issue for ALL stats, as the number of successful CM runs compared to normal runs in each block of data is minuscule.

Even with the KC damage boost, most groups aren't going to break higher than on other normal boss runs, since the burst is confined to small 20-second windows over the entire fight; the overall DPS, which is what we use, is normalised in that sense. Current weaver stats aren't too out of line boss to boss imo.

RE: Problem 2

I think a simpler solution is to just include a count alongside popularity.

Toeofdoom commented 6 years ago

For a front page, I've been wondering whether we really want to show dps directly at all. It's a few things like:

Is popularity more important here? I.e. what are the most common power/support/condi classes?
If we want a dps ranking at all, we may want to "normalize" across encounters. I.e. weaver is first at 8 bosses, so it gets 80 hidden "points" (or an average by encounter, or a geometric average, or whatever). The intent is just to say which classes are generally the highest dps.
Should we move towards a completely different overview indicating what bosses a class is "good" at?
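The hidden "points" idea could be sketched roughly like this. The thread only says that first place at 8 bosses yields 80 points, so the 10/9/8... scale for lower ranks, the function name `rank_points` and the input layout are all assumptions, and the dps figures in the example are made up.

```python
from collections import defaultdict
from typing import Dict


def rank_points(dps_by_encounter: Dict[str, Dict[str, float]],
                top_points: int = 10) -> Dict[str, int]:
    """Sum rank-based points per class across encounters.

    dps_by_encounter maps encounter name -> {class name: dps}.
    First place earns top_points, second earns one less, and so on,
    never dropping below zero (hypothetical scale).
    """
    totals: Dict[str, int] = defaultdict(int)
    for dps_by_class in dps_by_encounter.values():
        # Highest dps first; ties keep their input order.
        ranked = sorted(dps_by_class, key=dps_by_class.get, reverse=True)
        for position, class_name in enumerate(ranked):
            totals[class_name] += max(top_points - position, 0)
    return dict(totals)


# Made-up numbers: weaver is first at all 8 bosses.
example = {
    f"Boss {i}": {"Weaver": 15000.0, "Holosmith": 14000.0}
    for i in range(1, 9)
}
points = rank_points(example)  # Weaver ends up with 8 * 10 = 80 points
```

A scheme like this deliberately throws away the size of the dps gap and keeps only the ordering, which is the trade-off to weigh against an average-based approach.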

merforga commented 6 years ago

That's a good idea. I'm not too convinced that normalizing stats using arbitrary rules would be a good idea, though. Potentially a list of top classes per boss by popularity / dps? I.e. a boss-centric view to maintain consistency.
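A boss-centric list could look something like the sketch below. The nested-dict layout, the field names `dps` and `popularity`, and the function name `top_classes_per_boss` are assumptions for illustration and do not reflect how the stats are actually stored.

```python
from typing import Dict, List

Stats = Dict[str, Dict[str, Dict[str, float]]]


def top_classes_per_boss(stats: Stats, key: str = "dps",
                         limit: int = 3) -> Dict[str, List[str]]:
    """Return the top `limit` class names per boss, ordered by `key`.

    stats maps boss -> class name -> {"dps": ..., "popularity": ...}.
    """
    if key not in ("dps", "popularity"):
        raise ValueError("key must be 'dps' or 'popularity'")
    return {
        boss: sorted(classes, key=lambda name: classes[name][key],
                     reverse=True)[:limit]
        for boss, classes in stats.items()
    }
```

Because each boss is ranked on its own numbers, no cross-encounter normalization rule is needed for this view.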

Toeofdoom commented 6 years ago

If we go forward with "normalizing" damage, yeah, we'd need something that actually makes sense based on how people use the stats and avoid being arbitrary. For example we could say... The average dps for power at fight X is 12k, weaver is 15k, so weaver is 25% above average and use that as our marker if we think that would work for our users. So the page might say:
Weaver: overall 9% above average, popularity 0.8/5000 logs
Holosmith: overall 2% above average, popularity 0.7/4400 logs
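The "percent above average" marker from the example works out as follows. The per-fight formula matches the 12k/15k case above. Combining fights with a plain unweighted mean is an assumption, since the thread does not say how the overall figure would be aggregated, and both function names are placeholders.

```python
from typing import Iterable, Tuple


def percent_above_average(class_dps: float, average_dps: float) -> float:
    """Percentage by which class_dps exceeds average_dps (negative if below)."""
    if average_dps <= 0:
        raise ValueError("average_dps must be positive")
    return (class_dps - average_dps) / average_dps * 100.0


def overall_percent_above_average(
        per_fight: Iterable[Tuple[float, float]]) -> float:
    """Unweighted mean of per-fight percentages (assumed aggregation rule).

    per_fight yields (class_dps, average_dps) pairs, one per fight.
    """
    values = [percent_above_average(c, a) for c, a in per_fight]
    if not values:
        raise ValueError("per_fight must contain at least one fight")
    return sum(values) / len(values)


# The thread's example: average power dps 12k, weaver 15k.
weaver_fight_x = percent_above_average(15000, 12000)  # 25.0
```

Weighting each fight by its log count instead of using a plain mean would be another reasonable choice, and is the kind of rule that needs to be agreed on before it is shown to users.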

Toeofdoom commented 6 years ago

To be clear, I'm not saying we have to do that, just that if we do, a sensible rule is critical.

ajrdesign commented 6 years ago

"I think a simpler solution is to just include a count along side popularity."

Simpler for us, not necessarily simpler for users. Statistics are hard, and analyzing another number on top of the dozens of numbers we are already showing is asking a lot when we can do something simple to show what data can be trusted and what cannot. I'd imagine that for most of our users determining what is a good sample size isn't really a fun task, which is what we are asking them to do if we simply throw a log count at them.

This is my proposed design to help solve this:

I am in favor of including the log count in there (as a tooltip), but I think this makes the task of "What data should I trust?" a lot easier than only looking at a log count. We would just need to determine how many logs we'd deem each tier to be, which shouldn't be hard if we can somehow look at the overall data for log counts across archetypes.

ajrdesign commented 6 years ago

I'll consider this addressed until we see more feedback.
