sawagh commented 5 years ago

A Director calling into Open Match may need statistics about the current player pool size for a specific profile etc., to determine what rate to make match generation calls at.

In general, no component in the online game services system should connect directly to redis. Any information that a component may fetch from redis, we need to implement as a first class Stats service.

Please use this issue to add any information that this stats service may need to provide.

Laremere commented 5 years ago

I see a few general approaches to a stats API:

Raw column extractor
Bucketing
Population division

Example data:

ticket ID: rank, name, latency
T1: 15, foo, 103ms
T2: 5, bar, 81ms
T3: 7, baz, 76ms

Raw Column Extractor

This is the most basic option, and is probably a good place to start. The other modes could be built on top of it. For some given fields in a ticket's properties, it should extract those values so they are usable. E.g., requesting rank and latency would yield: [(15, 103), (5, 81), (7, 76)]

That’s in “array of structs” format (see https://en.wikipedia.org/wiki/AOS_and_SOA). In “struct of arrays” format, it would be: ([15, 5, 7], [103, 81, 76])

Struct of arrays would be more efficient when transmitting (both in proto and json).
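To make the two layouts concrete, here is a minimal Python sketch run against the example data above. The helper names `extract_rows` and `extract_columns` are made up for illustration and are not part of any Open Match API.

```python
# Example tickets from this thread: ticket ID -> properties (latency in ms).
tickets = {
    "T1": {"rank": 15, "name": "foo", "latency": 103},
    "T2": {"rank": 5, "name": "bar", "latency": 81},
    "T3": {"rank": 7, "name": "baz", "latency": 76},
}


def extract_rows(tickets, attributes):
    """Array-of-structs layout: one tuple per ticket."""
    return [tuple(t[a] for a in attributes) for t in tickets.values()]


def extract_columns(tickets, attributes):
    """Struct-of-arrays layout: one list per requested attribute."""
    return {a: [t[a] for t in tickets.values()] for a in attributes}


rows = extract_rows(tickets, ["rank", "latency"])
columns = extract_columns(tickets, ["rank", "latency"])
print(rows)     # [(15, 103), (5, 81), (7, 76)]
print(columns)  # {'rank': [15, 5, 7], 'latency': [103, 81, 76]}
```

The column layout names each attribute once per response rather than once per ticket, which is where the transmission saving comes from.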

A rough idea for what the API could look like:

message StatsRequest {
  // Limits which tickets are included in the stats query.
  Pool pool = 1;

  // Which attributes should be returned.
  repeated string attributes = 2;
}

message StatsResponse {
  map<string, StatsColumn> columns = 1;
}

message StatsColumn {
  repeated double float_value = 1;
}

This is what is currently doable via redis by asking for all of the values in a given index (though values across multiple indexes aren’t supported).

How this should work with other datatypes is less clear. String properties which have lots of unique values (e.g., player name) are less likely to be useful, while those with repeated values (e.g., gamemode) are more likely to be useful. As such, sending the same value many times is likely to be inefficient.

Perhaps:

message StatsColumn {
  repeated double float_value = 1;
  repeated bool bool_value = 2;

  // The string value of the ith value is string_table[string_lookup[i]]
  repeated int64 string_lookup = 3;
  repeated string string_table = 4;
}
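The lookup/table idea is ordinary dictionary encoding. A rough Python sketch, assuming the table is built in first-seen order (the proposal does not specify an order) and using made-up game mode values:

```python
def encode_strings(values):
    """Dictionary-encode strings into (string_lookup, string_table)."""
    string_table = []   # each distinct string, stored once
    index_of = {}       # string -> its position in string_table
    string_lookup = []  # one table index per original value
    for value in values:
        if value not in index_of:
            index_of[value] = len(string_table)
            string_table.append(value)
        string_lookup.append(index_of[value])
    return string_lookup, string_table


def decode_strings(string_lookup, string_table):
    """The ith value is string_table[string_lookup[i]]."""
    return [string_table[i] for i in string_lookup]


# Hypothetical game mode column with many repeats.
modes = ["ranked", "casual", "ranked", "ranked", "casual"]
lookup, table = encode_strings(modes)
print(lookup)  # [0, 1, 0, 0, 1]
print(table)   # ['ranked', 'casual']
```

Each distinct string is sent once, so the saving grows with the number of repeats. For a column of mostly unique values (such as player names) this encoding would not help.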

Bucketing

Repeated values lead to the second possibility: requiring buckets, and providing counts of those buckets.

E.g., getting rank (in buckets of 10) and latency (in buckets of 25) would return:

(10 - 20, 100 - 125): 1
(0 - 10, 75 - 100): 2

Getting the right buckets would be hard: too many buckets, and most buckets contain only 1 player; too few buckets, and you don’t have enough info on how to split up your population.
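As a sketch of the bucketing mode, the following Python counts tickets per (rank bucket, latency bucket) for the example data. It assumes fixed-width, half-open buckets `[low, low + width)`, which the thread does not specify. The function names are illustrative only.

```python
from collections import Counter


def bucket(value, width):
    """Return the half-open bucket [low, low + width) containing value."""
    low = (value // width) * width
    return (low, low + width)


def bucket_counts(rows, widths):
    """Count rows per combination of buckets, one width per column."""
    counts = Counter()
    for row in rows:
        key = tuple(bucket(v, w) for v, w in zip(row, widths))
        counts[key] += 1
    return counts


# (rank, latency in ms) for T1, T2, T3 from the example data.
rows = [(15, 103), (5, 81), (7, 76)]
counts = bucket_counts(rows, widths=(10, 25))
for key, n in counts.items():
    print(key, n)
# ((10, 20), (100, 125)) 1
# ((0, 10), (75, 100)) 2
```

Only non-empty buckets appear in the result, so the response size is bounded by the number of tickets rather than by the number of possible buckets.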

Population division

So, given that, you could, for example, instead ask the system “What values should I use to divide my player population into buckets of 1000.”

This is a less complete idea, but also closer to what the end logic requires. It’s also unclear, at the API level, how a multi-dimensional set of users should be divided.
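For a single attribute, the division question reduces to picking cut points from the sorted values. A minimal Python sketch under that one-dimensional assumption follows. The ranks beyond the three example tickets are invented, and the multi-dimensional case stays open as noted above.

```python
def division_points(values, bucket_size):
    """Cut points that split values into consecutive groups of bucket_size.

    A value v falls in group k when cuts[k - 1] <= v < cuts[k]. The last
    group may be smaller, and ties at a cut point can unbalance groups.
    """
    ordered = sorted(values)
    return [ordered[i] for i in range(bucket_size, len(ordered), bucket_size)]


# Hypothetical ranks for eight tickets, divided into groups of three.
ranks = [15, 5, 7, 22, 9, 30, 12, 18]
cuts = division_points(ranks, bucket_size=3)
print(cuts)  # [12, 22]
```

With these cut points the groups are ranks below 12, ranks from 12 up to (but not including) 22, and ranks of 22 or more.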

Laremere commented 4 years ago

I haven't seen a serious request for this. We can re-evaluate once there are some requests which we can test proposals against.

lihaif commented 3 years ago

We have a similar requirement to get statistics, especially the number of tickets waiting for a match: https://github.com/googleforgames/open-match/issues/1312

syntxerror commented 2 years ago

Hi all, we've briefly discussed this issue and came up with a design we believe may implement this service. We talked about it during the last community call, and we will start building it out on a feature branch soon. We're looking for feedback on this low-level design while we compile some architecture diagrams.

Objective

To provide a service that gives insight into player pool statistics and extracts useful information from the distribution of the population, so that users can draw meaningful conclusions. These conclusions include, but are not limited to:

How long players are in queue/time to match
- The popularity of a game mode, maps, characters, etc.
Player distribution across profiles
Skill distribution
Player Matchmaking behavior
How long players are in queue before quitting
How long tickets are in a given step or the matchmaking process
How many tickets are in a given step
- E.g., how many tickets are waiting for allocation?

The service will be optional as some users may not care about this. It’ll interact with several services but this will not be a core service of Open Match unless we see a big requirement from the community.

The purpose of this service is to export details on tickets, since Open Match does not disclose ticket information; everything is stored in Redis. The service should not talk to Open Match but rather to Firestore, so that collecting metrics does not create performance issues in the matchmaking process. We believe that Open Match’s custom services would be a great way to avoid baking this service into Open Match.

Requirements

The service should serve users requesting stats from Open Match, based on the real-time data available in OM about the tickets generated in the system, which contain various factors such as rank, latency, and geo-region.

In-Scope

Profile Reports (Frequency to be determined by developers)
Ticket Lifecycle Reports
Real-Time Metrics
Universal Query Language that targets multiple datastores
- Model the “Where” clause as a gRPC service; it should be easy to express as a protobuf
- Wrappers around “Where” clause, etc.

Out-of-Scope

Interacting with Redis
- Any functionality that requires a connection to/use Redis should not be implemented within this service

Background

There was an issue written (pre v1.0.0) that started as a thought, and ideas were welcomed. The issue saw some early ideas but never gained any traction and was considered something the community didn’t have a need for. Fast forward three years, and there has been a request asking for how many tickets are in queue. This is similar to the first issue, which asked for a service that provided insight into how many players are in a pool for a given profile. Originally, this was proposed to allow match generation rates to be tailored for profiles that were seeing high or low demand. Now users want insight into player pool statistics and ticket lifecycle metrics. This would be a good starting point that could develop into a service that exports relevant data on player statistics and matchmaking patterns to help users make informed decisions on their matchmakers.

Design Ideas

This design likely will not require alterations to Open Match but may require the query service unless an alternative storage solution is used. The primary consideration (for the sample) is Cloud Firestore, for its scalability, real-time updates, and ability to run complex queries. Writes to Cloud Firestore would likely happen from the Game Frontend, Director, and potentially match function services. Updates to the Stats Service on tickets will include an extended field representing status (in-queue, pulled for matches, waiting for assignment, dequeued/cancelled, elapsed time, average time to match). Firestore queries based on criteria specified in profiles should yield similar results to Open Match queries to Redis.

The Stats Service will have an endpoint to handle requests that take ticket fields as query elements. As discussed in the community meeting, a wrapper around the “WHERE” clause would likely allow a protobuf message to be written that handles the various search field types (StringArg, DoubleArg, Tag, or custom) and an operation. Creating a universal query message will help this service work on various datastores. This service will query the second data store for status on tickets/pool size.
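As a rough illustration of the “WHERE” wrapper idea, here is a Python sketch of a datastore-agnostic clause evaluated against in-memory tickets. The `Where` class, the operator set, and the ticket fields are all assumptions made for this example. They are not an existing Open Match or Firestore API.

```python
import operator
from dataclasses import dataclass
from typing import Union

# Hypothetical operator set; a real message would enumerate these.
OPS = {
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


@dataclass
class Where:
    """One clause: ticket field, operation, and comparison value."""
    field: str
    op: str
    value: Union[str, float]


def matches(ticket, clauses):
    """A ticket matches when every clause holds (implicit AND)."""
    return all(
        c.field in ticket and OPS[c.op](ticket[c.field], c.value)
        for c in clauses
    )


def count_matching(tickets, clauses):
    """Pool size for the given clauses."""
    return sum(1 for t in tickets if matches(t, clauses))


tickets = [
    {"mode": "ranked", "rank": 15.0, "status": "in-queue"},
    {"mode": "ranked", "rank": 5.0, "status": "waiting for assignment"},
    {"mode": "casual", "rank": 7.0, "status": "in-queue"},
]
in_queue = count_matching(tickets, [Where("status", "==", "in-queue")])
ranked_high = count_matching(
    tickets, [Where("mode", "==", "ranked"), Where("rank", ">=", 10.0)]
)
print(in_queue, ranked_high)  # 2 1
```

A backend for a specific datastore would translate the same clause list into that store's native query instead of filtering in memory.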

Functionality to Consider:

How “real-time” should the service be?
- This’ll likely determine the data store that would be used.
How old can data be and still be considered in our collection?
- Customers will have the flexibility to extract info and build dashboards
How do we remove obsolete or old data in a timely manner?
- Cron jobs to delete tickets after an elapsed time?
How should data be structured? What does a common response look like?
- MapReduce on results to get expected response
- Example
How do we give users the ability to query a customized selection of data?
Separate, not duplicate data
- If we’re exporting data out to a new service, it shouldn’t rely on being in sync with the primary/main data source
We don’t want to modify Open Match, and we have points for gathering metrics at the beginning and end of the matchmaking process within custom services
- Game Frontend provides the creation step
- Director invokes ticket consideration and assignment steps.
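One way the lifecycle data gathered at these points could be shaped, as a Python sketch: custom services record status transitions with timestamps, and the stats side derives tickets-per-step and time-in-step from them. The `LifecycleLog` class and its in-memory storage are hypothetical stand-ins for the second datastore. The status names come from the list in Design Ideas.

```python
from collections import Counter, defaultdict


class LifecycleLog:
    """In-memory stand-in for the stats datastore (an assumption)."""

    def __init__(self):
        # ticket_id -> [(timestamp_seconds, status), ...] in arrival order
        self._events = defaultdict(list)

    def record(self, ticket_id, status, timestamp):
        """Called by the Game Frontend or Director on each transition."""
        self._events[ticket_id].append((timestamp, status))

    def tickets_per_status(self):
        """How many tickets are currently in each step."""
        return Counter(events[-1][1] for events in self._events.values())

    def time_in_status(self, ticket_id):
        """Seconds spent in each completed step for one ticket."""
        events = self._events.get(ticket_id, [])
        durations = defaultdict(float)
        for (start, status), (end, _) in zip(events, events[1:]):
            durations[status] += end - start
        return dict(durations)


log = LifecycleLog()
log.record("T1", "in-queue", 0.0)                  # Game Frontend
log.record("T1", "waiting for assignment", 12.5)   # Director
log.record("T2", "in-queue", 3.0)                  # Game Frontend
print(dict(log.tickets_per_status()))
# {'waiting for assignment': 1, 'in-queue': 1}
print(log.time_in_status("T1"))  # {'in-queue': 12.5}
```

Because the log is written only by custom services, it stays separate from Open Match's own Redis state.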

Implement Stats Service in Open Match to provide Player Pool Statistics #683