[RFC] RLMM Analytics

orrybaram commented 3 weeks ago

Summary

This is a proposal to add an event tracking system to MEP. It will help us gather data on player actions such as level completions, fails, and time spent. The goal is to better understand how players use the maps we build, so that we as map makers can improve our projects based on real user behavior.

Why Do This?

Right now, aside from catching a streamer on Twitch or finding a YouTube video, we have zero visibility into how users interact with the product. With event tracking, we can:

See how often a map is played.
Understand where users struggle.
See which features get used the most.
Use data to drive level-design decisions.

How It Will Work

Two commands will be added to the mepcommand interface:

`mepcommand trackinit <API_KEY> <PLAYER_ID>`

`mepcommand track <EVENT_JSON>`

We’ll add a new class, EventTracker, to MEP to facilitate tracking. It will have two public methods:

initialize(api_key: string, user_id: string): takes an api_key and user_id and saves them as private variables to be used in future track events.

track(event: string): takes a stringified JSON object containing event data and sends a request to an API.

// Examples
// These events will be defined by the map maker, although we can provide defaults such as `Map Loaded`
{
    "event_name": "Map loaded",
    "event_data": {}
}
{
    "event_name": "Level Complete",
    "event_data": {
        "level": 1,
        "timestamp": 12324124
    }
}
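A minimal sketch of how a map-side helper might assemble the `track` command from an event object. The `TrackEvent` type and the `buildTrackCommand` helper are hypothetical names. Only the `mepcommand track <EVENT_JSON>` form and the `event_name` / `event_data` fields come from the proposal.

```typescript
// Hypothetical shape mirroring the example events in this proposal.
interface TrackEvent {
    event_name: string;
    event_data: { [key: string]: string | number | boolean };
}

// Assumption: the plugin accepts the JSON as a single argument after `track`.
function buildTrackCommand(trackEvent: TrackEvent): string {
    return "mepcommand track " + JSON.stringify(trackEvent);
}

const levelComplete: TrackEvent = {
    event_name: "Level Complete",
    event_data: { level: 1, timestamp: 12324124 },
};
const exampleCommand = buildTrackCommand(levelComplete);
```

Serializing with a JSON library instead of concatenating strings keeps quotes and special characters in event data escaped correctly.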

Mock implementation

#include <iostream>
#include <string>
#include <curl/curl.h>

class EventTracker {
private:
    std::string api_key;
    std::string user_id;

public:
    // Initialize the tracker with API key and user ID
    void initialize(const std::string& apiKey, const std::string& userId) {
        api_key = apiKey;
        user_id = userId;
    }

    // Track an event by sending a POST request with event data
    bool track(const std::string& eventData) {
        CURL* curl;
        CURLcode res;

        curl = curl_easy_init(); // Initialize libcurl
        if(curl) {
            // API endpoint for tracking events
            const std::string api_url = "https://your-api-url.com/track";

            // Construct the full JSON payload including event data
            std::string payload = "{\"event_data\":" + eventData + ", \"user_id\":\"" + user_id + "\"}";

            // Set the target URL and payload
            curl_easy_setopt(curl, CURLOPT_URL, api_url.c_str());
            curl_easy_setopt(curl, CURLOPT_POSTFIELDS, payload.c_str());

            // Set the headers (including the API key in the Authorization header)
            struct curl_slist* headers = NULL;
            headers = curl_slist_append(headers, ("Authorization: Bearer " + api_key).c_str());
            headers = curl_slist_append(headers, "Content-Type: application/json");
            curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);

            // Perform the request
            res = curl_easy_perform(curl);

            // Free the header list and the handle on every path (the list was previously leaked)
            curl_slist_free_all(headers);
            curl_easy_cleanup(curl);

            // Check for errors
            if(res != CURLE_OK) {
                std::cerr << "curl_easy_perform() failed: " << curl_easy_strerror(res) << std::endl;
                return false;
            }
            return true;
        }
        return false;
    }
};

Security Considerations

API Key: We will have an interface for map makers to generate an API key for specific maps so they can correlate data to each of their projects. Because of the nature of map making, these keys are unfortunately not entirely secure. This might be ok though, since the attack vector is somewhat small and having fake event data is not the end of the world. Would love ideas on how we can potentially beef up security here without compromising ease of use.

Async & Batching: We might consider batching events to reduce network load.

Rate Limits: To prevent spam, we’ll limit the number of events a user can send in a given time window in the API layer.
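A rough sketch of what per-user rate limiting in the API layer could look like, using a fixed-window counter. The class name, the limit, and the window size are all illustrative assumptions. The proposal does not specify an algorithm or any numbers.

```typescript
// Hypothetical fixed-window rate limiter keyed by user ID.
// The limit and window size are illustrative, not values from the proposal.
class FixedWindowRateLimiter {
    // Object.create(null) avoids clashes with inherited keys such as "constructor".
    private windows: { [userId: string]: { start: number; count: number } } = Object.create(null);
    private maxEvents: number;
    private windowMs: number;

    constructor(maxEvents: number, windowMs: number) {
        this.maxEvents = maxEvents;
        this.windowMs = windowMs;
    }

    // Returns true if the event is accepted, false if the user is over the limit.
    allow(userId: string, nowMs: number): boolean {
        const current = this.windows[userId];
        if (current === undefined || nowMs - current.start >= this.windowMs) {
            // First event from this user, or the previous window has expired.
            this.windows[userId] = { start: nowMs, count: 1 };
            return true;
        }
        if (current.count < this.maxEvents) {
            current.count += 1;
            return true;
        }
        return false;
    }
}

const exampleLimiter = new FixedWindowRateLimiter(2, 1000);
const limiterResults = [
    exampleLimiter.allow("player-1", 0),
    exampleLimiter.allow("player-1", 100),
    exampleLimiter.allow("player-1", 200),  // third event inside the window is rejected
    exampleLimiter.allow("player-1", 1000), // a new window has started
];
```

Time is passed in explicitly so the behavior is easy to test. A real API layer would more likely rely on the rate limiting offered by its hosting platform.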

Next Steps

Create an API to handle track events, store event data, and allow users to generate API keys.
Build and integrate the SDK into the project.
Provide documentation on how to set up analytics in a map.

blaku-rl commented 3 weeks ago

I'm a huge fan of adding this in. It would be a massive boon for map makers.

I do have a question on the user_id value that will track individuals. Would this be something the plugin would automatically grab for the user or would a user have to supply their own "key"? I can get a user's Epic ID from bakkesmod and pass that along as the user_id. The only concern I have with doing it this way is that anybody can look up another user's Epic ID and send false data for them. This is a very minor concern for the scope of this for sure.

The alternative would be for them to also generate a key as a user and supply that to the plugin. That would make this an opt-in feature for the users and decrease the overall usage, but they would be giving their explicit consent to tracking analytics. We could also provide some data visualizations for them as an incentive to create an account.

As far as the API key for a map, I really don't see any other way of providing it. I think the best that can be done is to obscure it with flash. If a map maker is able to regenerate a key for a map, they could update it if the API is being flooded with junk data. Having a userid that is managed by the API would also allow for tracking users that are potentially misusing the API.

Overall this will be a great enhancement for MEP.

orrybaram commented 3 weeks ago

> Would this be something the plugin would automatically grab for the user or would a user have to supply their own "key"?

My initial assumption was that the map maker would pull it via kismet, but if we can get it directly in MEP that would make things a lot easier. I hadn't initially considered that as a possibility but that's definitely the best way to go IMO. The main reasons for having the user_id would be to correlate events and then optionally to allow the user to access their data via some sort of dashboard.

> The alternative would be for them to also generate a key as a user and supply that to the plugin

I like this from a data integrity and ethical standpoint, but I agree that this is a big barrier and will greatly reduce the amount of useful data we would get back. I do think it would be our responsibility as map makers to make this opt-out though. I think it would be a good idea to put together an actionscript package that acts as an SDK and also contains a data disclosure confirmation modal. I already have the basis for a package that contains MEP bindings for saving/loading data that has a modal component and controller/keyboard support that could easily be extended to add this in as well.

> If a map maker is able to regenerate a key for a map, they could update it if the API is being flooded with junk data

Yeah this was my thinking as well. I think since we're in such a niche category and the data doesn't have any PII it's not a huge security concern. Aside from junk data another attack vector would be DDOS but that could be mitigated with rate limiting and key rotations (if necessary)

ghostrider-05 commented 3 weeks ago

Security

> The only concern I have with doing it this way, is that anybody can look up another users epic id and send false data for them. This is a very minor concern for the scope of this for sure.

I don't think there will be a lot of users going into UDK / trace network requests just to send junk data, but it is a good idea to use flash for this to make it harder to get the API key.

I think the easiest method is to have a global key for all maps in flash and use a project name, author name and current map version (also use the match GUID?) to sign requests and send the signature in the header. The more secure way is to generate a key for each map and use the same data to create a signature + public key associated with the key. Map makers can register keys easily then in a portal such that the server can verify the signatures and allows them to create a new key for a new version of the map, although it might be easier to only configure stuff in kismet for map makers.

I suggest using the second method, which, if I read the other comments correctly, is also what they describe.

> I do think it would be our responsibility as map makers to make this opt-out though. I think it would be a good idea to put together an actionscript package that acts as an SDK and also contains a data disclosure confirmation modal.

+1 for an opt-out method. Some way of informing players that the map maker is collecting statistics, and what can be collected, is a good idea. Maybe we should also have a method to create a "delete all my playtime statistics" for current / all maps? Could also be nice for users to have a place to see their own statistics on workshop maps?

> Aside from junk data another attack vector would be DDOS but that could be mitigated with rate limiting and key rotations (if necessary)

I don't think it is needed, and rate limiting might be hard to use since we have no idea how many events we will get. Especially when a popular map has just been released, you don't want to limit real events.

Events

> [!NOTE]
> I'm not including WRs or anything speedrun related to this

The Steam API has fields for tracking playtime, but for some reason it is always empty. And since we can implement more details, we can also communicate to sponsors how many people have seen and played the map.

interface SteamWorkshopFile {
     lifetime_playtime: string;
     lifetime_playtime_sessions: string;
}

To reduce the amount of events, maybe we can collect all session data into one event. I assume that we can also only use events when you're the host of the map (for multiplayer maps only) or do we want to have every player send events?

interface PlaytimeSession {
    match_id: string;
    // Are there other types of playing workshop maps?
    match_type: 'training' | 'local' | 'steam'
    map_version: string;
    players: {
        // To track unique players and get playtime stats using the start and end time of the match / player
        id: string;
        platform: 'steam' | 'epic'
        is_host?: boolean;
        // If not host and the player joins late / leaves early
        join_timestamp?: string;
        leave_timestamp?: string;
        // Add player stats?
        statistics: {}
    }[]
    events: {
        // For custom events, what the event is
        name: string;
        // For game events, an enum of the event
        id: number
        // 'game': for default game events, such as map loaded, match ended, more??
        // Can also include goal scored and more if the map maker wants to track that 
        // 'tracked': for custom events, such as checkpoint reached, resets, etc. 
        type: 'tracked' | 'game'
        timestamp: string;
        // Custom parameters from kismet / data is used in game events
        params: {
            [name: string]: any
        }
    }[]
}

Then we can collect events using the proposed MEP command and send everything at the end of the match. Maybe stringified objects are not the easiest to use in kismet and we can transform it on the API / plugin side.
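As an illustration of how the optional join and leave timestamps in the draft `PlaytimeSession` could feed playtime stats, here is a small sketch. It assumes ISO 8601 timestamp strings. The `matchStart` / `matchEnd` parameters and the `playtimeSeconds` helper are hypothetical, since the draft interface has no match-level times.

```typescript
// Subset of the draft PlaytimeSession player entry used by this sketch.
interface SessionPlayer {
    id: string;
    join_timestamp?: string;
    leave_timestamp?: string;
}

// Assumption: timestamps are ISO 8601 strings. matchStart and matchEnd are
// hypothetical inputs used when a player has no join or leave timestamp.
function playtimeSeconds(player: SessionPlayer, matchStart: string, matchEnd: string): number {
    const start = Date.parse(player.join_timestamp !== undefined ? player.join_timestamp : matchStart);
    const end = Date.parse(player.leave_timestamp !== undefined ? player.leave_timestamp : matchEnd);
    // Guard against unparsable or inverted timestamps.
    if (isNaN(start) || isNaN(end) || end < start) {
        return 0;
    }
    return (end - start) / 1000;
}

const hostPlayer: SessionPlayer = { id: "host" };
const lateJoiner: SessionPlayer = { id: "guest", join_timestamp: "2024-11-01T12:05:00Z" };
const hostSeconds = playtimeSeconds(hostPlayer, "2024-11-01T12:00:00Z", "2024-11-01T12:10:00Z");
const guestSeconds = playtimeSeconds(lateJoiner, "2024-11-01T12:00:00Z", "2024-11-01T12:10:00Z");
```

Here the host is credited with the full ten minutes and the late joiner with five.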

API

I can ask for a subdomain to have an endpoint like https://analytics.rocketleaguemapmaking.com/{track endpoint} and create a repo for the API on the rlmm org to collaborate. I don't know if anyone already made something, but I'm open to help with the API. I can also host it on my Cloudflare account to enable DDOS protection and have access to some paid platform features while likely costing nothing.

Endpoints

If an endpoint is only for the map maker / user, I suggest using Epic OAuth to have the user log in to the portal.

POST /api/sessions: tracks a new session with a PlaytimeSession body.
POST /api/maps: registers a new map. Only for the map maker.
GET /api/maps/:mapid/sessions: gets all sessions for a map, paginated by a certain number of sessions; can be filtered. Only for the map maker.
GET /api/maps/:mapid/events: gets all events for a map, paginated by a certain number of events; can be filtered by event types and attributes. Only for the map maker.
GET /api/maps/:mapid: gets all information about the map, such as playtime, unique sessions, and total number of events. Only for the map maker.
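A minimal sketch of the "paginated and filtered" behavior described for the events endpoint, assuming simple offset/limit pagination. The draft does not specify a pagination scheme, and `StoredEvent`, `Page`, and `listEvents` are hypothetical names.

```typescript
// Hypothetical stored event; only `type` is used for filtering here.
interface StoredEvent {
    name: string;
    type: "tracked" | "game";
    timestamp: string;
}

interface Page<T> {
    items: T[];
    total: number;             // number of items matching the filter
    nextOffset: number | null; // null when there are no further pages
}

// Assumption: offset/limit pagination with an optional filter on event type.
function listEvents(
    all: StoredEvent[],
    type: "tracked" | "game" | null,
    offset: number,
    limit: number
): Page<StoredEvent> {
    const matching = all.filter((e) => type === null || e.type === type);
    const size = Math.max(1, limit); // avoid an empty page that never advances
    const items = matching.slice(offset, offset + size);
    const next = offset + items.length;
    return { items, total: matching.length, nextOffset: next < matching.length ? next : null };
}

const sampleEvents: StoredEvent[] = [
    { name: "Map loaded", type: "game", timestamp: "t0" },
    { name: "Checkpoint", type: "tracked", timestamp: "t1" },
    { name: "Reset", type: "tracked", timestamp: "t2" },
    { name: "Checkpoint", type: "tracked", timestamp: "t3" },
];
const firstPage = listEvents(sampleEvents, "tracked", 0, 2);
const secondPage = listEvents(sampleEvents, "tracked", 2, 2);
```

The client keeps requesting with `nextOffset` until it comes back as `null`.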

I'm not including the API for users to see and manage their playtime stats.

orrybaram commented 3 weeks ago

@ghostrider-05 Great insights here and I think we're totally aligned. I haven't created the portal/API yet, but I did just now set up a project with Supabase to get something going. I think having it as part of rocketleaguemapmaking.com would be amazing though and would love to collaborate on this. I'm more interested in building out the UI/dashboards if you would wanna take on creating the API.

orrybaram commented 3 weeks ago

> To reduce the amount of events, maybe we can collect all session data into one event. I assume that we can also only use events when you're the host of the map (for multiplayer maps only) or do we want to have every player send events?

This would work well for matches, but for single player maps is there a similar hook we can use to know that a session has ended? I was thinking we could potentially batch the requests in MEP before sending out a larger event payload. We could use a simple debounce and set it to 5s for example and send an array of all the events collected in those 5s.
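A sketch of the batching idea with a 5s window. Note that this is a fixed window measured from the first queued event rather than a strict debounce, which would keep postponing the send while events keep arriving. `EventBatcher` is a hypothetical name, and time is passed in explicitly instead of using a timer so the logic stays independent of any platform API. The real implementation would live in the C++ plugin.

```typescript
// Hypothetical batcher: collects events and releases them once the window has elapsed.
class EventBatcher<T> {
    private pending: T[] = [];
    private windowStart: number | null = null;
    private windowMs: number;

    constructor(windowMs: number) {
        this.windowMs = windowMs;
    }

    // Queue an event; the window starts with the first queued event.
    add(item: T, nowMs: number): void {
        if (this.windowStart === null) {
            this.windowStart = nowMs;
        }
        this.pending.push(item);
    }

    // Returns the batch to send if the window has elapsed, otherwise an empty array.
    flushIfDue(nowMs: number): T[] {
        if (this.windowStart === null || nowMs - this.windowStart < this.windowMs) {
            return [];
        }
        return this.flush();
    }

    // Unconditional flush, e.g. when the map is unloaded.
    flush(): T[] {
        const batch = this.pending;
        this.pending = [];
        this.windowStart = null;
        return batch;
    }
}

const exampleBatcher = new EventBatcher<string>(5000);
exampleBatcher.add("Level Complete", 0);
exampleBatcher.add("Checkpoint", 2000);
const tooEarly = exampleBatcher.flushIfDue(4999); // window not yet elapsed
const onTime = exampleBatcher.flushIfDue(5000);   // both events are released together
```

An unconditional `flush()` covers the session-end case, so events queued in the last few seconds are not lost.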

orrybaram commented 3 weeks ago

> Maybe we should also have a method to create a "delete all my playtime statistics" for current / all maps? Could also be nice for users to have place to see their own statistics on workshop maps?

Could be a good idea maybe as a project on the roadmap. We could build that into the dashboard

ghostrider-05 commented 3 weeks ago

> Maybe we should also have a method to create a "delete all my playtime statistics" for current / all maps? Could also be nice for users to have place to see their own statistics on workshop maps?

> Could be a good idea maybe as a project on the roadmap. We could build that into the dashboard

Maybe we should track features for the dashboard / portal somewhere else and use this issue for the plugin side? We could also use the Discord thread for both discussions.

blaku-rl commented 3 weeks ago

> I do think it would be our responsibility as map makers to make this opt-out though. I think it would be a good idea to put together an actionscript package that acts as an SDK and also contains a data disclosure confirmation modal.

That would be amazing. As long as the user is informed about the data analytics, I'm comfortable with adding this feature. Especially with an actionscript modal that can clearly state this and be an easy drop in to maps for mapmakers.

> I assume that we can also only use events when you're the host of the map (for multiplayer maps only) or do we want to have every player send events?

The host is the only player allowed to interact with kismet in the lobby. If the host is given information for specific players to send, I can pass along a message using netcode in the plugin to send an event from a specific player. That would mean the commands in multiplayer would look something like this:

`mepcommand trackinit <API_KEY> <PLAYER_ID>`

`mepcommand track <PLAYER_ID> <EVENT_JSON>`

The PLAYER_ID value would now use the ephemeral match id found at Player > PRI > PlayerID and the plugin would get the correct epic id based on that. This would now mean that the host would have to track everything for all the players. I'm not sure how tedious that will be for a map maker of multiplayer maps, but it is an option.

> This would work well for matches, but for single player maps is there a similar hook we can use to know that a session has ended?

The plugin already does some cleanup when a map is unloaded, so I could send a request here that signals the map has been closed. On a semi-related note, one feature I do want to add to the plugin is allowing a map to hook into a wide variety of events. In bakkesmod, we can hook into any function that is called by Rocket League as long as we know its name. I'd like to allow map makers to do the same. One meme example of how this could be used: when a user receives a party invite, the map shows a pop-up that says "You want to leave me :(". This would allow for fine-grained control of events in kismet and could be used to help with event tracking that is useful for map makers.

> I was thinking we could potentially batch the requests in MEP before sending out a larger event payload. We could use a simple debounce and set it to 5s for example and send an array of all the events collected in those 5s.

I do like queuing up the events and sending a batch request. That can definitely be done.

One thing I do want to know is what sort of output, if any, should the plugin provide to the map after sending a request. With the new SRC integration, I'm introducing 2 new MEP items for map makers to use. The first is a string variable called `mepoutput` that will contain a response based on the request that was made. The second is a remote event called `MEPOutputEvent` that will trigger after the `mepoutput` string variable has been set. Using this flow, we can relay any relevant information to the map.

ghostrider-05 commented 3 weeks ago

> The host is the only player allowed to interact with kismet in the lobby. If the host is given information for specific players to send, I can pass along a message using netcode in the plugin to send an event from a specific player.

I made this comment to ensure that only one player is sending events. If an event is for a specific player that could be passed in the params. Then the commands can be:

`mepcommand trackinit <API_KEY> <PLAYER_ID>`

`mepcommand track <EVENT_JSON>`

Why is PLAYER_ID needed in the init command? Can't that be found in the plugin?

I think it is best to split the JSON into a name and params, and to format the params like map launch arguments so they are easy to concatenate in kismet: 'checkpoint=' + '1' + '&resets=' + '0'
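A sketch of how the plugin or API side might turn such a launch-argument-style string back into params. `parseParams` is a hypothetical helper, and it assumes keys and values are percent-encoded whenever they contain `&` or `=`. All values come back as strings, so any number conversion would be a separate step.

```typescript
// Parses 'checkpoint=1&resets=0' into { checkpoint: "1", resets: "0" }.
// Assumption: keys and values are percent-encoded when they contain '&' or '='.
// decodeURIComponent throws on malformed escapes, so callers may want a try/catch.
function parseParams(raw: string): { [name: string]: string } {
    const params: { [name: string]: string } = Object.create(null);
    for (const pair of raw.split("&")) {
        if (pair.length === 0) {
            continue; // skip empty segments such as a trailing '&'
        }
        const eq = pair.indexOf("=");
        const key = eq === -1 ? pair : pair.slice(0, eq);
        const value = eq === -1 ? "" : pair.slice(eq + 1);
        params[decodeURIComponent(key)] = decodeURIComponent(value);
    }
    return params;
}

const parsedParams = parseParams("checkpoint=" + "1" + "&resets=" + "0");
```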

> On a semi-related note, one feature I do want to add to the plugin is allowing a map to hook into a wide variety of events.

That would be nice to have for both tracking certain in-game events and for non-analytics map features

> One thing I do want to know is what sort of output, if any, should the plugin provide to the map after sending a request.

The map maker is sending information to the plugin, so sending back the same information seems useless to me. Maybe a confirmation that the track request was sent or failed? If we want the ability to send HTTP requests to get some data, then an output variable / event is needed.

orrybaram commented 3 weeks ago

> Why is PLAYER_ID needed in the init command? Can't that be found in the plugin.

I don't think we need it anymore, that was in my original design before I realized that MEP can pull it

> I think it is best to split the json into a name and params and format the params like map launch arguments to be easy to concatenate in kismet: 'checkpoint=' + '1' + '&resets=' + '0'

This makes it easier to send directly from kismet, but if we provide an actionscript sdk to send events it might not matter? I guess on the other hand we can include a conversion from JSON to query params in the sdk.
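For reference, the JSON-to-query-params conversion could be as small as the sketch below. It is written in TypeScript for illustration only, since the SDK itself would be actionscript. `toQueryParams` is a hypothetical name, and it assumes flat params with primitive values.

```typescript
// Converts a flat params object into a string such as 'checkpoint=1&resets=0'.
// Assumption: values are flat primitives; nested objects are out of scope.
function toQueryParams(params: { [name: string]: string | number | boolean }): string {
    return Object.keys(params)
        .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(String(params[key])))
        .join("&");
}

const exampleQuery = toQueryParams(JSON.parse('{"checkpoint":1,"resets":0,"zone":"ice cave"}'));
```

Percent-encoding keys and values keeps characters like `&`, `=`, and spaces from breaking the format.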

> The map maker is sending information to the plugin, so sending back the same information seems useless to me. Maybe a confirmation that the trace is sent or failed? If we want the ability the sent HTTP requests to get some data then an output variable / event is needed.

Yeah, I agree with this. I don't think we would get much useful info back from a track event. I would return a boolean ok as a response, but I don't think there's much a map creator can do with that info.

orrybaram commented 3 weeks ago

Oh, thought of another thing last night. If it doesn't already exist, we should generate a unique session id that we can pass with each event. That way we can still correlate events to a particular session even if they were not included in the session end (MEP cleanup) phase.
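A sketch of attaching a session id to every event. The id generator below produces a random UUID-v4-shaped string with `Math.random`, which is assumed to be acceptable only because the id correlates events and is not a secret. The plugin would more likely use a platform UUID facility. `newSessionId`, `SessionEvent`, and `withSession` are hypothetical names.

```typescript
// Generates a random version-4-style UUID string.
// Assumption: Math.random is good enough here because the id is not a secret.
function newSessionId(): string {
    return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, (c) => {
        const r = Math.floor(Math.random() * 16);
        const v = c === "x" ? r : (r & 0x3) | 0x8; // 'y' carries the UUID variant bits
        return v.toString(16);
    });
}

// Hypothetical event shape carrying the session id alongside the event name.
interface SessionEvent {
    session_id: string;
    event_name: string;
}

function withSession(sessionId: string, eventName: string): SessionEvent {
    return { session_id: sessionId, event_name: eventName };
}

// One id per play session, reused for every event sent during it.
const exampleSessionId = newSessionId();
const sessionEvents = [
    withSession(exampleSessionId, "Map loaded"),
    withSession(exampleSessionId, "Level Complete"),
];
```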

ghostrider-05 commented 2 weeks ago

I've been working a bit on the API that will send events that can be used in the dashboard.

My current draft (a lot simpler than my previous comment) is:

Method: POST
Path: /api/v1/projects/:id/sessions
Authentication: bearer token (JWT)
Body:

interface Session {
   id: string; // new uuid
   params: { [name: string]: any } // Can be a combination of plugin and user defined params
}

Then events can be batched and sent:

Method: POST
Path: /api/v1/projects/:id/sessions/:sessionid/events
Authentication: bearer token (JWT) - same as before
Body:

type Events = {
   name: string;
   timestamp: string;
   player_id: string; // The Epic id of the player that triggered this event
   params: { [name: string]: any } // Can be a combination of plugin and user defined params
}[]

Then end the session:

Method: POST
Path: /api/v1/projects/:id/sessions/:sessionid/expire
Authentication: bearer token (JWT) - same as before
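To make the three calls concrete, here is a sketch that only builds request descriptions for a session's lifecycle, without sending anything. The `ApiRequest` shape and the `sessionRequests` helper are hypothetical. The paths, the bearer token header, and the body-less `/expire` call follow this draft.

```typescript
// Plain description of an HTTP call; nothing is sent in this sketch.
interface ApiRequest {
    method: "POST";
    path: string;
    headers: { [name: string]: string };
    body: string | null;
}

// Builds the calls from the draft: create session, send events, expire session.
// Assumption: ids are percent-encoded in the path and bodies are JSON.
function sessionRequests(projectId: string, sessionId: string, token: string, batch: object[]): ApiRequest[] {
    const sessionsPath = "/api/v1/projects/" + encodeURIComponent(projectId) + "/sessions";
    const sessionPath = sessionsPath + "/" + encodeURIComponent(sessionId);
    const auth = "Bearer " + token;
    return [
        {
            method: "POST",
            path: sessionsPath,
            headers: { Authorization: auth, "Content-Type": "application/json" },
            body: JSON.stringify({ id: sessionId, params: {} }),
        },
        {
            method: "POST",
            path: sessionPath + "/events",
            headers: { Authorization: auth, "Content-Type": "application/json" },
            body: JSON.stringify(batch),
        },
        // The /expire path does not accept a body.
        { method: "POST", path: sessionPath + "/expire", headers: { Authorization: auth }, body: null },
    ];
}

const exampleRequests = sessionRequests("my-map", "abc-123", "TOKEN", [
    { name: "Checkpoint", timestamp: "t1", player_id: "p1", params: { checkpoint: 1 } },
]);
```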

The token can be found in the dashboard of the project and:

has no end time
can be revoked in the dashboard
is a JWT token that has no secret data, but can be verified by the API
has only permissions to create events for the project, nothing else in the API
there can be multiple tokens per project, e.g. one for beta testing, a special event, a new version, etc. That also allows the map maker to track events per version (i.e. per token) or revoke tokens when needed, without making it required.

Now it's easy for me to make changes, e.g. if there is something missing or not possible at all

Edit 15-11:

/expire path does not accept a body. Players will be only users that send events.

Limitations:

a batch can contain at most 20 events
each event can contain at most 5000 bytes of strings (param keys included)
each event and session can contain at most 15 string and 15 number params
~~params and individual events will be stored only for 3 months. Some stats will be combined and stored with the session~~
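A client-side check for these limits might look like the sketch below. It assumes the 5000-byte budget counts the UTF-8 bytes of the event name, the param keys, and the string param values, which is one reading of "bytes of strings (param keys included)" and may not match the API exactly. `DraftEvent`, `utf8Length`, and `validateBatch` are hypothetical names.

```typescript
// Mirrors the draft Events entry; params are restricted to strings and numbers here.
interface DraftEvent {
    name: string;
    timestamp: string;
    player_id: string;
    params: { [name: string]: string | number };
}

// UTF-8 byte length, computed by hand to avoid platform-specific encoders.
function utf8Length(text: string): number {
    let bytes = 0;
    for (let i = 0; i < text.length; i++) {
        const code = text.charCodeAt(i);
        if (code < 0x80) {
            bytes += 1;
        } else if (code < 0x800) {
            bytes += 2;
        } else if (code >= 0xd800 && code <= 0xdbff) {
            bytes += 4; // surrogate pair: one 4-byte code point
            i++;        // skip the low surrogate
        } else {
            bytes += 3;
        }
    }
    return bytes;
}

// Returns a list of violations; an empty list means the batch is within the limits.
function validateBatch(batch: DraftEvent[]): string[] {
    const errors: string[] = [];
    if (batch.length > 20) {
        errors.push("batch contains more than 20 events");
    }
    batch.forEach((item, index) => {
        // Assumption: the byte budget covers the name, param keys, and string values.
        let stringBytes = utf8Length(item.name);
        let stringParams = 0;
        let numberParams = 0;
        for (const key of Object.keys(item.params)) {
            const value = item.params[key];
            stringBytes += utf8Length(key);
            if (typeof value === "string") {
                stringParams += 1;
                stringBytes += utf8Length(value);
            } else {
                numberParams += 1;
            }
        }
        if (stringBytes > 5000) errors.push("event " + index + " exceeds 5000 bytes of strings");
        if (stringParams > 15) errors.push("event " + index + " has more than 15 string params");
        if (numberParams > 15) errors.push("event " + index + " has more than 15 number params");
    });
    return errors;
}

const okBatch: DraftEvent[] = [
    { name: "Checkpoint", timestamp: "t1", player_id: "p1", params: { checkpoint: 1, zone: "cave" } },
];
const oversizedBatch: DraftEvent[] = [
    { name: "Note", timestamp: "t2", player_id: "p1", params: { text: new Array(6001).join("x") } },
];
const okErrors = validateBatch(okBatch);
const oversizedErrors = validateBatch(oversizedBatch);
```

Validating before sending lets the plugin split or drop a batch locally instead of having the whole request rejected by the API.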

orrybaram commented 2 weeks ago

Looks good to me, the only other required field I would add would be userId on the Event

blaku-rl / MapExpansionPlugin

[RFC] RLMM Analytics #2
