nickbabcock / rrrocket

Rocket League Replay parser to JSON -- CLI app
MIT License
60 stars, 7 forks

Feature Request: "Data Science" Output #255

Closed: davechurchill closed 1 year ago

davechurchill commented 1 year ago

I am a computer science professor / AI researcher, and a few colleagues and students have recently started looking to Rocket League for machine learning data sets. The output of Rocket League replay parsers is quite confusing at first glance, and considerable effort went into deciphering what the various values mean.

What we are looking for is a JSON output that gives the following information:

The actual replay format makes this slightly annoying to extract: I have to calculate the mapping from actor ID to player name, find the updated actors, and fill in frame data for frames in which no updated actor data is found. If there were a way to output the data above directly from the program with a new flag, I think a lot of people would use it for this purpose. I do not know Rust well enough to attempt this myself, though I have had some limited success with CPPRP in accomplishing this.
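For readers following along, the three steps described above (map actor IDs to player names, apply the updated actors, carry the last known values into frames with no update) can be sketched roughly as below. This is only an illustration: the frame layout (`time`, `updated_actors`, `actor_id`, `attribute`), the `actor_names` mapping, and the function name `forward_fill_frames` are simplified assumptions for the example, not the actual rrrocket output schema, and it assumes the actor-to-player mapping stays fixed for the whole replay.

```python
from copy import deepcopy


def forward_fill_frames(frames, actor_names):
    """Return one dense row per frame, carrying forward the last known state.

    frames:      list of dicts shaped like
                 {"time": float,
                  "updated_actors": [{"actor_id": int, "attribute": dict}]}
                 (a hypothetical, simplified layout)
    actor_names: dict mapping actor_id -> player name (assumed precomputed)
    """
    state = {}  # player name -> last known attribute values
    rows = []
    for frame in frames:
        for update in frame.get("updated_actors", []):
            name = actor_names.get(update["actor_id"])
            if name is None:
                continue  # actor is not tied to a known player; skip it
            # Merge the new values over whatever was known before.
            state.setdefault(name, {}).update(update["attribute"])
        # Snapshot the state so later updates do not mutate earlier rows.
        rows.append({"time": frame["time"], "players": deepcopy(state)})
    return rows


# Made-up sample data: the middle frame has no update for actor 1.
frames = [
    {"time": 0.0, "updated_actors": [{"actor_id": 1, "attribute": {"x": 0, "y": 0}}]},
    {"time": 0.1, "updated_actors": []},
    {"time": 0.2, "updated_actors": [{"actor_id": 1, "attribute": {"x": 5}}]},
]
rows = forward_fill_frames(frames, {1: "alice"})
```

Here the second row repeats the first frame's values for `alice`, and the third row changes only `x`. In a real replay the mapping step is more involved, so treat this as a starting point rather than a drop-in solution.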

If you feel that this is a feature you may want to implement, I would be happy to provide a sample output JSON file with the type of data we are typically interested in for machine learning purposes.

nickbabcock commented 1 year ago

Nice project idea! I know others are interested in that same use case and also run into the wall that the raw replay data exposed is cumbersome, so it's a common request to add an additional layer to help make sense of it all. Until it shuttered, carball was the most well-known project that wrapped boxcars, the library that underpins this CLI.

Most recently, the discussion in https://github.com/nickbabcock/boxcars/issues/150 has led to the creation of the data-science-oriented boxcar-frames.

The short of it is, while I'm not opposed to officially adding an ergonomic query layer on top of the replay, I'm more comfortable keeping the current state of just exposing the raw data and letting others spin off projects that are free to experiment with what is the best data model and API design. (EDIT: and then if there's ever a point where we're like "yup, this is good and everyone should have easier access to this", fold it into the main project)

davechurchill commented 1 year ago

Yeah it's one of those weird situations where "all the data is there" but it's very difficult for newcomers to parse out properly. I understand your hesitation to implement it as an official feature, no worries.

I tried going the carball route at first actually, but I always have hesitations about using libraries that are stuck on a particular older version of Python. That's when I swapped to rrrocket, and I didn't look back for my JSON needs.

nickbabcock commented 1 year ago

iirc, you received the help you wanted, but let me know if this issue is still relevant and I can re-open