VedalAI / neuro-amongus

Among Us Plugin for Neuro-sama
GNU General Public License v3.0

Requesting information about recording #14

Closed Alexejhero closed 1 year ago

Alexejhero commented 1 year ago

The Recorder class takes the current game state and serializes it. How do you plan on using the information obtained from it? (I assume for training?)

It would be useful to know the purpose of this class, as well as the design process behind it, so we can maybe add more fields to the frame if they are relevant.

js6pak commented 1 year ago

Is there really no more efficient way than saving everything every frame? We might also want to use a simple custom binary format instead of JSON.

JohnyDaison commented 1 year ago

Well, if you wanted to optimize, you could save only the values that changed since the last frame. That would reduce the file size, but Vedal would need an unpacking script on his end to reconstruct the original data for training.
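A minimal sketch of that idea in Python, assuming each frame is a flat dict whose keys never disappear between frames. The field names in the usage note are made up for illustration and are not the Recorder's actual fields.

```python
# Sentinel that compares unequal to every real value, so the first frame
# is always stored in full.
_MISSING = object()


def delta_encode(frames):
    """Store the first frame whole, then only the fields that changed."""
    encoded = []
    previous = {}
    for frame in frames:
        changed = {
            key: value
            for key, value in frame.items()
            if previous.get(key, _MISSING) != value
        }
        encoded.append(changed)
        previous = frame
    return encoded


def delta_decode(encoded):
    """The 'unpacking script': rebuild full frames from the deltas."""
    frames = []
    current = {}
    for changed in encoded:
        # Copy the running state and overlay this frame's changes.
        current = {**current, **changed}
        frames.append(current)
    return frames
```

For example, three frames in which only a hypothetical `x` field changes once would be stored as one full dict, one single-field dict, and one empty dict, and `delta_decode` would return the original three frames.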

Vedal987 commented 1 year ago

The plan is to use this data for training the neural network, yes. I made a rough draft on stream of what the inputs/outputs of the network might look like (weird formatting, but inputs on the left and outputs on the right):

image

We don't want to limit ourselves to just this data, though, in case we want to change something later on and don't want everyone to have to re-record their data.

Once we have this implemented, the idea is that viewers will use this plugin to record their gameplay to crowd source data to train Neuro with.

As for how to store this data, I agree we should probably use a more efficient binary format instead of JSON. It would be nice if it were easily loadable in Python too, since that's what the data will be read in.

ScrubN commented 1 year ago

> Once we have this implemented, the idea is that viewers will use this plugin to record their gameplay to crowd source data to train Neuro with.
>
> As for how to store this data, I agree we should probably use a more efficient binary format instead of JSON. It would be nice if it were easily loadable in Python too, since that's what the data will be read in.

https://msgpack.org/MessagePack could be a good open alternative to JSON. It has implementations in a stupidly wide range of languages, including Python and C#. In addition, we could implement a button or something to automatically pack many sessions into a single GZip file to further reduce file sizes.
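As a rough sketch of the packing step, assuming each session is already a plain Python list of frame dicts. JSON stands in for the serialized payload here only to keep the example free of third-party dependencies, and the function names are hypothetical.

```python
import gzip
import json


def pack_sessions(sessions):
    """Serialize many sessions together and gzip the result into one blob."""
    raw = json.dumps(sessions, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def unpack_sessions(blob):
    """Inverse of pack_sessions: decompress, then parse."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))
```

Compressing sessions together rather than one by one lets gzip exploit the repetition across them (identical key names, similar values), which is where most of the saving would be expected to come from.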

Here is a C# implementation that advertises Unity support and still seems to be maintained: https://github.com/neuecc/MessagePack-CSharp

As for Python, here is an implementation that advertises Python 3 support, since the official MsgPack Python library only advertises Python 2: https://github.com/vsergeev/u-msgpack-python

image

The msgpack website reports this sample data to be 31% smaller as MsgPack than as JSON. After looking at the protocol specification, that figure overestimates the MsgPack size, because the website's JavaScript encodes the float32s as float64.
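To make the float32 versus float64 point concrete, here is a small sketch that hand-encodes a single value both ways using only the standard library. In the MessagePack format a float 32 is the type byte `0xca` followed by 4 big-endian bytes, and a float 64 is `0xcb` followed by 8; the helper names below are hypothetical and not part of any MessagePack library.

```python
import struct


def encode_float32(value):
    """MessagePack 'float 32': type byte 0xca + 4-byte big-endian IEEE 754."""
    return struct.pack(">Bf", 0xCA, value)


def encode_float64(value):
    """MessagePack 'float 64': type byte 0xcb + 8-byte big-endian IEEE 754."""
    return struct.pack(">Bd", 0xCB, value)


# Each float costs 5 bytes as float32 but 9 bytes as float64.
size_as_float32 = len(encode_float32(1.5))
size_as_float64 = len(encode_float64(1.5))
```

So an encoder that always emits float64 spends 4 extra bytes on every float field in every frame compared with one that emits float32.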


Otherwise we could just use BSON, which in theory would be slightly larger than MsgPack. Seeing as we know our specific types work in MsgPack, as long as we don't have more than 4,294,967,295 frames in one array (about 23,861 hours at 50 frames/sec), MsgPack seems like the superior option.
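A quick check of how long a single array could run before hitting that limit, assuming one frame per array element and a steady 50 frames per second:

```python
# Largest element count a MessagePack array can declare (32-bit length field).
MAX_ARRAY_LENGTH = 2**32 - 1  # 4,294,967,295
FRAMES_PER_SECOND = 50

seconds = MAX_ARRAY_LENGTH / FRAMES_PER_SECOND
hours = seconds / 3600
days = hours / 24

print(f"{hours:,.1f} hours (~{days:,.0f} days)")
```

By this arithmetic the limit works out to roughly 23,861 hours of continuous recording in one array, or a little over 994 days, so it should not be a practical constraint.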

JohnyDaison commented 1 year ago

The website link is dead now, seems to have moved here: https://msgpack.org/