VedalAI / neuro-amongus

Among Us Plugin for Neuro-sama
GNU General Public License v3.0
531 stars, 50 forks

Sabotage data is inconsistent when training #59

Closed krogenth closed 1 year ago

krogenth commented 1 year ago

Currently, we convert all recorded data into dictionaries to avoid unnecessary load time spent on conversions, while defaulted data uses the data.proto_defaults definitions where applicable.

This becomes an issue when converting during training: a recorded sabotage is a dictionary and goes through convert_dict, but the defaulted TaskData goes through convert_taskdata, resulting in misaligned array dimensions.
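To illustrate the mismatch, here is a minimal sketch. The `TaskData` fields and the bodies of `convert_dict` and `convert_taskdata` are hypothetical stand-ins, not the plugin's actual code; they only assume that the dictionary path flattens every value while the typed path picks specific fields.

```python
from dataclasses import dataclass


@dataclass
class TaskData:
    # Hypothetical stand-in for the generated proto class; field names are assumptions.
    id: int = 0
    type: int = 0
    distance: float = 0.0
    x: float = 0.0
    y: float = 0.0


def convert_taskdata(task):
    # Typed path (assumed behaviour): keep only the fields the model should see.
    return [task.distance, task.x, task.y]


def convert_dict(data):
    # Dictionary path (assumed behaviour): flatten every value, including id and type.
    return [float(value) for value in data.values()]


# A recorded sabotage arrives as a dictionary, a missing one as a typed default.
recorded_sabotage = {"id": 3, "type": 7, "distance": 4.5, "x": 1.0, "y": -2.0}
default_sabotage = TaskData()

recorded_row = convert_dict(recorded_sabotage)
default_row = convert_taskdata(default_sabotage)

# The two rows have different lengths, so they cannot be stacked into one array.
print(len(recorded_row), len(default_row))
```

Under these assumptions the same logical field produces rows of different widths depending on which converter it passes through.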

Attached is a recording with some sabotage data if necessary for testing.

133264897257692422.zip

krogenth commented 1 year ago

Looked into it more: a run without the decoded version of the gymbag2 file runs perfectly fine, but running with the decoded version already made results in the issue described.

So we'll eventually need to normalize the data between these two versions. Where that should happen, I don't know.

Alexejhero commented 1 year ago

Related(?): https://github.com/danielgtaylor/python-betterproto/issues/475

krogenth commented 1 year ago

Depends on what we want to have our data as. My understanding was that we wanted to use the betterproto definitions, but always converting to them was too costly. If we actually want to use the attr_dict, we'd get unnecessary data fed into the model (mostly the id and type of the task).

Reading through the linked issue, yes, that does seem to be related.

Alexejhero commented 1 year ago

So the problem is that the two datasets are different. The easiest solution would be to load the dictionary even in the first run, when first caching the file.

This, however, brings a new problem: everything is just dictionaries instead of types, and the convert functions lose meaning.

krogenth commented 1 year ago

We might also need to convert from proto_defaults to dict defaults as well, unless there's a cheap way to treat/convert the proto classes as dicts.

Currently, if there isn't a definition, we use the default, which won't match.
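One possible shape for dict defaults, as a rough sketch: convert the typed default to a dictionary once and reuse it. `TaskData`, `DEFAULT_TASK_DICT`, `sabotage_or_default`, and the `"sabotage"` key are all hypothetical names, and a plain dataclass stands in for the proto class here.

```python
from dataclasses import asdict, dataclass


@dataclass
class TaskData:
    # Hypothetical stand-in for the generated proto class.
    id: int = 0
    type: int = 0
    distance: float = 0.0


# Convert the typed default to a dict once, up front, so the per-frame
# cost is a dictionary copy rather than a class-to-dict conversion.
DEFAULT_TASK_DICT = asdict(TaskData())


def sabotage_or_default(frame):
    """Return the frame's sabotage dict, or a dict-shaped default if absent."""
    recorded = frame.get("sabotage")
    return dict(recorded) if recorded is not None else dict(DEFAULT_TASK_DICT)


with_sabotage = sabotage_or_default({"sabotage": {"id": 3, "type": 7, "distance": 4.5}})
without_sabotage = sabotage_or_default({})

# Both results now have the same keys, so one converter can handle either.
print(with_sabotage.keys() == without_sabotage.keys())
```

Returning a copy keeps callers from mutating the shared default by accident.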

Alexejhero commented 1 year ago

This is no longer a problem as we're now saving the entire Game as a pickle file.