Training Set & Classification Model states are not properly persistent across restarts when using complex labels

maxgraf96 commented 2 years ago

Hi,

I found a small issue with model persistence across editor restarts (the same happens with standalone apps). In the classification model, when I record examples for a set of labels of a custom label type (e.g. 7 different labels, all with one field) and save the trained model and training set to disk, two JSON files are created: one for the training set and one for the model. In the editor, exiting and restarting the game works fine because the training set (and by extension the model) keeps all the recorded labels stored in the LabelCache object. However, when I restart the editor, the LabelCache is not restored properly, which leads to inconsistencies between what the training set / model knows and what is stored in the label cache. For me this resulted in only 3 out of 7 labels being read correctly. My quick workaround was to save another JSON file containing the LabelCache data to disk when saving the InteractML data, and to restore it on load.

I'm on Windows 10 with UE4.27 and the latest version of iml-ue4.

Let me know if I'm doing something wrong, otherwise the workaround seems to work in all situations I've encountered so far.

Best, Max

sam-apparance commented 2 years ago

Hmm, you are right that there seems to be an oversight here: the model state isn't fully represented by the model JSON file. The model's label cache is persisted, though, in the Unreal Model asset, so that must be saved too. However, there should only be a couple of specific cases where this causes problems, namely when all of the following apply:

A standalone build - because packaged assets can't be saved
Using Classification or DTW - these rely on the cache to map their single input/output value
Composite Labels - cache required to store unique labels found
Needing to record/train/save at runtime - cache can't be saved out to model asset (even though JSON can save)
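
To illustrate why the cache matters in these cases, here is a minimal Python sketch. It is not the plugin's C++ code: the `LabelCache` class below, its method names, and the idea that a label's list position serves as the class value are all assumptions made for illustration. It only shows that a model trained on class values cannot map them back to composite labels if the cache is empty after a restart.

```python
class LabelCache:
    """Hypothetical stand-in for the plugin's label cache (not its real API)."""

    def __init__(self, labels=None):
        # Each composite label is stored as a tuple; its list position
        # is assumed to be the single class value the model sees.
        self._labels = [tuple(label) for label in (labels or [])]

    def index_of(self, label):
        """Return the class value for a composite label, adding it if unseen."""
        key = tuple(label)
        if key not in self._labels:
            self._labels.append(key)
        return float(self._labels.index(key))

    def label_for(self, value):
        """Map a model output back to a composite label, or None if unknown."""
        i = int(round(value))
        return self._labels[i] if 0 <= i < len(self._labels) else None

    def to_list(self):
        """Return a JSON-friendly copy of the cached labels."""
        return [list(label) for label in self._labels]


# Record seven one-field labels, as in the report above.
recorded = LabelCache()
values = [recorded.index_of([float(n)]) for n in (10, 20, 30, 40, 50, 60, 70)]

# After a restart the saved model still produces class values such as 5.0.
empty_cache = LabelCache()                       # cache was not persisted
restored_cache = LabelCache(recorded.to_list())  # cache saved and reloaded

print(empty_cache.label_for(5.0))     # nothing to map the value back to
print(restored_cache.label_for(5.0))  # the sixth recorded label
```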

Does this match your use case and what you are finding?

maxgraf96 commented 2 years ago

That's exactly my use case, all 4 points apply 😄 But for now I'm good with the workaround, it's only a few lines of extra code.

sam-apparance commented 2 years ago

I've pushed a fix for this issue for you to try. Instead of using an additional data file for the label cache, the cache is now embedded in the model/training-set data file. This means the format has changed slightly: a top-level JSON object now wraps the model data, allowing the additional (optional) label cache data to sit alongside it. I've tested this with all the demo levels, both loading their original data and loading newly recorded data, and all seems good. Any tools that interact with the JSON data files directly will need updating to handle the new format (a relatively simple change). It would be great if you could pull these changes, test them in your scenario, and let us know whether they fix your issue.
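
For external tools that read these files, the update might look roughly like the Python sketch below. The key names `"model"` and `"label_cache"` are placeholders, since the actual field names are not given in this thread, and the shape of the sample data is invented; check a file written by the updated plugin before relying on either.

```python
import json

# Hypothetical key names; the real ones used by the plugin may differ.
MODEL_KEY = "model"
LABEL_CACHE_KEY = "label_cache"


def load_model_file(text):
    """Return (model_data, label_cache) from either file layout.

    New layout (assumed): a top-level object wrapping the model data,
    with an optional label cache alongside it.
    Old layout: the model data is the whole document, so no cache exists.
    """
    data = json.loads(text)
    if isinstance(data, dict) and MODEL_KEY in data:
        return data[MODEL_KEY], data.get(LABEL_CACHE_KEY)
    return data, None


# Invented sample data, only to exercise both code paths.
examples = [{"inputs": [0.1, 0.2], "outputs": [1.0]}]
old_text = json.dumps(examples)
new_text = json.dumps({MODEL_KEY: examples, LABEL_CACHE_KEY: [[10.0], [20.0]]})

old_model, old_cache = load_model_file(old_text)
new_model, new_cache = load_model_file(new_text)
print(old_cache)  # no cache in the old layout
print(new_cache)  # cache read from the wrapper object
```

Treating the label cache as optional keeps files written before the change loadable, which matches the behaviour described above for the demo levels' original data.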

Interactml / iml-unreal

Training Set & Classification Model states are not properly persistent across restarts when using complex labels #7