Sequence Prediction [info request]

eblabs commented 5 years ago

Hi, I'm getting a lot out of ml-agents, really great work to the team! Just have a question about how to approach sequence prediction using ml-agents.

I created this GRU-based RNN in Python/Keras/TensorFlow before attempting to build it in Unity. It is intended to predict the pattern and timing of footsteps based on the character's speed. It's trained on data extracted from a long mocap clip.

It has a sequence of 3 inputs:
previous-frame character speed
previous-frame left foot Y
previous-frame right foot Y

And it predicts a sequence of 2 outputs:
current-frame left foot Y
current-frame right foot Y
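As a rough illustration of the input/output layout described above, here is a minimal sketch of how a mocap clip could be cut into training pairs. The function name `make_windows`, the `seq_len` parameter, and the toy clip values are all hypothetical and not taken from the original model; the sketch only assumes each frame carries (speed, left foot Y, right foot Y).

```python
from typing import List, Sequence, Tuple

Frame = Tuple[float, float, float]  # (speed, left_foot_y, right_foot_y)


def make_windows(
    frames: Sequence[Frame], seq_len: int
) -> Tuple[List[List[Frame]], List[Tuple[float, float]]]:
    """Pair each window of seq_len previous frames with the current frame's foot heights."""
    xs: List[List[Frame]] = []
    ys: List[Tuple[float, float]] = []
    for t in range(seq_len, len(frames)):
        xs.append(list(frames[t - seq_len:t]))  # the seq_len frames before frame t
        _, left_y, right_y = frames[t]          # current-frame targets
        ys.append((left_y, right_y))
    return xs, ys


# Hypothetical toy clip: 6 frames of (speed, left Y, right Y).
clip = [
    (0.0, 0.0, 0.0),
    (0.5, 0.1, 0.0),
    (1.0, 0.2, 0.0),
    (1.0, 0.0, 0.1),
    (0.5, 0.0, 0.2),
    (0.0, 0.0, 0.0),
]
X_train, y_train = make_windows(clip, seq_len=3)
print(len(X_train), len(X_train[0]), len(X_train[0][0]))  # windows, frames per window, values per frame
print(y_train[0])
```

Each entry of `X_train` is a window of past frames and the matching entry of `y_train` is the pair of foot heights the network should predict for the next frame.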

[Image: prediction graph]

You can see the left and right feet moving up and down as the player's speed increases, then returning to the ground (0) once the player stops moving.

Since the intent of this NN is to generate a pattern/sequence based on the history of a player's speed and previous footsteps, it needs to be trained with this mocap data. I'm not sure how this fits into the control-based/reward paradigm of ml-agents training. Maybe this could work through ml-agents' Imitation Learning, but that seems to be focused on recording input from a player rather than on some arbitrary training data.

At any rate, if there were any broad suggestions on how to approach this from within Unity that would be great.

Thanks everyone!!

eblabs commented 5 years ago

Ah, I had missed the Heuristic Brain. I think this might be the solution to this. Will report back after some testing... Thanks!!

eblabs commented 5 years ago

So I'm making a bit of progress with the Heuristic Brain. I'm basically using the Decision script to feed the 'y' portion of the training data back in on the Teacher Agent. I found training with online_bc quite slow, both with and without the --slow flag. I tried the offline_bc method, and it looks like a demo recorder is required. With a Heuristic Brain, though, I didn't think the demo recorder would be necessary in this x/y training scenario, where all the training data is known ahead of time.

I have 3 input values and 2 outputs.

I'm also using 32 Stacked Vectors.

FootstepPredictorLearning:
    trainer: online_bc
    max_steps: 10000000
    summary_freq: 500
    brain_to_imitate: FootstepPredictorHeuristic
    batch_size: 8
    batches_per_epoch: 5
    num_layers: 3
    hidden_units: 128
    use_recurrent: true
    sequence_length: 32
    time_horizon: 32
    normalize: false
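For reference, here is a small sketch of what stacking 32 vectors of 3 inputs amounts to: the network sees the last 32 observations concatenated, so 3 x 32 = 96 values per step. The `ObservationStacker` class is hypothetical and written only for illustration; it is not ML-Agents code, and the zero-filled history at the start is an assumption.

```python
from collections import deque


class ObservationStacker:
    """Keeps the most recent num_stacked observations and returns them concatenated."""

    def __init__(self, obs_size, num_stacked):
        self.obs_size = obs_size
        # Assumption: history starts out zero-filled until real observations arrive.
        self.frames = deque(
            [[0.0] * obs_size for _ in range(num_stacked)], maxlen=num_stacked
        )

    def push(self, obs):
        if len(obs) != self.obs_size:
            raise ValueError("unexpected observation size")
        self.frames.append(list(obs))  # the oldest frame drops off automatically
        return [value for frame in self.frames for value in frame]


stacker = ObservationStacker(obs_size=3, num_stacked=32)
stacked = stacker.push([1.0, 0.2, 0.0])  # (speed, left foot Y, right foot Y)
print(len(stacked))   # 96
print(stacked[-3:])   # [1.0, 0.2, 0.0]
```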

Thanks for any thoughts on this!

mmattar commented 5 years ago

@eblabs - just to clarify, it seems that you're just using Unity to generate the data and send it off to Python for training, and that the training process in TF does not actually influence the Unity scene. Is that correct? If so, why not record a large data set from a Unity run, save the data to a file, and then run the training offline? You could use the demonstration recorder, or just something that prints out a CSV file.
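A minimal sketch of the CSV route on the Python side might look like the following. The column names in `FIELDS` and the helper functions are hypothetical, chosen only to match the three values discussed in this thread, and the example round-trips through an in-memory buffer rather than a real file.

```python
import csv
import io

FIELDS = ["speed", "left_foot_y", "right_foot_y"]  # hypothetical column names


def write_frames(stream, frames):
    """Write one header row followed by one row per recorded frame."""
    writer = csv.writer(stream)
    writer.writerow(FIELDS)
    writer.writerows(frames)


def read_frames(stream):
    """Read the rows back as tuples of floats, in FIELDS order."""
    reader = csv.DictReader(stream)
    return [tuple(float(row[name]) for name in FIELDS) for row in reader]


# In-memory round trip; a real run would pass a file opened with newline="" instead.
buffer = io.StringIO()
write_frames(buffer, [(0.0, 0.0, 0.0), (1.0, 0.2, 0.0)])
buffer.seek(0)
frames = read_frames(buffer)
print(frames)
```

Once the frames are loaded this way, they can be fed to whatever offline training code is already in place.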

eblabs commented 5 years ago

Thanks for the reply @mmattar! Yes, you are correct, I'm simply training with existing training data. The benefit here is that I can use Unity to pull data directly off of some premade FBX animation rather than first saving it into a CSV file.

Just to wrap my head around how Unity trains using imitation learning, I'd like to confirm my assumptions. Would you mind letting me know if this is correct?

In Keras, I was training on data like this: some_LSTM_sequence_model.fit(X_train, y_train, epochs=10, batch_size=32)

And in Unity the paradigm would be like this:

Collect identical vectorObs from both a Teacher Agent (heuristic brain) and a Student Agent (learning); this would be the "X_train" in Python/TensorFlow/Keras.
Use a Decision script attached to the Teacher brain to echo the expected values back into the Teacher's AgentAction; this is the "y_train".
Running "mlagents-learn" trains the Student Agent's vectorAction inputs against the Teacher Agent's vectorAction inputs; this process is the equivalent of "model.fit()".
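The analogy in the three steps above can be sketched as a toy supervised loop. This is not the ML-Agents behavioral cloning trainer: the linear student, the `teacher_action` rule, and the learning rate are all hypothetical stand-ins, used only to show a student being fitted to a teacher's actions on shared observations.

```python
import random


def teacher_action(obs):
    """Stand-in for the Teacher's Decision script: returns the known target (the "y_train")."""
    speed, _left_y, _right_y = obs
    return [0.5 * speed, 0.25 * speed]  # hypothetical target rule, for illustration only


def student_action(weights, obs):
    """Linear student: one row of 3 input weights plus a bias per output action."""
    return [sum(w * x for w, x in zip(row[:-1], obs)) + row[-1] for row in weights]


def clone(observations, epochs=200, lr=0.1):
    """Fit the student to the teacher's actions by gradient descent on squared error."""
    weights = [[0.0] * 4 for _ in range(2)]  # 2 outputs x (3 inputs + bias)
    for _ in range(epochs):
        for obs in observations:
            target = teacher_action(obs)             # plays the role of y_train
            predicted = student_action(weights, obs)  # student sees the same obs (X_train)
            for k in range(2):
                error = predicted[k] - target[k]
                for i in range(3):
                    weights[k][i] -= lr * error * obs[i]
                weights[k][3] -= lr * error           # bias update
    return weights


random.seed(0)
observations = [(random.random(), random.random(), random.random()) for _ in range(50)]
weights = clone(observations)
test_obs = (0.8, 0.1, 0.3)
print(student_action(weights, test_obs), teacher_action(test_obs))
```

After training, the student's output for an unseen observation should closely track the teacher's, which is the same relationship `model.fit()` establishes between predictions and `y_train`.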

I hope I understand this correctly! Thanks again!!

harperj commented 5 years ago

@eblabs This sounds like a reasonable approach, though I can't speak to how the behavior of our behavioral cloning trainer will compare to the model you used previously. Hopefully this worked out for you in the end.

It has been a while since any activity on this issue so I'm going to close it, but if you have any more to share or have run into any problems please feel free to reopen.

eblabs commented 5 years ago

Thanks @harperj! Things didn't quite work out how I had hoped. The offline training method ended up being much slower for some reason, and I couldn't quite achieve the same results as with TensorFlow/Keras in Python.

I had considered exporting the network and data from Python into Unity, but I didn't have the time to get into it because, from what I had read, it wasn't totally straightforward (correct formatting, proper input naming, etc.).

It seems like some tasks are better suited to the Unity training method. It would be nice to have some kind of guide for generating models and data outside of Unity and bringing them in.

Thanks again!

lock[bot] commented 4 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

Unity-Technologies / ml-agents

Sequence Prediction [info request] #1573