Balint-H / mj-unity-tutorial

Introductory set of tutorials for using the Unity plugin of MuJoCo with the ML-Agents framework.

Fewer observations than vector size #3

Closed DinoDany closed 1 year ago

DinoDany commented 1 year ago

Hello, hello! I'm trying to run the 3rd part of the tutorial (RL Basics). I am getting these warnings:

[screenshot: Unity console warnings]

I think they are preventing me from making a build. I'm also not sure what behavior to expect at this part of the tutorial.

Thank you so much!

Balint-H commented 1 year ago

Hello, RL Basics actually has a reference scene at Assets\Tutorials\3 - RL Basics\Scenes\CartPoleTutorialFinished. There was an issue preventing builds, caused by a couple of tutorial-related scripts not being marked as editor-only, which I have fixed to enable training. I also noticed that the agent in that scene did not have Max Step set, which is necessary for proper training. I added a reference to this in the text of tutorial 3 (RL Basics), page 2/7. These changes have been pushed just now.

In terms of the number of observations: if you subclass SensorComponent, you would not need to specify the observation count manually. However, if you add the observations manually in an Agent subclass, then you need to update the Behavior Parameters accordingly. The observation size in Behavior Parameters only needs to match the number of vector observations added inside the Agent's own behaviour.
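To illustrate the second case, here is a minimal sketch of an Agent that adds observations manually; the class and field names are illustrative, not taken from the tutorial:

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

// Illustrative only: the Space Size in this agent's Behavior Parameters
// must equal the number of floats added in CollectObservations (here: 4).
public class CartPoleAgent : Agent
{
    float cartPosition, cartVelocity, poleAngle, poleAngularVelocity;

    public override void CollectObservations(VectorSensor sensor)
    {
        sensor.AddObservation(cartPosition);        // 1 float
        sensor.AddObservation(cartVelocity);        // 1 float
        sensor.AddObservation(poleAngle);           // 1 float
        sensor.AddObservation(poleAngularVelocity); // 1 float
        // Total = 4, so "Space Size" in Behavior Parameters is 4.
        // Observations produced by SensorComponent subclasses on the same
        // GameObject are counted automatically and are NOT included here.
    }
}
```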

Let me know if these helped!

EDIT: I've found that the hinge observations were missing from that example scene, my bad! Added and pushed now, along with a bit extra guidance on running a training at the end of tutorial 3.

DinoDany commented 1 year ago

Hello, hello! I'm sorry it took me so long to engage in the conversation again.

I can see the edits. Thank you so much! I'm still not able to reproduce the tutorial, though. It seems to be a problem with the behavior type: Default and Heuristic give the same warning, and Inference shows this new one:

[screenshot: Unity console error in Inference mode]

I imported the DebugGUI graph but I'm not sure how to use it(:

I have a couple new general questions:

I'm trying to understand the dynamic between ML-Agents and MuJoCo. So the MuJoCo scene is set up with Mj actuators and Mj sensors, and then we need to map them to their ML-Agents counterparts via the ISensor and IActuator interfaces, right? That creates a flow like this:

Scene -> MjSensor -> ISensor -> ML model -> IActuator -> MjActuator -> Scene

Am I correct?

Also, is there a reason why the scene has so many planes in the environment?

Thank you so much for all your help!

Balint-H commented 1 year ago

Sorry for the delay in response!

Do you get the issue in the finished example scene as well?

In terms of the structure of the interfaces between MuJoCo and ML-Agents, you have a good understanding. Perhaps one addition: you don't necessarily need MjSensors and MjActuators; there's the option of reading and setting data directly from the MjScene. You could strip away the interface objects further as well and perform their duties in a custom agent class, but the well-organized setup you describe is, I think, more suitable.

The planes in the scene don't have MjGeoms attached to them (if I remember correctly). They are solely visual elements that hold the grid texture for observing the environment. They should be grouped together under one GameObject so they don't take up space in the hierarchy.

DebugGui is a good tool to visualize changing values. You can define a field in a class you want to track:

```csharp
[DebugGuiGraph()]
float valueToTrack;
```

You can define many options inside the DebugGuiGraph attribute, like color or limits.

Then update the tracked value, e.g. every frame, to match the current reward or sensor reading. Note that it does not work with double values; they need to be converted to float first.

Then you also need to add a DebugGui component somewhere in your scene. After that the live graphs should be visible in your game view during play mode (not the scene view).
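Putting those steps together, a minimal sketch might look like this (the attribute spelling follows the messages above; exact names and options may differ in your version of the DebugGUI package, and the agent reference is hypothetical):

```csharp
using Unity.MLAgents;
using UnityEngine;

// Attach this to any GameObject; a DebugGui component must also exist
// somewhere in the scene for the graph to render in the game view.
public class RewardGraph : MonoBehaviour
{
    // Graphed live; color, limits etc. can be passed as attribute arguments.
    [DebugGuiGraph()]
    float valueToTrack;

    [SerializeField] Agent agent;  // assign your agent in the inspector

    void Update()
    {
        // GetCumulativeReward() already returns a float; double-valued
        // sensor readings would need an explicit (float) cast here.
        valueToTrack = agent.GetCumulativeReward();
    }
}
```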

Balint-H commented 1 year ago

Also, inference showing an error is the expected behaviour. Once you confirm you get no errors in Default mode, you need to build the environment and run training. Then you can supply the learned model to the agent and use inference mode.
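For reference, a typical ML-Agents training invocation looks like the following; the config file name, build path and run ID are placeholders, not values from this tutorial:

```shell
# Train against the built environment (all paths/names are placeholders).
mlagents-learn cartpole_config.yaml --env=Builds/CartPole --run-id=cartpole-01

# Results, including the learned .onnx policy, end up under results/cartpole-01/.
# Assign that .onnx file to the agent's Behavior Parameters > Model slot,
# then switch Behavior Type to Inference Only.
```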

DinoDany commented 1 year ago

Hellooo! I see. I was able to run the finished example!!! It is training right now. Thank you so much! Also thank you for the explanation.

DinoDany commented 1 year ago

I still haven't been able to run the scene in the tutorial, but I'm working on that! Thank you so much.