Linux based environment

maystroh commented 5 years ago

Can you please give us access to a Linux-based executable for your environment? You are only providing macOS and Win10 versions.

Sohojoe commented 5 years ago

@maystroh - Sure! I believe I will be able to target and build for Linux, but I may need your help testing. I've got a paper revision deadline over the weekend, so I may not be able to get to this until early next week.

maystroh commented 5 years ago

@Sohojoe Thanks. Sure! I will be the tester for the Linux build.

Sohojoe commented 5 years ago

@maystroh - Great.

I've built and uploaded the Linux version - see https://github.com/Sohojoe/MarathonEnvsBaselines/releases/tag/v1.0.0 - download the files and put them into a folder named 'envs' in the root of the MarathonEnvsBaselines code.

Pull the latest code from the develop branch - hopefully, it should work!

The code that checks for the platform is in UnityVecEnv.py, line 36:

        elif psutil.LINUX:
            env_path = os.path.join('envs', env_name)
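
For readers who want to try the path rule outside the repo, here is a minimal sketch of the Linux branch. It swaps in the standard library's `sys.platform` for `psutil.LINUX` so it runs without extra packages. The function name and the `root` and `platform` parameters are hypothetical, and the macOS and Windows branches are left out because they are not shown in this thread.

```python
import os
import sys


def linux_env_path(env_name, root="envs", platform=None):
    """Build the relative path to a Linux environment build.

    `root` and `platform` are illustrative parameters, not part of the repo.
    """
    platform = sys.platform if platform is None else platform
    if not platform.startswith("linux"):
        # The macOS and Windows branches are not shown in the thread.
        raise NotImplementedError(f"no path rule sketched for {platform!r}")
    # Same join as the snippet above: <root>/<env_name>
    return os.path.join(root, env_name)


if __name__ == "__main__":
    print(linux_env_path("hopper", platform="linux"))
```

Note that the resulting path is relative, so it only resolves when the script is started from the directory that contains the `envs` folder.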

Let me know how you get on!

maystroh commented 5 years ago

@Sohojoe Thanks for your effort. I was able to launch a training run using your Linux builds on my local machine, but I can't get it to work in my Docker image. I'm using your code with some minor modifications, and I always get this error in the Docker image:

Traceback (most recent call last):
  File "sb_train.py", line 128, in <module>
    env = UnityEnv(env_path, no_graphics=True, use_visual=False)
  File "~/Unity-Gym/gym-unity/gym_unity/envs/unity_env.py", line 34, in __init__
    self._env = UnityEnvironment(environment_filename, worker_id, visual_obs=use_visual, no_graphics=no_graphics)
  File "~/Unity-Gym/ml-agents/mlagents/envs/environment.py", line 58, in __init__
    self.executable_launcher(file_name, visual_obs, no_graphics)
  File "~/Unity-Gym/ml-agents/mlagents/envs/environment.py", line 188, in executable_launcher
    self.proc1 = subprocess.Popen([launch_string,'-nographics', '-batchmode','--port', str(self.port)])
  File "/opt/conda/lib/python3.6/subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "/opt/conda/lib/python3.6/subprocess.py", line 1344, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '~/Unity-Gym/test_marathon_envs/envs/hopper.x86': '~/Unity-Gym/test_marathon_envs/envs/hopper.x86'
srun: error: cn11: task 0: Exited with exit code 1

I could not figure out why the code can't find the envs, especially since the same code works on my local machine (without going through the Docker image). It might look like an easy problem, but when I put an environment I built myself in the 'envs' folder (the same folder as your environments), the code works like a charm. I'm still double-checking, but everything looks good so far.
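
A quick way to narrow down a `FileNotFoundError` like the one above is to inspect the path from inside the container before it reaches `subprocess.Popen`. The helper below is only a diagnostic sketch with a hypothetical name; it does not identify the cause in this thread. One fact it relies on: `Popen` does not expand `~` itself, so a literal tilde in the path would need `os.path.expanduser` first (the `~` in the traceback may simply be a shortened home directory).

```python
import os


def describe_executable(path):
    """Report basic facts about a path before handing it to subprocess.Popen."""
    # Expand a leading '~' explicitly, since Popen will not do it.
    expanded = os.path.expanduser(path)
    return {
        "expanded_path": expanded,
        "exists": os.path.exists(expanded),
        "is_file": os.path.isfile(expanded),
        # True only if the current user is allowed to execute the file.
        "is_executable": os.access(expanded, os.X_OK),
    }


if __name__ == "__main__":
    # Path taken from the traceback; adjust it to your own layout.
    for key, value in describe_executable("envs/hopper.x86").items():
        print(f"{key}: {value}")
```

If `exists` is false inside the container but true on the host, the build was probably not copied or mounted into the image at that location.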

maystroh commented 5 years ago

@Sohojoe Can you please briefly describe the reward strategy you are applying in the environment?

Sohojoe commented 5 years ago

Sure - This is for Hopper: does it help?

- velocity - the main reward signal, for positive movement to the right
- uprightBonus - reward signal for keeping the pelvis upright (helps a little with early training)
- effort / effortPenality - negative reward signal to reduce the amount of energy used
- jointsAtLimitPenality - negative reward signal to reduce extreme motor use (used in robotics to reduce wear on motors)

        float uprightBonus = GetForwardBonus("pelvis");
        float velocity = GetVelocity("pelvis");
        float effort = GetEffort();
        var effortPenality = 3e-1f * (float) effort;
        var jointsAtLimitPenality = GetJointsAtLimitPenality() * 4;

        var reward = velocity
                     + uprightBonus
                     - effortPenality
                     - jointsAtLimitPenality;
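
To check the arithmetic by hand, the reward above can be mirrored in a few lines of Python. This is a sketch, not code from the repo: the function and argument names are hypothetical, and only the constants `3e-1` and `4` are taken from the C# snippet.

```python
import math


def hopper_reward(velocity, upright_bonus, effort, joints_at_limit):
    """Combine the four Hopper reward terms as in the C# snippet."""
    effort_penalty = 0.3 * effort  # 3e-1f * effort
    joints_at_limit_penalty = joints_at_limit * 4  # GetJointsAtLimitPenality() * 4
    return velocity + upright_bonus - effort_penalty - joints_at_limit_penalty


if __name__ == "__main__":
    # Made-up inputs, chosen only to make the sum easy to follow:
    # 1.0 + 0.5 - 0.3 * 2.0 - 0.1 * 4
    reward = hopper_reward(1.0, 0.5, 2.0, 0.1)
    print(math.isclose(reward, 0.5))
```

With these weights, one unit of effort cancels 0.3 units of forward velocity, so the agent is rewarded for moving right only when the movement is worth its energy cost.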

I also have a tutorial for the main MarathonEnvironments code base: https://towardsdatascience.com/gettingstartedwithmarathonenvs-v0-5-0a-c1054a0b540c
