stanfordnmbl / osim-rl

Reinforcement learning environments with musculoskeletal models
http://osim-rl.stanford.edu/
MIT License
888 stars 249 forks

Obstacle's generation #61

Closed ViktorM closed 7 years ago

ViktorM commented 7 years ago

Looking at the latest leaderboard results, once an agent passes the 3 fairly regularly placed spheres, it can keep moving forward quite easily without meeting any new obstacles. Maybe it makes sense to generate obstacles along the whole way forward, and less regularly than they are generated now? Difficulty 2 should also not have just 3 obstacles in total; rather, 3 obstacles on average should be generated per some distance, 10 meters for example.

Also, the difficulty could be made to increase with distance; for example, the maximum possible size of the generated spheres could increase a bit every 10 meters. That point is optional, but generating obstacles along the whole road is quite important.
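The proposal above could be sketched roughly as follows. This is only an illustration, not osim-rl code: the function name `generate_obstacles`, the track length, the base radius and the radius step are all assumptions made for the example, and placing exactly 3 obstacles in every 10-meter segment is just the simplest way to get 3 per 10 meters on average.

```python
import random

def generate_obstacles(track_length=50.0, per_segment=3, segment=10.0,
                       base_radius=0.05, radius_step=0.005, seed=None):
    """Return a list of (x, radius) tuples sorted by x.

    Hypothetical sketch: all default values are assumptions.
    """
    rng = random.Random(seed)
    obstacles = []
    n_segments = int(track_length // segment)
    for i in range(n_segments):
        start = i * segment
        # The largest allowed radius grows a little with every segment.
        max_radius = base_radius + i * radius_step
        for _ in range(per_segment):
            x = rng.uniform(start, start + segment)  # irregular placement
            r = rng.uniform(0.5 * max_radius, max_radius)
            obstacles.append((x, r))
    return sorted(obstacles)

if __name__ == "__main__":
    for x, r in generate_obstacles(seed=0):
        print(f"x={x:6.2f}  r={r:.3f}")
```

Drawing positions uniformly inside each segment keeps the density fixed while removing the regular spacing; nothing here enforces a minimum gap between neighbouring obstacles.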

kidzik commented 7 years ago

That's definitely a good point. The problem, for now, is a performance issue when obstacles are regenerated. Because of the current structure of the physics model (not optimized for changes to the model during simulation), obstacles have to be regenerated each time, and that takes a long time. You can try it with:

env = RunEnv(max_obstacles = 20)

You will notice that calling reset() multiple times takes much more time than in an environment with 3 or no obstacles.

Therefore we compromised on the number of obstacles. It also seems late for non-critical changes to the challenge. We might, however, reconsider this for the second round if there are no objections from the qualified participants. Of course, in that case, we will provide the environment in advance.
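The reset-time comparison described above could be measured with a small harness like the one below. It is a sketch under assumptions: `mean_reset_time` and `FakeEnv` are hypothetical names, and `FakeEnv` is only a stand-in that sleeps in proportion to `max_obstacles`, so the script runs without osim-rl installed and its timings say nothing about the real environment.

```python
import time

def mean_reset_time(make_env, n_resets=5):
    """Average wall-clock seconds per env.reset() call."""
    env = make_env()
    start = time.perf_counter()
    for _ in range(n_resets):
        env.reset()
    return (time.perf_counter() - start) / n_resets

class FakeEnv:
    """Hypothetical stand-in for RunEnv (not the real environment)."""

    def __init__(self, max_obstacles=3):
        self.max_obstacles = max_obstacles
        self.resets = 0

    def reset(self):
        self.resets += 1
        # Pretend that rebuilding each obstacle costs a little time.
        time.sleep(0.001 * self.max_obstacles)

if __name__ == "__main__":
    for n in (0, 3, 20):
        t = mean_reset_time(lambda: FakeEnv(max_obstacles=n))
        print(f"max_obstacles={n}: {t:.4f} s per reset")
```

To time the real environment, the factory passed to `mean_reset_time` would construct `RunEnv(max_obstacles = n)` instead of `FakeEnv`.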

kidzik commented 7 years ago

We prepared difficulty = 3 in the new release candidate (https://github.com/stanfordnmbl/osim-rl/tree/ver1.5). This will be further tested and potentially used in the second round (yet to be determined). To enable the new difficulty level, run:

env = RunEnv(max_obstacles = 10)
env.reset(difficulty = 3)

Note that the default max_obstacles in RunEnv is still 3 because of some performance issues, which we are still investigating.

AdamStelmaszczyk commented 7 years ago

The agent gets to see the next obstacle only once obstacle.x + obstacle.r < pelvis.x holds for the current one, and there is no spacing between obstacles. This leads to situations like this:

There is a small obstacle at, e.g., x = 1 with r = 0.1, and the agent sees it. However, there is a giant second obstacle at x = 1.05. The agent will never see it, and it has no way to tell this unlucky situation apart from the more fortunate one (without "hidden" obstacles).

This will probably have even more impact on difficulty 3.

Fix ideas: the agent could see the next obstacle once obstacle.x < pelvis.x, and/or there could be some spacing between obstacles (e.g. 0.5), and/or there could be more observations.

kidzik commented 7 years ago

That's exactly how obstacles are generated at difficulty = 3: https://github.com/stanfordnmbl/osim-rl/blob/ver1.5/osim/env/run.py#L246 (the minimal distance between obstacles is 0.5).

ViktorM commented 7 years ago

@kidzik I took a look at the code for difficulty level 3, and it looks nice. But it seems the obstacles can be placed too regularly, with a maximum distance of 1.5 between them. A more irregular structure could be more interesting and spectacular. For example, have groups of 2-5 obstacles (a random number in that range) placed at a distance in (0.5, 1.5) from each other, as is currently the case, but with a larger distance between groups, say in the range (2.0, 4.0). That lets agents show not only obstacle clearing but also running over short flat stretches, and it could reduce the number of obstacles used, from 20 to 15 for example, if generating them is quite slow.

The second suggestion is to make the random radius of the generated obstacles grow slowly with their number, for example starting from the current value of 0.05 (or a slightly smaller 0.045) and rising to 0.06.
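A rough sketch of the grouped layout suggested above, for illustration only. The function name `grouped_obstacles`, the number of groups, the starting position and the per-group radius step are assumptions; only the 2-5 group size and the (0.5, 1.5) and (2.0, 4.0) distance ranges come from the suggestion itself.

```python
import random

def grouped_obstacles(n_groups=4, first_x=1.0, base_radius=0.05,
                      radius_step=0.002, seed=None):
    """Return a list of (x, radius) tuples in order of increasing x.

    Hypothetical sketch: n_groups, first_x and radius_step are assumptions.
    """
    rng = random.Random(seed)
    obstacles = []
    x = first_x
    for g in range(n_groups):
        radius = base_radius + g * radius_step  # grows slowly per group
        for k in range(rng.randint(2, 5)):      # 2-5 obstacles per group
            if k > 0:
                x += rng.uniform(0.5, 1.5)      # spacing inside a group
            obstacles.append((x, radius))
        x += rng.uniform(2.0, 4.0)              # flat stretch between groups
    return obstacles

if __name__ == "__main__":
    for x, r in grouped_obstacles(seed=1):
        print(f"x={x:6.2f}  r={r:.3f}")
```

Here the radius is constant within a group and steps up between groups; letting it grow per obstacle instead would be an equally valid reading of the suggestion.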

syllogismos commented 7 years ago

[Image hosted on Imgur]

I also see obstacles inside obstacles. I don't know whether that's bad or good; just pointing it out.

Also, is there a possibility of lowering the pelvis-height done condition before the final release? Lowering the threshold from 0.65 might even lead to better solutions for 'Learning how to run'.

ViktorM commented 7 years ago

To summarize, you can hit a few goals at the same time. Difficulty level 3 stays as compatible as possible with the current env at difficulty 2 if the first group of obstacles is exactly the same (3 obstacles in the range 1-5 with the same r=0.05), and each following group is created with r increased by some step: r=0.052, r=0.054 and so on. That also makes it more diverse and interesting.

And +1 to the request to lower the pelvis height condition a bit, at least to 0.6.

kidzik commented 7 years ago

@syllogismos thanks for pointing out! Is this difficulty=3? While this can happen in the old version, it shouldn't be possible in difficulty=3.

@ViktorM indeed, adding extra obstacles after the first group of obstacles makes sense

Changing many things definitely seems to make sense now that we have learned a lot about the environment and solutions. However, we will change as little as possible for compatibility reasons. After all, rules are exactly the same for everyone, so the details don't matter so much (other than having more captivating videos...).

We will definitely incorporate many changes in the environment after the challenge.

syllogismos commented 7 years ago

It's in the old version. I haven't checked out difficulty 3; I wasn't even aware of difficulty level 3 until today.

kidzik commented 7 years ago

That's right, many participants who don't follow GitHub issues wouldn't even know about these tweaks. This motivates reducing the number of changes to the absolute minimum. We are preparing difficulty=3 only for the second round of the challenge.

syllogismos commented 7 years ago

I agree completely that the rules are the same for everyone regarding the 0.65 height change. And it makes sense to keep the changes very minimal.

I'm only saying it might get better results for the Running challenge and this restriction might hinder better gaits that actually make the humanoid run the farthest.

AdamStelmaszczyk commented 7 years ago

@kidzik How much time roughly will be given to submit in the second round of the challenge (top 10 finals)?

kidzik commented 7 years ago

We don't have the exact protocol yet, but the number of submissions will be reduced to a minimum (probably just 1-3 submissions in total). The point is to test generalizability to unseen environments -- we want to avoid overfitting to the new parameters.

ViktorM commented 7 years ago

Hi @kidzik, are there any updates on the 1.5 release?