maxspahn / gym_envs_urdf

URDF environments for gym
https://maxspahn.github.io/gym_envs_urdf/
GNU General Public License v3.0

🏆 Adding Reward Object #180

Closed behradkhadem closed 1 year ago

behradkhadem commented 1 year ago

Hello everyone, hope you're all doing well. In this PR I'm trying to implement reward shaping capability in order to use it for reinforcement learning purposes (see https://github.com/maxspahn/gym_envs_urdf/issues/179).
The way I see it, the reward should be an object derived from an abstract class (called Reward) with an abstract method called calculateReward() that every subclass has to override.
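A minimal sketch of the design described above, using Python's standard abc module. Only the names Reward and calculateReward() come from this PR; the observation argument, the DistanceToGoalReward subclass, and the "position" key are assumptions made for illustration and may differ from the code that was eventually merged.

```python
from abc import ABC, abstractmethod


class Reward(ABC):
    """Abstract base class for reward objects."""

    @abstractmethod
    def calculateReward(self, observation: dict) -> float:
        """Return the scalar reward for the given observation."""


class DistanceToGoalReward(Reward):  # hypothetical subclass, not part of the PR
    def __init__(self, goal: tuple):
        self._goal = goal

    def calculateReward(self, observation: dict) -> float:
        # "position" is an assumed observation key, used for illustration only.
        position = observation["position"]
        squared = sum((p - g) ** 2 for p, g in zip(position, self._goal))
        # Negative Euclidean distance: the closer to the goal, the higher the reward.
        return -(squared ** 0.5)


reward = DistanceToGoalReward(goal=(1.0, 0.0))
value = reward.calculateReward({"position": (0.0, 0.0)})
```

Because calculateReward() is marked abstract, instantiating Reward directly raises a TypeError, so each concrete reward is forced to provide its own implementation.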

This is a work in progress and I'm seeking feedback along the way.

So far, the basics of the Reward object are implemented without breaking the examples. My next step is to access sensor data inside the implementation of calculateReward().

behradkhadem commented 1 year ago

@maxspahn Can you give me a hint or point me to an example that uses sensor data from a sensor object? In this notebook: https://github.com/behradkhadem/gym_envs_urdf/blob/reward/examples/point_robot_full_sensor.ipynb I have access to the sensor objects inside the third cell, but I don't understand what their input dictionaries mean. (If you want to run the code, note that you should uncomment the reward object inside the fourth cell.) Which abstract method should I use to read sensor data correctly? I tried the same thing for ObstacleSensor too but couldn't make it work.

maxspahn commented 1 year ago

Hi @behradkhadem ,

Thanks for your initiative and the suggestion on the implementation. I am convinced by the idea, but I propose some changes. Can you allow me to push to this branch? Then I can show you my changes. To do so, you have to tick the box on the right side of the PR that allows maintainers to make changes to the branch of your PR.

behradkhadem commented 1 year ago

Good to hear from you again, hope you're doing well. I'm a novice when it comes to applying OOP practices in python and I appreciate your help and feedback. I personally won't be able to cooperate for the next week or two due to personal problems. But I'm looking forward to contributing to this project. And the checkbox was already checked.

Thanks.

maxspahn commented 1 year ago

I commented on some of the changes I made, @behradkhadem. Let me know what you think when you have time for it.

behradkhadem commented 1 year ago

Hello again @maxspahn!

I tried to use your sample for training an RL agent and failed. I think the cause is that in some parts of the implementation (I think inside the observation dictionary) floats have dtype float64, while in other places they are float32. While training the agent I get this error:
RuntimeError: mat1 and mat2 must have the same dtype
Other than that, everything seems fine.
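The error text matches what PyTorch raises when a float64 tensor is multiplied with float32 weights. One possible workaround, sketched below, is to cast every floating-point entry of the observation to a single dtype before it reaches the agent. The helper cast_observation and the keys in the sample dictionary are hypothetical; they only imitate a nested observation and are not taken from the repository.

```python
import numpy as np


def cast_observation(observation, dtype=np.float32):
    """Recursively cast floating-point values in a nested dict to one dtype."""
    if isinstance(observation, dict):
        return {key: cast_observation(value, dtype) for key, value in observation.items()}
    array = np.asarray(observation)
    if np.issubdtype(array.dtype, np.floating):
        return array.astype(dtype)
    # Non-float entries (e.g. integer counters) are returned unchanged.
    return array


# Assumed layout, for illustration: one entry is float64, the other float32.
mixed = {
    "robot_0": {
        "joint_state": {
            "position": np.array([0.1, 0.2]),
            "velocity": np.array([0.0, 0.0], dtype=np.float32),
        }
    }
}
uniform = cast_observation(mixed)
```

After the cast, both arrays are float32, which is the default dtype of PyTorch layers. Fixing the dtype at the source, in the observation space and the sensors, would be the cleaner long-term solution.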

maxspahn commented 1 year ago

Hi @behradkhadem ,

I was already afraid of a potential error with the data types. I suggest changing all the data types to native python types, because numpy floats and integers are deprecated in the newer versions. Could we do this in another PR?

behradkhadem commented 1 year ago

> because numpy floats and integers are deprecated in the newer versions

Really? I didn't know that. Do NumPy functions (like matmul) work with plain Python floats and the like? And if so, wouldn't changing the data types make the tests fail?

And yes. If you want, merge this PR and I'll work on changing the data types in another branch.

Thanks for your efforts.

maxspahn commented 1 year ago

It was in numpy 1.20 https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations.
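For context, the deprecation listed in those release notes concerns the aliases np.float, np.int and similar, which were just other names for the Python builtins. Sized types such as np.float32 and np.float64 are not affected. A short sketch of what still works, including NumPy functions operating on plain Python floats:

```python
import numpy as np

# The builtin float maps to NumPy's float64 when used as a dtype.
zeros = np.zeros(3, dtype=float)

# An explicit precision is still requested through the sized scalar types.
single = np.zeros(3, dtype=np.float32)

# NumPy functions accept plain Python floats and nested lists of them.
product = np.matmul([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
```

Here zeros has dtype float64, single has dtype float32, and product is a float64 array, so switching to native Python types does not by itself remove the float64/float32 mismatch reported above.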

behradkhadem commented 1 year ago

> It was in numpy 1.20 https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations.

I don't think changing the data types will become a huge issue (other than making some tests fail). I could jump on it today or tomorrow afternoon.