AminHP / gym-anytrading

The most simple, flexible, and comprehensive OpenAI Gym trading environment (Approved by OpenAI Gym)

I think there are serious issues with this ENV. #24

Closed: chokosabe closed this issue 4 years ago

chokosabe commented 4 years ago

I was writing tests for this and it's becoming more and more clear that this gym has some serious deficiencies. I don't think anyone should be using it in production, and ideally your README would reflect that. At a base level, the only two actions and states are long or short, which is very wrong and interferes with whatever algorithm is being used for training. Many algorithms depend on a Gaussian action space, e.g. -1 or [0, 0, 1], 0 or [0, 0, 0], 1 or [1, 0, 0].

AminHP commented 4 years ago

Hi @chokosabe.

First of all, this is not a complete framework intended for production environments. It is a lightweight development tool for researchers and traders to test their own algorithms as simply and quickly as possible.

Generally, my purpose in creating this env was to build an extendable env that everyone can extend easily according to their needs. Therefore, it's not so important that there are only two actions in this library; the important thing is that you can fork it and add your custom actions in a day. However, I still think two actions are enough to start training an RL agent and to focus on improving the agent's behavior.

About your last sentence: you can write two functions to convert the actions to one-hot (or other formats) and vice versa. It's not a big deal!
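
For instance, a minimal sketch of such conversion helpers could look like this. These helpers are hypothetical (not part of gym-anytrading), and I'm assuming the env's two discrete actions are 0 = Sell and 1 = Buy:

```python
import numpy as np

# Hypothetical helpers (not part of gym-anytrading) that map the env's two
# discrete actions (assumed: 0 = Sell, 1 = Buy) to a one-hot vector and back,
# so an agent that expects one-hot actions can still drive the env.

def action_to_one_hot(action, n_actions=2):
    one_hot = np.zeros(n_actions, dtype=np.float32)
    one_hot[action] = 1.0
    return one_hot

def one_hot_to_action(one_hot):
    return int(np.argmax(one_hot))

# Example: the agent outputs [0., 1.] -> the env receives action 1 (Buy)
assert one_hot_to_action(action_to_one_hot(1)) == 1
```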

chokosabe commented 4 years ago

The problems extend far beyond what I mentioned above. You have things like _calculate_reward in StocksEnv only producing a reward when the current position is Long (which isn't at all representative). The env needs quite a bit of work before you can really say it's even ready to use as a starting base. It's potentially misleading...
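
Roughly, the pattern I mean looks like the sketch below. This is simplified from my reading of StocksEnv._calculate_reward, not the verbatim code, and the internal attribute names (self.prices, self._position, self._current_tick, self._last_trade_tick) follow my reading of the env internals:

```python
from gym_anytrading.envs import Actions, Positions

# Simplified sketch of the reward logic being criticized: a reward is only
# produced when a trade closes a Long position; Short trades yield nothing.
def _calculate_reward(self, action):
    step_reward = 0.0

    trade = (
        (action == Actions.Buy.value and self._position == Positions.Short) or
        (action == Actions.Sell.value and self._position == Positions.Long)
    )

    if trade:
        current_price = self.prices[self._current_tick]
        last_trade_price = self.prices[self._last_trade_tick]
        if self._position == Positions.Long:   # Short positions get no reward
            step_reward += current_price - last_trade_price

    return step_reward
```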

AminHP commented 4 years ago

Well, as I mentioned earlier, this library is not supposed to be a complete framework with 10 or 20 different _calculate_reward functions. The existing function is just a sample that shows how you can calculate a reward. Writing a proper reward function or extracting good signal features is something that users should take care of themselves.
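
As an example, here is a minimal sketch of how a user could plug in their own reward logic by subclassing StocksEnv. The specific reward shown (signed one-step price change, rewarding Short as well as Long) is just an illustration, and the internal attribute names follow my reading of the env and may differ:

```python
from gym_anytrading.envs import StocksEnv, Positions

class MyStocksEnv(StocksEnv):
    # Sketch of a custom reward: the signed one-step price change of the
    # currently held position, rewarding Short as well as Long positions.
    def _calculate_reward(self, action):
        price_diff = (self.prices[self._current_tick]
                      - self.prices[self._current_tick - 1])
        if self._position == Positions.Long:
            return price_diff
        if self._position == Positions.Short:
            return -price_diff
        return 0.0

# Usage (my_df is your OHLCV DataFrame):
# env = MyStocksEnv(df=my_df, window_size=10, frame_bound=(10, len(my_df)))
```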

However, about a year ago, I managed to get good results with this environment and a simple DQN agent. So, I can't agree with you that "this library needs quite a bit of work before ...". This library works as is, provided you know how to train an RL agent properly.