So I guess for now TensorFlow <2.0 and Python <=3.7 are required.
Yes, Stable Baselines only supports TF 1.x for now. There are some TF 2 forks, but I'm not sure any of them are far enough along yet.
The stable-baselines library currently supports TensorFlow versions 1.8.0 through 1.15.0 and does not work on TensorFlow 2.0.0 and above. Source
Use this command to install TensorFlow 1.15:
pip install --upgrade tensorflow==1.15.0
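If you are not sure which TensorFlow ended up in your environment, a quick sanity check (plain Python, nothing project-specific) is:

```python
# Check that the installed TensorFlow is a 1.x release, which the
# TF1-based stable-baselines requires (1.8.0 - 1.15.0).
import tensorflow as tf

assert tf.__version__.startswith("1."), (
    "stable-baselines needs TF 1.8.0-1.15.0, found " + tf.__version__
)
```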
@tinof Stable Baselines now supports TF2: https://github.com/DLR-RM/stable-baselines3
I am afraid that is a PyTorch version rather than TF2. And last time I checked, that version was still at an early stage, so it lacks several major features such as multiprocessing.
Yes, you are right. I was able to run freqtrade-gym with stable-baselines3 after some slight modifications. Btw, awesome project. To be honest, I am still kinda new to OpenAI Gym, stable_baselines, Ray, etc.
As I am still kinda new to reinforcement learning, I would need some insight/advice from the experts. Can you enlighten me on this portion of code in freqtradegym.py?
Why is self._reward set to zero at the beginning of the function, and why does it have to be reset to zero when self._reward > 1.5?
By the way, I would like to implement a Sortino/Omega reward scheme instead of the profit-only reward scheme. Can anyone kindly highlight which part of the code I should dive into? Is it the observation() function in freqtradegym.py?
Why is self._reward set to zero at the beginning of the function, and why does it have to be reset to zero when self._reward > 1.5?
Ah, I forgot why I did that. Maybe I was just trying something out. This is an experimental project, so feel free to remove it and do more experiments.
By the way, I would like to implement a Sortino/Omega reward scheme instead of the profit-only reward scheme. Can anyone kindly highlight which part of the code I should dive into? Is it the observation() function in freqtradegym.py?
You can check it out here.
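For anyone else looking into this, here is a minimal sketch of what a Sortino- or Omega-style reward could look like. These helpers are hypothetical and not part of freqtradegym.py; you would still have to wire them into the reward logic that the link above points to:

```python
import numpy as np

def sortino_reward(step_returns, target=0.0, eps=1e-9):
    # Sortino ratio over a window of per-step returns: mean excess return
    # divided by downside deviation (only returns below the target count).
    excess = np.asarray(step_returns, dtype=np.float64) - target
    downside = np.minimum(excess, 0.0)
    downside_dev = np.sqrt(np.mean(downside ** 2)) + eps
    return excess.mean() / downside_dev

def omega_reward(step_returns, target=0.0, eps=1e-9):
    # Omega ratio: sum of gains above the target divided by sum of losses below it.
    excess = np.asarray(step_returns, dtype=np.float64) - target
    gains = excess[excess > 0].sum()
    losses = -excess[excess < 0].sum() + eps
    return gains / losses
```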
Yes, you are right. I was able to run freqtrade-gym with stable-baselines3 after some slight modifications. Btw, awesome project. To be honest, I am still kinda new to OpenAI Gym, stable_baselines, Ray, etc.
Man, how did you do that? I'm trying to run it with stable-baselines3, but I've got some problems with the libraries, like ACER. Maybe you can share your code for the gym, strategy, and deep_rl?
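Not the author's code, but a minimal sketch of how training with stable-baselines3 could look. ACER is not implemented in SB3, so PPO is used here instead; the TradingEnv import and its constructor argument are assumptions and will likely need adjusting to the actual class in freqtradegym.py:

```python
from stable_baselines3 import PPO

# Hypothetical import and constructor; match these to the real env class
# and config handling in freqtradegym.py.
from freqtradegym import TradingEnv

env = TradingEnv("config.json")  # whatever config the env actually expects
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)
model.save("ppo_freqtrade")
```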
What an interesting project, thanks for this. I've tried to run the demo to train an agent, but got this error:
Is it related to this?
`tensorflow.contrib is being removed in version 2.0`
If so, I could not fix it with this: https://github.com/deetungsten/stable-baselines
Any other ideas? Should I just revert to Python 3.7 and TensorFlow <2.x?