Closed. Alex2782 closed this pull request 11 months ago.
The warning for flattened observations can be ignored; it's more of a best-practices recommendation for RL algorithms than an actual requirement.
Do you know what options are passed to reset()? If possible, it's best to change that for future compatibility, as the message says. Seeding can be done via the reset(seed) argument (directly calling seed() is deprecated).
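As a minimal sketch of the new-style API (illustrative only, not the actual gym-anytrading code), seeding through reset(seed=...) looks like this:

```python
import random

class MiniEnv:
    """Toy environment sketching the Gymnasium-style reset() API.
    Illustrative only -- not the actual gym-anytrading implementation."""

    def __init__(self):
        self._rng = random.Random()

    def reset(self, seed=None, options=None):
        # New style: the seed is passed to reset() instead of calling a
        # separate seed() method, and reset() returns (observation, info).
        if seed is not None:
            self._rng = random.Random(seed)
        observation = self._rng.random()
        info = {}
        return observation, info

env = MiniEnv()
obs1, _ = env.reset(seed=42)
obs2, _ = env.reset(seed=42)
assert obs1 == obs2  # same seed, same initial observation
```

The same pattern carries over to real Gymnasium environments: resetting twice with the same seed reproduces the initial observation.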
In the second part I tried the "REINFORCE" example with 'gym-anytrading' and added more changes.
https://gymnasium.farama.org/tutorials/training_agents/reinforce_invpend_gym_v26/
In the next few days I will try to add more examples and check some calculations ('max_possible_profit' / 'avg_profit') more precisely. 'total_reward' is often positive, but 'profit' is less than 1 (a loss?).
I am trying to take the 'observation' and 'rewards' functions from 'gym-mtsim'; with 'SB3' the models are not trained properly.
The 'evaluate' and 'predict' functions always return the same value (only 0 or 1 the whole time). I also tried the original versions without 'gymnasium' support in a virtual environment; the results were identical.
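One quick way to spot such a collapsed policy (a hypothetical helper, not part of this PR) is to measure how dominant the most frequent predicted action is:

```python
from collections import Counter

def top_action_share(actions):
    """Fraction taken by the most common action; a value near 1.0
    suggests the policy has collapsed to a single prediction."""
    counts = Counter(actions)
    return counts.most_common(1)[0][1] / len(actions)

# A policy that always predicts 0, like the behaviour described above:
assert top_action_share([0] * 100) == 1.0
# A healthier mix of actions:
assert top_action_share([0, 1] * 50) == 0.5
```

Running this over the actions returned by predict() across an evaluation episode makes it easy to distinguish "always the same output" from a genuinely varied policy.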
https://github.com/DLR-RM/stable-baselines3/pull/1327 (+ 'gymnasium' support)
For comparison, I could no longer successfully install 'stable_baselines' (as used in the example https://github.com/AminHP/gym-anytrading/blob/master/examples/a2c_quantstats.ipynb) under macOS 13.
https://github.com/Alex2782/gym-anytrading/blob/master/examples/SB3_quantstats.ipynb
'observation_space' is now initialized with 'INF = 1e10', as in the 'gym-mtsim' project, instead of 'np.inf' (see gym_anytrading/envs/trading_env.py).
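The effect of that change can be sketched with plain numpy (the shape below is hypothetical, not the actual one in trading_env.py):

```python
import numpy as np

INF = 1e10  # large but finite stand-in for np.inf, as in gym-mtsim

shape = (10, 2)  # hypothetical (window_size, n_features) shape
low = np.full(shape, -INF, dtype=np.float32)
high = np.full(shape, INF, dtype=np.float32)

# With finite bounds, (high - low) is finite everywhere, which wrappers
# such as observation normalization can rely on; np.inf bounds would
# make that range infinite.
assert np.all(np.isfinite(high - low))
```

These bounds arrays would then be passed as the low/high of the Box observation space.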
'SB3'-predict: A2C looks better now
https://github.com/Alex2782/gym-anytrading/blob/master/examples/SB3_quantstats.ipynb
Cool to see you got it working!
new example: train_SB3_gymnasium.py
Avg. rewards for supported SB3 agents x learning_timesteps = [5K, 10K, 25K], compared with "Random actions".
Hi guys. Apologies for my delayed participation in this PR. Thanks @Alex2782 for the PR, the changes are neat and concise. The additional examples are also quite useful; the project needed more examples with newer versions of stable-baselines.
Please let me know whenever you have finished. I'm looking forward to reviewing and testing it.
Hi @AminHP,
3 weeks ago I could train successfully with "REINFORCE": https://github.com/Alex2782/gym-anytrading/blob/master/examples/train_REINFORCE.py
The stable release of SB3 with "gymnasium" support has not been published yet. https://github.com/DLR-RM/stable-baselines3/pull/1327#issuecomment-1508626600
Hi @Alex2782 , it seems they have just released a new version of stable-baselines3 with Gymnasium support. Can we proceed based on that?
Have you seen this?
https://github.com/AminHP/gym-anytrading/issues/87 That's where I got the sample code from; not sure if this is the same SB3 you're talking about, though.
Hope that helps
Kind regards
On Sun, 25 Jun 2023 at 19:31, Elliot Tower wrote:
For what it’s worth we would love to have some sample code using gymnasium and SB3 for making a tutorial, if you can get it working let me know (I haven’t had time to mess with it myself)
Makes sense. I realized they actually updated their entire documentation website, so now everything is Gymnasium; therefore we can just link to that.
Hi @AminHP,
Maybe next week, or in two weeks, I will have more time. Would it be better to create a new pull request with fewer examples?
Hi @AminHP
I have tried the SB3 release version: pip install 'stable-baselines3[extra]'
A2C and PPO work so far; after 25K learning steps they achieve higher rewards than "Random Actions".
train_sb3_gymnasium.py
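The comparison against "Random Actions" boils down to averaging per-episode rewards. A minimal sketch (the numbers below are placeholders, not the actual benchmark results):

```python
def mean_reward(episode_rewards):
    """Average total reward over a list of evaluation episodes."""
    return sum(episode_rewards) / len(episode_rewards)

# Hypothetical per-episode reward totals -- illustrative only:
a2c_25k_rewards = [12.0, 9.5, 14.2, 11.1]
random_action_rewards = [1.1, -0.4, 0.6, 0.2]

# A trained agent should clear the random baseline on average:
assert mean_reward(a2c_25k_rewards) > mean_reward(random_action_rewards)
```

In the actual script, the two reward lists would be collected by rolling out the trained SB3 model and a random policy over the same evaluation episodes.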
Hi @Alex2782 , great job! Thanks for the update. I will review it soon and we will discuss it.
I have tested all the examples and everything works fine. Thanks again for your contribution and effort.
About the examples, I agree with you. I also think it is better to use Jupyter notebooks for the examples instead of Python scripts, so users can comprehend the examples more easily. Besides, we can merge the code presented in this PR and provide two examples: a2c_quantstats.ipynb (which is the updated version of my code, as you have already implemented in SB3_quantstats.ipynb) and reinforce.ipynb (which trains and evaluates multiple RL algorithms). Note that the random_actions example seems unnecessary, as it is available in the README file.
What's your opinion @Alex2782 ?
Hi @AminHP,
I changed to a 2D shape and removed _observation_cache.
Four months ago I first trained the changes with the REINFORCE example. The example did not work with the 2D shape:
https://gymnasium.farama.org/tutorials/training_agents/reinforce_invpend_gym_v26/
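For what it's worth, a 1-D view of a 2-D observation window can be obtained by flattening. A generic numpy sketch with hypothetical dimensions (not the PR's code):

```python
import numpy as np

window_size, n_features = 10, 2  # hypothetical dimensions
obs_2d = np.arange(window_size * n_features, dtype=np.float32).reshape(
    window_size, n_features
)

# Networks that expect a flat input vector (as in the REINFORCE
# tutorial) need the 2-D window flattened first:
obs_1d = obs_2d.reshape(-1)
assert obs_1d.shape == (window_size * n_features,)
```

Gymnasium also ships a FlattenObservation wrapper for exactly this purpose, which avoids changing the environment itself.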
I have also removed the random_actions and REINFORCE examples.
Unfortunately, I don't remember what exactly I did to the "ipynb" files; after execution the file was completely overwritten.
@AminHP
the render function has been revised:
https://gymnasium.farama.org/content/migration-guide/#environment-render
I am not sure if more changes are necessary.
In the 'gym-anytrading' project, rendering can still be executed if 'render_mode' is configured correctly.
In the 'gym-mtsim' project it was not compatible at all yet: https://github.com/AminHP/gym-mtsim/pull/40
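Under the revised API, the mode is fixed when the environment is created and render() no longer takes a mode argument. A minimal sketch (illustrative only, not the gym-anytrading implementation):

```python
class MiniRenderEnv:
    """Sketch of the Gymnasium render API: render_mode is chosen at
    construction time and render() takes no arguments.
    Illustrative only -- not the gym-anytrading implementation."""

    metadata = {"render_modes": ["human", "rgb_array"]}

    def __init__(self, render_mode=None):
        assert render_mode is None or render_mode in self.metadata["render_modes"]
        self.render_mode = render_mode

    def render(self):
        if self.render_mode == "rgb_array":
            return [[(0, 0, 0)]]  # placeholder 1x1 black frame
        # "human" mode would draw to a window and return nothing
        return None

env = MiniRenderEnv(render_mode="rgb_array")
assert env.render() is not None
```

This is why a correctly configured 'render_mode' matters: an environment created without one has nothing to draw when render() is called.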
I have tested the examples and everything works fine. Thanks @Alex2782 for patiently addressing and resolving the matters. Congratulations on your first contribution to this repo!
I will update the docs and make some minor changes later.
I haven't tested everything yet; 'stocks-v0' with render_mode="human" already works.
There are still 2 warnings: