tensortrade-org / tensortrade

An open source reinforcement learning framework for training, evaluating, and deploying robust trading agents.
https://discord.gg/ZZ7BGWh
Apache License 2.0

use tensorboard to monitor training/testing #300

Open · AlexQuant62 opened this issue 3 years ago

AlexQuant62 commented 3 years ago

As per the subject: use TensorBoard to monitor training/testing instead of the TT renderers. If using Ray, implement a custom callback class. See an issue/example here: https://github.com/ray-project/ray/issues/7871
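For reference, here is a minimal sketch of such a callback, assuming a Ray 1.x import path (newer releases moved it to `ray.rllib.algorithms.callbacks`) and assuming the TensorTrade env exposes its portfolio as `env.action_scheme.portfolio`:

```python
from ray.rllib.agents.callbacks import DefaultCallbacks  # ray.rllib.algorithms.callbacks on newer Ray


class NetWorthCallbacks(DefaultCallbacks):
    """Log portfolio net worth as a custom metric at the end of each episode.

    RLlib aggregates everything placed in `episode.custom_metrics`
    (mean/min/max) and writes it to TensorBoard next to the built-in metrics.
    """

    def on_episode_end(self, *, worker, base_env, policies, episode, env_index, **kwargs):
        # Grab the underlying TradingEnv this episode ran in
        # (older Ray versions use base_env.get_unwrapped() instead).
        env = base_env.get_sub_environments()[env_index]
        # Assumes the TensorTrade env exposes its portfolio here.
        episode.custom_metrics["net_worth"] = env.action_scheme.portfolio.net_worth
```

Registering it with `config["callbacks"] = NetWorthCallbacks` makes the metric show up in TensorBoard as `custom_metrics/net_worth_mean` (plus `_min`/`_max`).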

rhamnett commented 3 years ago

Please provide a specific TT example (use the Ray tutorial as a base if you need one): https://www.tensortrade.org/en/latest/tutorials/ray.html

AlexQuant62 commented 3 years ago

I created a notebook: https://github.com/AlexQuant62/test01/blob/main/ray_with_callbacks.ipynb

The class definition for the callbacks is from: https://github.com/ray-project/ray/blob/0c80efa2a37f482494fbffbe9e81f61586b03ecb/rllib/examples/custom_metrics_and_callbacks.py

rhamnett commented 3 years ago

@AlexQuant62 Hi, this looks like some code I wrote a while back. Did you find it somewhere? Just curious, as I have a more comprehensive version I can share.

AlexQuant62 commented 3 years ago

@rhamnett Hi, no, I didn't find your code. Please share. I'm more or less a novice in Python ;)

rhamnett commented 3 years ago

Will do, just waiting for a bug that I reported to be fixed in Ray :)

carlogrisetti commented 3 years ago

@rhamnett can you link the issue in Ray?

carlogrisetti commented 2 years ago

@AlexQuant62 did you have a look at the most recent example? https://www.tensortrade.org/en/latest/examples/train_and_evaluate_using_ray.html

However, the thing you are suggesting is one of the features I would like to implement in the future: having a "balance" or "net_worth" value logged in TensorBoard, showing the "human" outcome of the training. This can be useful during training, but even more useful during evaluation. As an added bonus, to leverage TensorBoard's hyperparameter optimization tab (HPARAMS should be its title, if I'm not mistaken), I would like to log the maximum reward value reached, on the assumption that you have that checkpoint saved and can use it.
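If it helps as a starting point, here is a minimal sketch of writing to the HPARAMS tab directly through the TensorBoard hparams plugin via PyTorch's `SummaryWriter` (the hyperparameter names and metric values below are purely hypothetical):

```python
from torch.utils.tensorboard import SummaryWriter

# Hypothetical hyperparameters and summary metrics from one finished trial.
hparams = {"lr": 1e-4, "gamma": 0.99, "window_size": 30}
metrics = {"hparam/max_reward": 152.7, "hparam/final_net_worth": 11834.2}

with SummaryWriter(log_dir="logs/hparams_demo") as writer:
    # add_hparams() writes one row to TensorBoard's HPARAMS tab,
    # pairing the trial's hyperparameters with its summary metrics.
    writer.add_hparams(hparams, metrics)
```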

carlogrisetti commented 2 years ago

I couldn't manage to get a custom metric logged in the evaluation stage. It can be done during the training itself, but not during evaluation, since there's no "on_evaluation_end" or similar callback in Ray.

Will try to look into that a little bit more.
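In the meantime, one possible workaround is to run the evaluation rollout yourself after training and write the metric with a plain `SummaryWriter`. A sketch, assuming a Ray 1.x `trainer` (`compute_single_action` is `compute_action` on older releases), an old-style gym env returning 4-tuples, and the usual TensorTrade portfolio attribute:

```python
from torch.utils.tensorboard import SummaryWriter


def evaluate_and_log(trainer, env, writer, step):
    """Roll out the trained policy once, greedily, and log the outcome."""
    obs = env.reset()
    done = False
    total_reward = 0.0
    while not done:
        action = trainer.compute_single_action(obs, explore=False)
        obs, reward, done, info = env.step(action)
        total_reward += reward
    writer.add_scalar("evaluation/total_reward", total_reward, step)
    # Assumes the TensorTrade env exposes its portfolio here.
    writer.add_scalar("evaluation/net_worth", env.action_scheme.portfolio.net_worth, step)


# Usage sketch:
# writer = SummaryWriter(log_dir="logs/evaluation")
# evaluate_and_log(trainer, eval_env, writer, step=training_iteration)
```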