kundtx / lfd2022-comments

Learning from Data (Fall 2022) #29

Open kundtx opened 1 year ago

kundtx commented 1 year ago

http://8.129.175.102/lfd2022fall-poster-session/19.html

RyanGreen11 commented 1 year ago

G8 Ai Ran: I see that there are different reinforcement learning agents (DDPG, TD3, and SAC). Which one makes the best profit for users (in other words, earns the most money)?

NineAbyss commented 1 year ago

@RyanGreen11

G19 Peisong Wang: Thank you for your question! In our experiments, SAC earned the most money :) In addition, its Sharpe ratio is also the highest, which means it balances return and risk well. However, its max drawdown is slightly higher than DDPG's. (In practice, a threshold is sometimes set to control the max drawdown: if the drawdown exceeds this fixed threshold, the held shares are sold to avoid further losses.)
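For readers unfamiliar with the metrics mentioned above, here is a minimal Python sketch of how the Sharpe ratio, the max drawdown, and a drawdown-based stop threshold are commonly computed. It is not the code used for the poster: the function names, the 252-trading-day annualization, and the zero risk-free rate are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period simple returns.

    risk_free_rate is an annual rate; 252 periods per year is an assumption.
    """
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    sd = stdev(excess)
    if sd == 0:
        return 0.0
    return mean(excess) / sd * math.sqrt(periods_per_year)


def max_drawdown(values):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst


def first_stop_index(values, threshold):
    """Index where the drawdown from the running peak first exceeds
    the threshold, or None if it never does (hypothetical stop rule)."""
    peak = values[0]
    for i, v in enumerate(values):
        peak = max(peak, v)
        if (peak - v) / peak > threshold:
            return i
    return None


portfolio = [100, 120, 90, 110, 130, 104]
print(max_drawdown(portfolio))           # fraction lost from the worst peak
print(first_stop_index(portfolio, 0.2))  # where a 20% stop would trigger
```

A higher Sharpe ratio means more return per unit of volatility, so an agent can have the best Sharpe ratio and still show a slightly deeper max drawdown, as described for SAC above.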

RyanGreen11 commented 1 year ago

@NineAbyss

G8 Ai Ran: Okay, I got it, thank you!

uprightman47 commented 1 year ago

G3 Siqi Chen: I think this is wonderful work! Could you explain more about the three plots in the right column? What do the colored blocks in these plots mean?

NineAbyss commented 1 year ago

@uprightman47

Group19 Peisong Wang: Thank you for your question :) Each figure on the right shows the top 5 drawdowns during the backtest period. The width of a colored block shows the time range of that drawdown, and the darker the color, the larger the drawdown. These three figures give us some interesting information. For example, the DDPG agent gets through the biggest crisis faster. I hope this explanation helps!
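To make the "top 5 drawdowns" idea concrete, here is a rough Python sketch of how drawdown periods can be extracted from a portfolio value series and ranked by depth. It is an illustration only, not the plotting code behind the poster; the function names and the tuple layout are hypothetical.

```python
def drawdown_periods(values):
    """Split a value series into drawdown periods.

    Returns tuples (peak_idx, trough_idx, end_idx, depth), where end_idx is
    the recovery index (or the last index if the series never recovers) and
    depth is the peak-to-trough loss as a fraction of the peak.
    """
    periods = []
    peak_idx = 0
    trough_idx = 0
    in_drawdown = False
    for i in range(1, len(values)):
        v = values[i]
        if v >= values[peak_idx]:
            # Back at or above the previous peak: close any open drawdown.
            if in_drawdown:
                depth = (values[peak_idx] - values[trough_idx]) / values[peak_idx]
                periods.append((peak_idx, trough_idx, i, depth))
                in_drawdown = False
            peak_idx = i
            trough_idx = i
        else:
            if not in_drawdown or v < values[trough_idx]:
                trough_idx = i
            in_drawdown = True
    if in_drawdown:
        depth = (values[peak_idx] - values[trough_idx]) / values[peak_idx]
        periods.append((peak_idx, trough_idx, len(values) - 1, depth))
    return periods


def top_drawdowns(values, n=5):
    """The n deepest drawdown periods, deepest first."""
    return sorted(drawdown_periods(values), key=lambda p: p[3], reverse=True)[:n]


portfolio = [100, 120, 90, 110, 125, 130, 104, 117, 131, 128]
for peak, trough, end, depth in top_drawdowns(portfolio, n=2):
    print(f"from {peak} to {end} (width {end - peak}), depth {depth:.1%}")
```

In a plot like the one described, `end - peak` would correspond to the width of a colored block and `depth` to how dark it is.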

ErlindaQiao commented 1 year ago

G3 Xizi Qiao: This seems like interesting work :D May I ask what DDPG, TD3, and SAC each stand for? And what are the main differences among these three algorithms?

uprightman47 commented 1 year ago

@NineAbyss

Thank you, I have got it! Your explanation is very clear (:D)∠).

NineAbyss commented 1 year ago

@ErlindaQiao

Group19 Peisong Wang: Thank you for your question! These are three advanced RL algorithms. Sorry for not explaining them in the poster; there was not enough space. DDPG stands for Deep Deterministic Policy Gradient, TD3 stands for Twin Delayed DDPG, and SAC stands for Soft Actor-Critic. I won't go into the details of these three algorithms here; you can find them at https://spinningup.openai.com/en/latest/index.html. I can briefly describe the differences among them, though. DDPG extends DQN, and we can think of it as the continuous-action version of DQN. It is very sensitive to hyperparameters. TD3 uses several techniques to address this problem. SAC introduces entropy as a regularizer. SAC uses a stochastic policy, while DDPG uses a deterministic policy.
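As a toy illustration of these differences, the Python sketch below contrasts a deterministic policy with a stochastic one and shows two of the ideas in simplified form: TD3's use of the smaller of two critic estimates, and SAC's entropy bonus. Everything here is an assumption made for illustration (a linear policy, a one-dimensional action, the plain Gaussian entropy instead of the entropy of the squashed distribution, `alpha=0.2`); it is not the agents' actual implementation.

```python
import math
import random


def deterministic_action(state, weights):
    """DDPG/TD3-style policy: the same state always gives the same action."""
    return math.tanh(sum(w * s for w, s in zip(weights, state)))


def stochastic_action(state, weights, log_std, rng):
    """SAC-style policy: sample from a Gaussian, then squash into (-1, 1)."""
    mean = sum(w * s for w, s in zip(weights, state))
    return math.tanh(rng.gauss(mean, math.exp(log_std)))


def gaussian_entropy(log_std):
    """Differential entropy of a 1-D Gaussian with the given log std."""
    return 0.5 * math.log(2 * math.pi * math.e) + log_std


def td3_target(reward, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: take the smaller of two critic estimates
    to reduce the overestimation that makes DDPG brittle."""
    return reward + gamma * min(q1_next, q2_next)


def soft_reward(reward, log_std, alpha=0.2):
    """Entropy-regularized reward: alpha * policy entropy is added as a bonus."""
    return reward + alpha * gaussian_entropy(log_std)


state, weights = [1.0, 2.0], [0.5, -0.2]
rng = random.Random(0)
print(deterministic_action(state, weights))         # identical on every call
print(stochastic_action(state, weights, 0.0, rng))  # varies from call to call
print(stochastic_action(state, weights, 0.0, rng))
```

Because the deterministic policy never varies its action for a given state, DDPG and TD3 rely on noise added to the action during training for exploration, whereas the stochastic policy explores through its own sampling.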