Hi Yvictor
Thanks for your TradingGym project, which is really interesting and helpful.
I'm a bit unclear about two things in trading_env.py.
(1) Will the line [next_index = self.step_st+self.obs_len+1] in the self.step function result in a blank (skipped) trading day?
Suppose obs_len = 10 and step_len = 5. The initial self.obs_res = self.obs_features[0:10], and next_index = self.step_st+self.obs_len+1 = 11, so where is the 10th day's info? Python slicing excludes the end index, so [0:10] covers only indices 0 through 9. Is this a bug, or am I missing something?
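To illustrate what I mean, here is a toy sketch with a plain list standing in for self.obs_features. It is not the project's code: the names features, window, and skipped are made up for the example, and I assume step_st = 0 on the first step.

```python
obs_len = 10
step_st = 0  # assumed starting offset on the first step

# Stand-in for self.obs_features: each value equals its own row index.
features = list(range(30))

# The slice end is exclusive, so the window holds indices 0..9.
window = features[step_st:step_st + obs_len]

first_after_window = step_st + obs_len   # first index NOT in the window
next_index = step_st + obs_len + 1       # the expression quoted above

# Indices that are neither in the window nor equal to next_index.
skipped = list(range(first_after_window, next_index))
print(window[-1], next_index, skipped)
```

If next_index were step_st + obs_len instead, the skipped list would be empty.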
(2) Would it be nicer if the reward_ret value were a percent return rather than an absolute value? Something like `self.reward_fluctuant = (self.price_current*self.position_share - self.transaction_details.iloc[-1]['price_mean']*self.position_share - self.fee*abs_pos) / self.transaction_details.iloc[-1]['price_mean']`
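A minimal sketch of that formula, with plain numbers standing in for the environment's attributes. The function name percent_reward and the sample values are hypothetical, and I assume abs_pos is the absolute value of position_share.

```python
def percent_reward(price_current, price_mean, position_share, fee):
    """Absolute PnL net of fees, divided by the mean entry price."""
    abs_pos = abs(position_share)  # assumption: abs_pos = |position_share|
    absolute = (price_current * position_share
                - price_mean * position_share
                - fee * abs_pos)
    return absolute / price_mean


# Long two shares entered at 100, now priced at 105, fee of 0.1 per share.
example_long = percent_reward(105.0, 100.0, 2, 0.1)
# Short two shares entered at 100, now priced at 95, same fee.
example_short = percent_reward(95.0, 100.0, -2, 0.1)
print(example_long, example_short)
```

Note that dividing by price_mean alone makes the result scale with position size. Dividing by price_mean * abs_pos would give a per-share percentage, if that is what is wanted.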
By the way, I notice that every step returns a reward that is actually a stock value (the cumulative return) rather than a flow value (the interval return). I am not sure which is more reasonable. Would you mind explaining this? It would be really appreciated. Thanks for your code and time.
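To make the stock versus flow distinction concrete, here is a small sketch with made-up numbers. The list cumulative is a hypothetical unrealized PnL of one open position at each step, not output from the environment.

```python
# Stock reward: cumulative PnL of the open position at each step (made-up values).
cumulative = [0.0, 1.0, 3.0, 2.0]

# Flow reward: change in cumulative PnL since the previous step.
previous = [0.0] + cumulative[:-1]
interval = [cur - prev for prev, cur in zip(previous, cumulative)]

# An agent that sums per-step rewards sees very different totals.
total_if_stock = sum(cumulative)  # early gains are counted again at every later step
total_if_flow = sum(interval)     # telescopes to the final cumulative PnL
print(interval, total_if_stock, total_if_flow)
```

With flow rewards, the sum over an episode equals the final PnL. With stock rewards, the sum grows with how long a gain is held, which is why I am unsure which one is intended.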
Have a good day.