@MorleyMinde ,
Ensure your agent is choosing actions other than 'hold'. If it doesn't open and close positions, the reward will be zero. The simplest way to check this is to run a randomly acting agent:
action = env.action_space.sample() # samples random action from action space
This is also a useful sanity check of your broker setup: a randomly acting agent should be able to drain your trading account in about 2/3 of the maximum episode duration. Review the 'info' part of the response for additional insights. See also: https://github.com/Kismuz/btgym/blob/master/examples/very_basic_env_setup.ipynb
It may also be that the actual sampled episode duration is too short because of inconsistent episode/dataset settings. Setting verbose=2 and reviewing the sampling logs can help identify this.
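To make reviewing the 'info' part less tedious, a small helper along these lines could tally the reported actions and failed orders over an episode. This is only a sketch: the function name `tally_broker_messages` is made up for illustration, and it assumes each `info` is a list of dicts carrying 'action' and 'broker_message' keys, as in the log posted in this thread.

```python
from collections import Counter


def tally_broker_messages(step_infos):
    """Summarize order outcomes over an episode.

    step_infos: one entry per env.step() call; each entry is assumed to be
    a list of dicts with 'action' and 'broker_message' keys (hypothetical
    helper, modeled on the log format shown in this thread).
    """
    actions = Counter()
    failed = 0
    for info in step_infos:
        record = info[-1]  # most recent record reported for this step
        actions[record['action']] += 1
        if 'ORDER FAILED' in record['broker_message']:
            failed += 1
    return {
        'steps': len(step_infos),
        'actions': dict(actions),
        'failed_orders': failed,
    }


# Three records shaped like the ones in the log
sample = [
    [{'action': 'buy', 'broker_message': 'New BUY created; ORDER FAILED with status: Margin'}],
    [{'action': 'hold', 'broker_message': 'ORDER FAILED with status: Margin'}],
    [{'action': 'close', 'broker_message': 'New CLOSE created; -'}],
]
summary = tally_broker_messages(sample)
print(summary)
```

In a real run you would append each `info` returned by `env.step(action)` to a list and pass that list in. A high share of failed orders suggests the broker is rejecting trades rather than the agent not acting.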
I wrote some simple code based on the basic example, as follows:
from btgym import BTgymEnv

with open("log.txt", "a") as myfile:
    env = BTgymEnv(filename='./btgym/examples/data/DAT_ASCII_EURUSD_M1_2016.csv')
    done = False
    o = env.reset()
    # Act randomly until the episode ends, logging every step
    while not done:
        action = env.action_space.sample()
        obs, reward, done, info = env.step(action)
        myfile.write('action: {},reward: {},info: {}\n'.format(action, reward, info))
    env.close()
Here is the result I am getting (I can't post the entire log, but here are the last lines):
action: 0,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1409, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 23), 'action': 'buy', 'broker_message': 'New BUY created; ORDER FAILED with status: Margin'}]
action: 3,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1410, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 24), 'action': 'hold', 'broker_message': 'ORDER FAILED with status: Margin'}]
action: 1,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1411, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 25), 'action': 'close', 'broker_message': 'New CLOSE created; -'}]
action: 3,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1412, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 26), 'action': 'buy', 'broker_message': 'New BUY created; -'}]
action: 1,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1413, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 27), 'action': 'close', 'broker_message': 'New CLOSE created; ORDER FAILED with status: Margin'}]
action: 2,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1414, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 28), 'action': 'buy', 'broker_message': 'New BUY created; -'}]
action: 3,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1415, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 29), 'action': 'sell', 'broker_message': 'New SELL created; ORDER FAILED with status: Margin'}]
action: 1,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1416, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 30), 'action': 'close', 'broker_message': 'New CLOSE created; ORDER FAILED with status: Margin'}]
action: 3,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1417, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 31), 'action': 'buy', 'broker_message': 'New BUY created; -'}]
action: 0,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1418, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 32), 'action': 'close', 'broker_message': 'New CLOSE created; ORDER FAILED with status: Margin'}]
action: 0,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1419, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 33), 'action': 'hold', 'broker_message': '-'}]
action: 3,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1420, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 34), 'action': 'hold', 'broker_message': '-'}]
action: 3,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1421, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 35), 'action': 'close', 'broker_message': 'New CLOSE created; -'}]
action: 3,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1422, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 36), 'action': 'close', 'broker_message': 'New CLOSE created; -'}]
action: 3,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1423, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 37), 'action': 'close', 'broker_message': 'New CLOSE created; -'}]
action: 1,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1424, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 38), 'action': 'close', 'broker_message': 'New CLOSE created; -'}]
action: 2,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1425, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 39), 'action': 'buy', 'broker_message': 'New BUY created; -'}]
action: 1,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1426, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 40), 'action': 'sell', 'broker_message': 'New SELL created; ORDER FAILED with status: Margin'}]
action: 0,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1427, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 41), 'action': 'buy', 'broker_message': 'New BUY created; ORDER FAILED with status: Margin'}]
action: 0,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1428, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 42), 'action': 'hold', 'broker_message': 'ORDER FAILED with status: MarginEND OF DATA'}]
action: 1,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1429, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 43), 'action': 'hold', 'broker_message': 'CLOSE, END OF DATA'}]
action: 0,reward: 0.0,info: [{'drawdown': 0.0, 'max_drawdown': 0.0, 'step': 1430, 'broker_cash': 10.0, 'broker_value': 10.0, 'time': datetime.datetime(2016, 12, 6, 5, 44), 'action': 'buy', 'broker_message': 'CLOSE, END OF DATA'}]
The reward is 0 throughout, regardless of the action, and the cash is still 10.
Thanks
@MorleyMinde, please review the broker messages:
'broker_message': 'New CLOSE created; ORDER FAILED with status: Margin'}
So the cause is this: the basic example does not set enough cash to perform any operations, so no order can be executed. Look at the other example notebooks with realistic account settings, like this:
MyCerebro.broker.setcash(2000)
MyCerebro.broker.setcommission(commission=0.0001, leverage=10.0) # commission to imitate spread
MyCerebro.addsizer(bt.sizers.SizerFix, stake=5000)
See also: #35 on broker account settings.
I should have mentioned this in my initial reply; I hadn't noticed that the account never changed, sorry.
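As a rough illustration of why a 'Margin' rejection follows from too little cash, the check can be approximated with simple arithmetic. This is a back-of-the-envelope sketch, not backtrader's actual broker logic: the function names are hypothetical, the price of 1.05 is an assumed EURUSD quote, and the stake and leverage are borrowed from the settings above purely for comparison.

```python
def required_margin(stake, price, leverage):
    """Approximate cash needed to open a position of `stake` units.

    Simplified model (assumption): position value divided by leverage.
    """
    return stake * price / leverage


def can_open(cash, stake, price, leverage):
    """Return True if the account holds enough cash for the position."""
    return cash >= required_margin(stake, price, leverage)


price = 1.05  # assumed EURUSD quote, for illustration only

margin = required_margin(stake=5000, price=price, leverage=10.0)
print(round(margin, 2))  # 525.0

# Account with cash of 10, as reported in the log
print(can_open(cash=10, stake=5000, price=price, leverage=10.0))    # False

# Account with the settings suggested above
print(can_open(cash=2000, stake=5000, price=price, leverage=10.0))  # True
```

Under these assumptions an account holding 10 cannot cover even one position, so every order is rejected and the reward stays at zero.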
Totally worked. I should go through the documentation more carefully. Thanks a lot.
Hello @Kismuz, I am experiencing something with the environment: at each step, when I take an action, the reward is always 0. I am not sure whether this is a bug or I am just missing something.
Here is my initialization code
Here is how it is used
Expected behaviour:
I expect the reward to vary: it should be negative or positive, or occasionally zero.
Actual behaviour:
The reward is always 0, even when I set every action within an episode to the same value.
Please help me out here.
Thanks in advance.