state_shape with only one candle (1,4)

AdrianP- commented 7 years ago

If you think in algo-trader terms, the normal approach is to take an action every time you get a new candle, so the agent should do the same. But btgym isn't prepared for state_shape=(1, 4). Is there a reason for that?

Kismuz commented 7 years ago

@AdrianP-, state shape is unrelated to how often the agent takes an action. It is just the time-embedding dimension, like frame stacking in the Atari domain, meant to help with solving a temporally dependent POMDP. Shape (1, 4) means one candle as an observation; shape (10, 4) means the last ten candles.
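To illustrate the point about time embedding, here is a minimal sketch in plain Python, not btgym code. The function name `rolling_observations` and the sample candle values are made up for illustration. It shows that a new observation is still produced for every new candle; the first dimension only controls how many past candles each observation carries.

```python
from collections import deque


def rolling_observations(candles, time_dim):
    """Yield one observation per new candle once the window is full.

    Each observation is a list of the last `time_dim` candles, so its
    shape is (time_dim, 4) for 4-value candles.
    """
    window = deque(maxlen=time_dim)  # oldest candle drops out automatically
    for candle in candles:
        window.append(candle)
        if len(window) == time_dim:
            yield list(window)


# Hypothetical candles with four values each, purely for illustration.
candles = [(float(i), i + 1.0, i - 1.0, i + 0.5) for i in range(12)]

obs_1 = list(rolling_observations(candles, 1))    # shape (1, 4) each
obs_10 = list(rolling_observations(candles, 10))  # shape (10, 4) each
```

With a time dimension of 1 every candle gives an observation right away; with 10 the first observation appears only after ten candles, and from then on there is one per candle.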

There is another control that tells the agent exactly how many observations to skip before taking another action (the steps in between are assumed to be hold). It's described in the source code, line 159 in btgym/btgym/backtrader.py:

            skip_frame=None,
                # Number of environment steps to skip before returning next response,
                # e.g. if set to 10 -- agent will interact with environment every 10th episode step;
                # Every other step agent's action is assumed to be 'hold'.
                # Note: INFO part of environment response is a list of the info's of all skipped frames,
                #       i.e. [info[-9], info[-8], ..., info[0]].
        )

It should be in the documentation, but I just don't have time to write it properly. I'll try to fix it as soon as I can.

btgym/examples/setting_up_environment_full.ipynb has an example with skipped frames.
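For readers who want to see the skip_frame idea in isolation, here is a rough sketch of how such a wrapper could behave. It is not btgym's implementation: `ToyEnv`, `step_with_skip`, the `HOLD = 0` encoding, applying the action on the first of the skipped frames, and summing rewards over them are all assumptions made for illustration.

```python
HOLD = 0  # assumed encoding of the 'hold' action, for illustration only


class ToyEnv:
    """Hypothetical stand-in environment; this is not btgym's API."""

    def __init__(self, n_steps):
        self.n_steps = n_steps
        self.t = 0

    def step(self, action):
        self.t += 1
        obs = self.t                      # dummy observation: the step index
        reward = 0.0
        done = self.t >= self.n_steps
        info = {"step": self.t, "action": action}
        return obs, reward, done, info


def step_with_skip(env, action, skip_frame):
    """Apply `action` once, then 'hold' for the remaining skipped frames.

    Returns the last observation, the summed reward (an assumption),
    the done flag, and the list of per-frame infos, oldest first.
    """
    if skip_frame < 1:
        raise ValueError("skip_frame must be at least 1")
    infos = []
    total_reward = 0.0
    for i in range(skip_frame):
        obs, reward, done, info = env.step(action if i == 0 else HOLD)
        total_reward += reward
        infos.append(info)
        if done:  # stop early if the episode ends inside the skip window
            break
    return obs, total_reward, done, infos


env = ToyEnv(n_steps=25)
obs, reward, done, infos = step_with_skip(env, action=1, skip_frame=10)
```

Here the agent is consulted once per ten environment steps, and `infos` holds ten entries, matching the list-of-infos note in the source comment above.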

AdrianP- commented 7 years ago

Sorry, I explained it badly. I get your point, but as far as I know, each step should give you only one new observation. That is the strategy in OpenAI Baselines, because every new action is saved in a ReplayBuffer.

Nevertheless, the patch is trivial, so I'm going to do it for my version :)
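To make the replay-buffer argument concrete, here is a small sketch of storing one transition per step. It is a simplified stand-in written for this thread, not the OpenAI Baselines class; the name `SimpleReplayBuffer` and its methods are assumptions.

```python
import random
from collections import deque


class SimpleReplayBuffer:
    """Hypothetical minimal buffer: one transition is stored per env step."""

    def __init__(self, capacity):
        # Bounded deque: the oldest transition is dropped when full.
        self._storage = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs, done):
        self._storage.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)


buffer = SimpleReplayBuffer(capacity=3)
for t in range(5):
    # Each step contributes exactly one new single-candle observation.
    buffer.add(obs=t, action=0, reward=0.0, next_obs=t + 1, done=False)

batch = buffer.sample(2)
```

Since each stored transition holds one observation, a (1, 4) state keeps the buffer entries small, which seems to be the motivation here.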

Kismuz commented 7 years ago

Ok!

Kismuz / btgym

state_shape with only one candle (1,4) #6