AminHP / gym-anytrading

The most simple, flexible, and comprehensive OpenAI Gym trading environment (Approved by OpenAI Gym)
MIT License

Clarifications: Confused on how to implement in production(live) #31

Closed jjphung closed 3 years ago

jjphung commented 3 years ago

@AminHP I am having a hard time wrapping my head around how to implement this in a live environment for paper trading, just to hook everything up end to end (E2E).

import gym
import gym_anytrading  # importing this registers 'stocks-v0' with gym

env_maker = lambda: gym.make(
    'stocks-v0',
    df=test_df,
    window_size=window_size,
    frame_bound=(start_index, end_index)
)

The above snippet shows how to create an environment for the agent/model to step through. But to create the environment, we have to pass in a DataFrame. In the real world, we won't know the current day's OHLCV until the markets close. So how would we be able to use a trained model in a live environment with up-to-date data and features (observations)? Unless its predicted actions are actually for the next day?

Side question: While stepping through the env, why do we have to add a dimension to the observation with observation = observation[np.newaxis, ...] before predicting? I don't think the observations (signal_features) change in the environment.

Thank you!

AminHP commented 3 years ago

Why do you say "we won't know the current day OHLCV until the markets close"? Many platforms, such as MetaTrader, provide real-time OHLCV.

Side question: This is required for stable_baselines to work. It's not part of our algorithm.
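To make the side question concrete, here is a minimal sketch of what that indexing does. It assumes the observation is a 2-D array of shape (window_size, n_features); the sizes below are made up for illustration. Indexing with np.newaxis adds a leading axis of length 1 rather than removing one, which matches the usual convention of passing a batch of observations to a model.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
window_size, n_features = 10, 2

# Stand-in for one observation returned by the environment.
observation = np.zeros((window_size, n_features), dtype=np.float32)

# Add a leading "batch" axis of length 1; the data itself is unchanged.
batched = observation[np.newaxis, ...]

print(observation.shape)  # (10, 2)
print(batched.shape)      # (1, 10, 2)
```

The values are untouched: batched[0] is the same window as observation.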

jjphung commented 3 years ago

I say this because the training data is daily. Daily stock data is finalized and reported at the end of the day. I would understand if we were working with consistent intraday training intervals, but in this case we are working with dailies. And dailies are what I am interested in, as opposed to the other posts on day trading.

Edit: If this is the case, we could train and test on daily OHLCV data, but perhaps gather intraday observations to build the current environment so the model can trade on it. I wonder whether that would translate well (make sense), though: train and test on daily data, but trade live on intraday OHLCV observations.

AminHP commented 3 years ago

At the end of the day, the agent gets the data from previous days up to today, then decides whether to buy or sell tomorrow. The agent learns to act according to the information it is given. So, on average, it makes the best decision for tomorrow by analyzing the previous days' data.
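The end-of-day workflow described above could be sketched as follows. This is only an illustration, not the library's code: latest_observation is a hypothetical helper, and the features (closing price plus day-over-day change) are an assumption about what the model was trained on. The idea is that once today's close is final, the last window_size rows form the observation used to choose tomorrow's action.

```python
import numpy as np

def latest_observation(close_prices, window_size):
    """Hypothetical helper: build the most recent feature window.

    Assumes two features per day: the closing price and its change
    from the previous day's close.
    """
    prices = np.asarray(close_prices, dtype=np.float32)
    if len(prices) < window_size:
        raise ValueError("need at least window_size closing prices")
    # Day-over-day change; the first day has no previous close, so use 0.
    diff = np.insert(np.diff(prices), 0, 0)
    features = np.column_stack((prices, diff))
    # Keep only the last window_size days, ending with today's close.
    return features[-window_size:]

# Made-up closing prices, with the last value being today's final close.
closes = [100.0, 101.5, 101.0, 102.25, 103.0]
obs = latest_observation(closes, window_size=3)
print(obs.shape)  # (3, 2)

# With a trained model (not shown here), the action for tomorrow would
# then be predicted from this window.
```

The same call can be repeated after each market close with the newly finalized daily bar appended to closes.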

About the "I wonder if it would ...", I think it is a good idea. But it may be better to divide each day into several timestamps, then for each timestamp feed the previous days' data to the agent alongside the timestamp. That way, I think the agent will know which time of tomorrow it is making a decision for, but I'm not so sure.
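One way to read the timestamp suggestion is sketched below. Everything here is an assumption for illustration: observation_with_time_slot is a hypothetical helper, the day is split into a fixed number of slots, and the slot is encoded as a one-hot vector appended to the flattened window of previous days' data. Other encodings (for example a single normalized time value) would also fit the description.

```python
import numpy as np

def observation_with_time_slot(daily_window, slot, slots_per_day):
    """Hypothetical helper: append a one-hot time slot to a daily window."""
    window = np.asarray(daily_window, dtype=np.float32)
    if not 0 <= slot < slots_per_day:
        raise ValueError("slot must be in [0, slots_per_day)")
    # One-hot encoding of which part of the day the decision is for.
    one_hot = np.zeros(slots_per_day, dtype=np.float32)
    one_hot[slot] = 1.0
    # Flatten the window so both parts fit in one 1-D observation.
    return np.concatenate((window.ravel(), one_hot))

# Made-up window: 3 previous days with 2 features each.
daily_window = [[101.0, -0.5], [102.25, 1.25], [103.0, 0.75]]

# Same daily data, paired with slot 1 out of 4 slots per day.
obs = observation_with_time_slot(daily_window, slot=1, slots_per_day=4)
print(obs.shape)  # (10,)
```

An agent would need to be trained on observations of this same shape, so the environment's observation space would have to change accordingly.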

jjphung commented 3 years ago

Thank you for the explanation and insights! :) Things are clearer now.