AminHP / gym-anytrading

The most simple, flexible, and comprehensive OpenAI Gym trading environment (Approved by OpenAI Gym)

Extracting results for quantstats #18

Closed sword134 closed 4 years ago

sword134 commented 4 years ago

Hello, after creating a model and running results = model.learn(int(1000)), how do I use the results to compare against a benchmark in quantstats? Currently, results doesn't hold the data that quantstats expects, so it can't be used with qs.reports.html(results, "SPY", output="D:\ReinforcementLearning\BaseLines\Trading\Myreport.html").
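
(For context, a minimal sketch of the input qs.reports.html works with, assuming the usual quantstats API: a pandas Series of period returns indexed by date, rather than the object returned by model.learn. The dates and values below are made up purely for illustration.)

import pandas as pd
import quantstats as qs

# quantstats reports operate on a date-indexed Series of period returns.
# These values are illustrative only.
returns = pd.Series(
    [0.001, -0.002, 0.0015],
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
)

qs.reports.html(returns, "SPY", output="report.html")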

AminHP commented 4 years ago

Hi, can you give a complete example with source code?

sword134 commented 4 years ago

There really isn't anything to it other than using the default "stocks-v0" env and then passing it through a DQN, as shown below:

import gym
import gym_anytrading
from stable_baselines import DQN  # assuming stable_baselines' DQN, as in the later examples

env = gym.make('stocks-v0', frame_bound=(50, 100), window_size=10)
model = DQN('MlpPolicy', env, tensorboard_log="D:\ReinforcementLearning\BaseLines\Trading\Tensorboard",
            verbose=2)
AminHP commented 4 years ago

model.learn returns a DQN object, while qs.reports.html expects a Pandas Series object as its first argument. So, I don't think they are directly compatible.

Unfortunately, I'm not familiar with quantstats. I tried to read its source code, but their full documentation is not available yet and I couldn't really follow the code either. Maybe it is a better idea to open an issue in their repo and ask your question there.

sword134 commented 4 years ago

Well, there is an RL library called tensortrade. With that, I could get it to work with quantstats; that library is also built on top of stable baselines. I am going to post the code of my tensortrade network below, and there it actually works with quantstats.

from datetime import datetime 
import pandas as pd
import ssl
import quantstats as qs
from pathlib import Path
import numpy as np

from tensortrade.data import Module, Stream
from tensortrade.data import DataFeed

from tensortrade.exchanges import Exchange, ExchangeOptions
from tensortrade.exchanges.services.execution.simulated import execute_order

# from tensortrade.agents.parallel.parallel_dqn_agent import ParallelDQNAgent

from tensortrade.actions import SimpleOrders
from tensortrade.agents import A2CAgent, DQNAgent
from tensortrade.environments import TradingEnvironment
from tensortrade.instruments import USD, BTC, Instrument, AAPL
from tensortrade.wallets import Portfolio, Wallet
from tensortrade.rewards import SimpleProfit

ssl._create_default_https_context = ssl._create_unverified_context

USDT = Instrument("USDT", 2, "Tether")

file_path = "D:\ReinforcementLearning\Tensortrade\Combined DIAGLDDAX\DIAGLDDAX_features_daily.csv"

def fetch_candles(exchange_name, symbol, timeframe, add_TA=False):

    df = pd.read_csv(file_path, skiprows=0)
    df = df[::-1]  # Flip list order
    df = df.set_index("timestamp")

    if add_TA and False:  # note: 'and False' means this TA block never runs
        import ta

        ta.add_all_ta_features(
            df, "Open", "High", "Low", "Close", "Volume", fillna=True
        )
    return df

data = fetch_candles("Binance", "BTC", "d", add_TA=True)  # 1h also available
print(data)

with Module("binance") as node_stream:
    nodes = []
    for name in data.columns:
        nodes.append(Stream(list(data[name]), name))

data_feed = DataFeed([node_stream])

exchange = Exchange("binance", service=execute_order, options=ExchangeOptions(commission=0.0001))(
    Stream(list(data["close"]), "USD-BTC")
)

startingcash = 1000000
portfolio = Portfolio(
    base_instrument=USD,
    wallets=[Wallet(exchange, startingcash * USD), Wallet(exchange, 0 * BTC)],
)

env = TradingEnvironment(
    feed=data_feed,
    portfolio=portfolio,
    action_scheme=SimpleOrders(),
    reward_scheme=SimpleProfit(),  # This should work
    window_size=20,  # doesn't actually have any effect here; leftover from old code
)

save_path = "D:\ReinforcementLearning\Tensortrade\\agents"
Path(save_path).mkdir(parents=True, exist_ok=True)
print(datetime.now().strftime("%Y-%m-%d %H:%M:%S %p"))
agent = A2CAgent(env)

reward = agent.train(n_steps=10000,
                     n_episodes=1, save_path=save_path)

print(reward)

if isinstance(portfolio.performance.net_worth, (pd.DataFrame, pd.Series)):
    returns = portfolio.performance.net_worth.pct_change().iloc[1:]
else:
    # Assume np.ndarray
    returns = np.diff(portfolio.performance.net_worth, axis=0)
    np.divide(returns, portfolio.performance.net_worth[:-1], out=returns)

dates = data.reset_index()["timestamp"][: len(returns)].apply(
    lambda x: pd.to_datetime(x, format="%Y-%m-%d")
)
returns = pd.concat(
    [dates, returns.reset_index()["net_worth"]], axis=1
)  # add date index to net worth values
returns = returns[returns.timestamp.notnull()][1:]
returns = returns.set_index("timestamp")

qs.reports.full(returns["net_worth"])
qs.reports.html(returns["net_worth"], "SPY",
                output="D:\ReinforcementLearning\Tensortrade\Myreport.html")
AminHP commented 4 years ago

Thank you. I will try to work on it but it may take a while.

sword134 commented 4 years ago

@AminHP thanks a lot. Making anytrading run alongside a proper benchmarking tool like quantstats is really going to be a big plus for people deciding which library to use for RL in trading!

AminHP commented 4 years ago

Yeah that's right! Thanks for noticing that.

sword134 commented 4 years ago

@AminHP not to be nagging, but do you have an estimate of how long it will take to implement?

AminHP commented 4 years ago

Honestly, I don't know anything about these two libraries. So, I can't estimate exactly; it may take a day or a week.

AminHP commented 4 years ago

Hi again. Can you try this code and check if it works?

# Imports
import numpy as np
import pandas as pd

import gym
import gym_anytrading
import quantstats as qs

from stable_baselines import A2C
from stable_baselines.common.vec_env import DummyVecEnv

import matplotlib.pyplot as plt

# Create Env
df = gym_anytrading.datasets.STOCKS_GOOGL.copy()
df.index = pd.to_datetime(df.index)
env_maker = lambda: gym.make('stocks-v0', df=df, window_size=10, frame_bound=(100, 5000))
env = DummyVecEnv([env_maker])

# Train Env
policy_kwargs = dict(net_arch=[64, 'lstm', dict(vf=[128, 128, 128], pi=[64, 64])])
model = A2C('MlpLstmPolicy', env, verbose=1, policy_kwargs=policy_kwargs)
model.learn(total_timesteps=1)

# Test Env 
env = env_maker()
observation = env.reset()

profits = []

while True:
    observation = observation[np.newaxis, ...]
    # action = env.action_space.sample()
    action, _states = model.predict(observation)

    observation, reward, done, info = env.step(action)
    profits.append(info['total_profit'])

    # env.render()
    if done:
        print("info:", info)
        break

# Plot Results
plt.cla()
env.render_all()
plt.show()

# Analysis Using quantstats
qs.extend_pandas()

net_worth = pd.Series(profits, index=df.index[-len(profits):])
returns = net_worth.pct_change().iloc[1:]

qs.reports.full(returns)
qs.reports.html(returns, output="report.html")
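
(If a benchmark comparison like the one in the original question is wanted, the report call can also take a benchmark ticker as its second argument, as in the first post; a small variation on the last line above, with the output path just an example:)

# Compare the strategy's returns against SPY, as in the original question.
qs.reports.html(returns, "SPY", output="report_vs_spy.html")
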
AminHP commented 4 years ago

I forgot to mention! @sword134

sword134 commented 4 years ago

@AminHP just ran the test code you posted. Damn, you are an absolute champ. Going to implement this with my own RL now, and I just want to thank you a lot! Perhaps, for even easier implementation, it would be possible to get the returns variable through a profit function built directly into gym-anytrading.

Thanks a lot!
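
(A rough sketch of what such a built-in helper might look like; the function name and signature here are hypothetical, not part of gym-anytrading. It simply wraps the test loop from the snippet above and returns a date-indexed returns Series that can be fed straight to quantstats.)

import numpy as np
import pandas as pd

def collect_returns(env, model, price_index):
    # Hypothetical helper: run a trained model on a gym-anytrading env
    # and return per-step returns as a date-indexed pandas Series.
    observation = env.reset()
    profits = []

    while True:
        action, _states = model.predict(observation[np.newaxis, ...])
        observation, reward, done, info = env.step(action)
        profits.append(info['total_profit'])
        if done:
            break

    # Align the profit curve with the last len(profits) dates of the data,
    # then convert it to period returns (mirrors the snippet above).
    net_worth = pd.Series(profits, index=price_index[-len(profits):])
    return net_worth.pct_change().iloc[1:]

# Hypothetical usage:
# returns = collect_returns(env_maker(), model, df.index)
# qs.reports.html(returns, "SPY", output="report.html")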

AminHP commented 4 years ago

Thank you a lot, I'm glad you enjoyed it :)

Yes, you are right. I will improve it and release a newer version in a few days. Also, I will add sample code mixing gym-anytrading with stable_baselines and quantstats.

Thanks for your help and your sample code.