llSourcell / Q-Learning-for-Trading


More Features #2

Open yard91 opened 6 years ago

yard91 commented 6 years ago

Anyone have a hint on how I can add more features than just the close price? I am not sure about the prediction values. How does that work?

Thanks

S0AndS0 commented 5 years ago

The get_data function within utils.py reads in the raw CSV files found in the data/ directory...

Snip start: utils.py

def get_data(col = 'close'):
    """ Returns a 3 x n_step array """
    msft = pd.read_csv('data/daily_MSFT.csv', usecols = [col])
    ibm = pd.read_csv('data/daily_IBM.csv', usecols = [col])
    qcom = pd.read_csv('data/daily_QCOM.csv', usecols = [col])
    # recent prices are at top; reverse them
    return np.array([
        msft[col].values[::-1],
        ibm[col].values[::-1],
        qcom[col].values[::-1],
    ])

Snip end: utils.py

... a usage example for get_data is within run.py...

Snip start: run.py

    data = np.around(get_data())
    train_data = data[:, :3526]
    test_data = data[:, 3526:]

Snip end: run.py

... initially the train_data variable is used for initializing the TradingEnv class within envs.py; however, if --mode test was passed then test_data is used instead

Snip start: run.py

    env = TradingEnv(train_data, args.initial_invest)
    state_size = env.observation_space.shape
    action_size = env.action_space.n
    agent = DQNAgent(state_size, action_size)
    scaler = get_scaler(env)

    portfolio_value = []

    if args.mode == 'test':
        # remake the env with test data
        env = TradingEnv(test_data, args.initial_invest)

Snip end: run.py

... I took a brief gander a bit deeper into the source and it looks as though TradingEnv (within envs.py) may not care if it is fed a 3 x n_step array or an n x n_step array, so the following untested code may be of some use in feeding different CSVs ...

Snip start: utils.py

import os  # needed for the `os.path.isfile` check below; utils.py may lack this import

def get_data(
        col = 'close',
        file_path_list = [
            'data/daily_MSFT.csv',
            'data/daily_IBM.csv',
            'data/daily_QCOM.csv'
        ]
):
    """
    Returns a `len(file_path_list)` x `n_step` numpy array
    """
    data_list = []
    for path in file_path_list:
        if not os.path.isfile(path):
            print("Ignoring -> {0}".format(path))
            continue

        data = pd.read_csv(path, usecols = [col])
        # recent prices are at top; reverse them
        data_list.append(data[col].values[::-1])

    return np.array(data_list)

Snip end: utils.py

... with those mods one should be able to modify the run.py script ...

Snip start: run.py

    # data = np.around(get_data())
    ## Above line is for context as to what the code below replaces
    ##  populate `file_path_list` array with your choice of file paths
    data = np.around(get_data(file_path_list = [
        'data/daily_MSFT.csv',
        'data/daily_IBM.csv',
        'data/daily_QCOM.csv'
    ]))

Snip end: run.py

At the very least this allows for control over the input files, which have the following format ...


Bugs, dead ends, and some ramblings coming up next; stop copying code examples here if you're not up to striking your own trail and squashing 'em bugs.


Snip start: daily_QCOM.csv

timestamp,open,high,low,close,volume
2017-12-27,64.3200,64.5900,64.1800,64.5200,2378687
2017-12-26,64.4900,64.9400,64.2000,64.3000,4203624
2017-12-22,64.3100,64.9800,64.3000,64.7300,4386678

Snip end: daily_QCOM.csv

... which in a human reader friendly format looks like ...

timestamp  | open    | high    | low     | close   | volume
===========|=========|=========|=========|=========|========
2017-12-27 | 64.3200 | 64.5900 | 64.1800 | 64.5200 | 2378687
2017-12-26 | 64.4900 | 64.9400 | 64.2000 | 64.3000 | 4203624
2017-12-22 | 64.3100 | 64.9800 | 64.3000 | 64.7300 | 4386678

The column names are at the top (timestamp, ...), and the col argument within the get_data function is passed through to pandas.read_csv as a list for the usecols argument.
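
For example, the single-column case the stock get_data handles looks like this (a hypothetical quick check, assuming the repo's data files are present) ...

Snip start: example (hypothetical)

import pandas as pd

# `usecols` takes a list of column names, even for a single column
frame = pd.read_csv('data/daily_QCOM.csv', usecols = ['close'])
print(frame['close'].values[:3])  # e.g. [64.52 64.3 64.73] per the sample above

Snip end: example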

Considering that usecols isn't usecol, I believe it should be possible to send a list of columns, e.g. ['close', 'volume'], with just a few more edits to the get_data function ...

Snip start: utils.py

def get_data(
        col = 'close',
        file_path_list = [
            'data/daily_MSFT.csv',
            'data/daily_IBM.csv',
            'data/daily_QCOM.csv'
        ]
):
    """
    Returns a `len(file_path_list)` x `n_step` numpy array

    Note `col` maybe a `str` or `list` type to choose a single or
    multiple columns to read from `file_path_list` files
    """
    # Assume a `str` type was passed for `col` and correct thyself if it was a `list` type
    usable_columns = [col]
    if isinstance(col, list):
        usable_columns = col

    data_list = []
    for path in file_path_list:
        if not os.path.isfile(path):
            continue

        values = []
        # recent prices are at top; reverse order of each column
        data = pd.read_csv(path, usecols = usable_columns)
        for column in data.keys():
            # This _feels_ wrong, `extend`ing, and it may be
            #  that `append` and then some other magics should
            #  be used to flatten data_list prior to
            #  converting to numpy array
            values.extend(data[column].values[::-1])

        data_list += [values]

    return np.array(data_list)

Snip end: utils.py

... and with a minor edit to the run.py script one should be able to use multiple columns ...

Snip start: run.py

    data = np.around(get_data(
        col = ['close', 'volume'],
        file_path_list = [
            'data/daily_MSFT.csv',
            'data/daily_IBM.csv',
            'data/daily_QCOM.csv'
        ]
    ))

Snip end: run.py

However, to get these last edits to work properly, it does look as though the TradingEnv class within envs.py may need some modification, as it looks hard-coded for train_data to be of a certain shape...

Snip start: envs.py

    def __init__(self, train_data, init_invest=20000):
        # data
        self.stock_price_history = np.around(train_data)
        self.n_stock, self.n_step = self.stock_price_history.shape
        ## ... other __init__ stuff .... ##
        # action space
        self.action_space = spaces.Discrete(3**self.n_stock)

        # observation space: give estimates in order to sample and build scaler
        stock_max_price = self.stock_price_history.max(axis=1)
        stock_range = [[0, init_invest * 2 // mx] for mx in stock_max_price]
        price_range = [[0, mx] for mx in stock_max_price]
        cash_in_hand_range = [[0, init_invest * 2]]
        nvec = stock_range + price_range + cash_in_hand_range
        self.observation_space = spaces.MultiDiscrete(nvec)

Snip end: envs.py

.... and it looks like the _reset and _step methods within envs.py may also need to be fiddled with to pull out the expected data from self.stock_price_history

I tested with data = np.around(get_data(col = ['close', 'volume'])) and without, and it looks like things are parsing the same (e.g. putting print("self.stock_price -> {0}".format(self.stock_price)) into the _step method shows the same data in either case), but I am not 100% confident that it is functioning as intended and considering both the close and volume columns when directed.
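
One quick way to check (hypothetical, untested) is to compare the shapes of the arrays the two calls return ...

Snip start: example (hypothetical)

data_single = np.around(get_data(col = 'close'))
data_multi = np.around(get_data(col = ['close', 'volume']))
print(data_single.shape)  # (3, n_step)
print(data_multi.shape)   # (3, 2 * n_step) with the `extend` version above

Snip end: example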

That it still operates with or without multiple columns being considered leads me to believe that I've missed something, so please do correct me if I've gone astray somewhere in these ramblings.

As to the prediction values... that question is still open, I'm not that clever... yet ;-)

Updates start


I was partially wrong in those last edits; the reason it chugged along was because...

Snip start: run.py

    train_data = data[:, :3526]
    test_data = data[:, 3526:]

... within run.py was happily slicing things down such that downstream functions didn't gag; those lines'll need a little modification to pass on additional data while retaining their edge for slicing files...

    train_data = data[..., :3526]
    test_data = data[..., 3526:]

Snip end: run.py
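
The difference is which axis gets sliced; a small illustration (hypothetical) on a 3-D array ...

Snip start: example (hypothetical)

import numpy as np

data = np.zeros((3, 2, 5000))  # n_stock x n_cols x n_step
print(data[:, :3526].shape)    # (3, 2, 5000) -- slices the columns axis
print(data[..., :3526].shape)  # (3, 2, 3526) -- slices the last (time) axis

Snip end: example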

... also, I was partially correct in that extending felt wrong in the last edits to get_data within utils.py; append should have been used...

Snip start: utils.py

def get_data(
        cols = ['close'],
        file_path_list = [
            'data/daily_MSFT.csv',
            'data/daily_IBM.csv',
            'data/daily_QCOM.csv'
        ]
):
    file_data_list = []
    for path in file_path_list:
        if not os.path.isfile(path):
            print("Ignoring non-existent file -> {0}".format(path))
            continue

        file_data_dict = pd.read_csv(path, usecols = cols)

        values = []
        for col in file_data_dict.keys():
            # recent prices and other stuff are at top;
            # reversing it for following `for` loops
            values.append(file_data_dict.pop(col).values[::-1])

        file_data_list.append(values)

    return np.array(file_data_list)

Snip end: utils.py
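
With this append version the loader returns a 3-D array, so (hypothetically) the shapes now look like ...

Snip start: example (hypothetical)

data = np.around(get_data(cols = ['close', 'volume']))
print(data.shape)        # (n_files, n_cols, n_step), e.g. (3, 2, n_step)
train_data = data[..., :3526]
print(train_data.shape)  # (3, 2, 3526)

Snip end: example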

... but, BIG BUT, using the above will then require editing TradingEnv within envs.py...

Snip start: envs.py

    def __init__(self, train_data, init_invest=20000):
        # data
        self.stock_price_history = np.around(train_data)  # round up to integer to reduce state space
        self.n_stock, self.n_step = self.stock_price_history.shape
        ## ...other __init__ stuff...
        self.observation_space = spaces.MultiDiscrete(stock_range + price_range + cash_in_hand_range)

... to look more like...

        # data
        if len(train_data.shape) == 2:
            self.stock_price_history = np.around(train_data)  # round up to integer to reduce state space
        elif len(train_data.shape) == 3:
            # Assume the first column of every stock is the closing price
            self.stock_price_history = np.around(train_data[:, 0])  # round up to integer to reduce state space

        self.n_cols = 0
        if len(train_data.shape) == 2:
            self.n_stock, self.n_step = self.stock_price_history.shape
        elif len(train_data.shape) == 3:
            self.n_stock, self.n_cols, self.n_step = train_data.shape
        ## ...other __init__ stuff...
        nvec = stock_range + price_range + cash_in_hand_range
        ## loop over inputs other than the first column if available
        if self.n_cols > 1:
            # note: slice axis 1 (the columns), not the last (time) axis,
            #  and append one flat [0, max] range per extra column
            for stock_columns in train_data[:, 1:]:
                for column_values in stock_columns:
                    nvec.append([0, column_values.max()])

        self.observation_space = spaces.MultiDiscrete(nvec)

Snip end: envs.py

Might be pedantic, but it looks like _reset within envs.py will need self.n_cols = 0 prepended before the return self._get_obs() line.

Next, getting extra data to work requires tracking down why a ValueError: Error when checking input: expected dense_1_input to have shape (10,) but got array with shape (7,) is thrown whenever more than one data column is considered.
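
A guess (hypothetical arithmetic, not from the repo) at where those two numbers come from ...

Snip start: example (hypothetical)

n_stock = 3
state_size_default = 2 * n_stock + 1  # owned shares + prices + cash = 7
extra_ranges = n_stock * 1            # one extra [0, max] range per stock, e.g. volume
state_size_extended = state_size_default + extra_ranges  # 10, matching the longer nvec
# so the agent's dense layer expects 10 inputs while _get_obs presumably still returns 7

Snip end: example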

Personally these last edits all feel wrong and bodged. I think it might be better to use the first set of edits, which allows for considering different stock price files, then run n number of envs each with their own DQNAgent considering a column apiece; these would have to be scaled back to just predictions of each column's change, and then atop that would go the buy/sell logic.

That level of modification is likely best as a fork, if not an entirely separate project, but hopefully these partial dead ends were somewhat helpful for those also dissecting similar code.


Updates end