ZhengyaoJiang / PGPortfolio

PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem"(https://arxiv.org/pdf/1706.10059.pdf).
GNU General Public License v3.0

how to get time index from __get_matrix_X() #16

Closed. srozov closed this issue 6 years ago

srozov commented 6 years ago

This is not really an issue, but I'd like a reliable way to find the time indices of the values extracted from the database by self.get_submatrix(index) and __pack_samples(self, indexs). I understand that the time indices can be accessed with timeindices = self.__global_data.minor_axis, but how can I do this inside the two methods mentioned above? Basically, I'm trying to find out which time index each of these 'X' matrices belongs to:

    def __get_matrix_X(self):
        return self.__test_set["X"][self._steps]

dexhunter commented 6 years ago

Hi there!

To get the time index, you need to attach an index to every mini-batch. I don't think our current implementation includes that. You could add a condition to the next_experience_batch function in replay_buffer.py.

Just curious, are you trying to do sequential training (something like an epoch)?

ZhengyaoJiang commented 6 years ago

https://github.com/ZhengyaoJiang/PGPortfolio/blob/f0e74eac7ae127f2675ff24dadf21dedc5ad19e2/pgportfolio/marketdata/datamatrices.py#L160-L170

I think you can get it from the minor_axis of __global_data here. For example:

t = self.__global_data.minor_axis[indexs]

I haven't tested this, so you may need to refine it if there are bugs.
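
A minimal sketch of that lookup, not tested against the repository. It assumes the time axis is a pandas DatetimeIndex (standing in for `self.__global_data.minor_axis`) and that `get_submatrix(ind)` slices `ind : ind + window_size + 1` along that axis, so the last column of each sample is used only for y. The names `time_index`, `window_size`, and `window_times` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for self.__global_data.minor_axis: 5-minute periods.
time_index = pd.date_range("2018-01-01", periods=12, freq="5min")
window_size = 3  # assumed number of periods in each X window


def window_times(indexs, time_index, window_size):
    """Return (first X time, last X time, y time) for each sample index.

    Assumes sample `ind` covers columns ind .. ind + window_size, where the
    last column is only used to build y.
    """
    indexs = np.asarray(indexs)
    x_start = time_index[indexs]
    x_end = time_index[indexs + window_size - 1]
    y_time = time_index[indexs + window_size]
    return x_start, x_end, y_time


x_start, x_end, y_time = window_times([0, 4], time_index, window_size)
```

Under these assumptions, `x_end` is the timestamp of the most recent price the network sees for a sample, and `y_time` is the period whose price change is being scored.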

srozov commented 6 years ago

Thanks a lot for your replies! I was trying to do real-time testing. I'm updating the db every 5 min and taking the returns of __get_matrix_X() and __get_matrix_y() as inputs to calculate the portfolio weights (instead of the [self._steps] index I'm just using [-1]). I also modified __pack_samples(indexs) so that the M matrix has only the last two entries, which are the ones needed to calculate y. I just wanted to make sure that it always takes the most recent entry in the db. It looks like it's working :)

I have one more question: are the coins that are displayed in the log (as in Selected coins are: ['...']) in the same order as the portfolio weights (as in the raw omega is [...])? There is one more entry in the latter. I assume the first is the BTC weight, and the rest are in the same order as in the first log entry, right?

ZhengyaoJiang commented 6 years ago

There is one more entry in this one, I assume the first is the BTC weight, and the rest is in the same order as in the first log entry, right?

I think you are right.
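
If that ordering holds (the reply above is tentative), the weights can be labelled as in the sketch below. The coin names and weight values are made up for illustration, and `label_weights` is a hypothetical helper, not part of PGPortfolio.

```python
def label_weights(selected_coins, omega, cash="BTC"):
    """Pair each weight with its asset, assuming omega[0] is the cash (BTC) weight."""
    if len(omega) != len(selected_coins) + 1:
        raise ValueError("expected one more weight than selected coins")
    return dict(zip([cash] + list(selected_coins), omega))


# Made-up example values, not real log output.
coins = ["ETH", "XRP", "LTC"]
omega = [0.1, 0.5, 0.3, 0.1]
weights = label_weights(coins, omega)
```

The length check is there to catch the case where the two log lines come from different configurations.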

srozov commented 6 years ago

I just realised that the [self._steps] does not represent the same time step t in __get_matrix_X() and __get_matrix_y(), as X = M[:, :, :, :-1] takes a sliding window with all but the last value for the present step, but y = M[:, :, :, -1] / M[:, 0, None, :, -2] uses data from the next (future) and present step. Is that so? Please correct me if I'm wrong. I find it confusing to calculate the price change at the end of the present step by using future data and not in the beginning of the next step by using data from present and past...

ZhengyaoJiang commented 6 years ago

I find it confusing to calculate the price change at the end of the present step by using future data and not in the beginning of the next step by using data from present and past...

I'm not sure I understand your question correctly, but I will try to answer it: you need the future price (y) to calculate the reward for the current period.
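
The indexing in question can be checked on a toy array. This sketch only reuses the two slicing expressions quoted earlier in the thread; the array layout (batch, features, coins, window_size + 1), the prices, and the weights are assumptions for illustration. It shows that X ends at the present period, while y divides the following period's prices by the present closing price, which is the quantity the reward for the present period depends on.

```python
import numpy as np

window_size = 3
# Toy M: 1 sample, 2 features (feature 0 = close), 2 coins, window_size + 1 periods.
close = np.array([[10.0, 11.0, 12.0, 15.0],   # coin A
                  [20.0, 20.0, 25.0, 20.0]])  # coin B
high = close * 1.1
M = np.stack([close, high])[None, ...]        # shape (1, 2, 2, 4)

X = M[:, :, :, :-1]                           # periods t-2 .. t, no future data
y = M[:, :, :, -1] / M[:, 0, None, :, -2]     # period t+1 over close at period t

# Close-price relative of each coin over the period that follows X.
price_relative = y[:, 0, :]

# Period return for hypothetical weights over (coin A, coin B), ignoring
# the cash asset and transaction costs.
w = np.array([0.5, 0.5])
period_return = price_relative @ w
```

In live use the last column is not yet known when the weights are chosen, so only X can be formed at decision time; y becomes available one period later.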