ZhengyaoJiang / PGPortfolio

PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem"(https://arxiv.org/pdf/1706.10059.pdf).
GNU General Public License v3.0

Conversion of globaldatamatrix panel to Dataframe (panel is deprecated) #137

Open aushaff opened 3 years ago

aushaff commented 3 years ago

Hi

As stated in the title, I am converting the panel in the globaldatamatrix function to a multi-index DataFrame, but I could do with some clarification or assistance (hopefully you are still watching this repo).

What I have done so far:

In get_global_panel:

L76:

     panel = pd.Panel(items=features, major_axis=coins, minor_axis=time_index, dtype=np.float32)

to

    new_panel = pd.DataFrame(
            index = pd.MultiIndex.from_product([coins, time_index]),
            columns = features
        )

L133:

      panel.loc[feature, coin, serial_data.index] = serial_data.squeeze()

to

      new_panel.loc[(coin, serial_data.index), feature] = serial_data.values

This works, but it is very slow.
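One possible way around the slow per-coin `.loc` assignment is to build one small frame per coin and concatenate them once at the end. This is only a sketch: `fetch_coin_frame` is a hypothetical stand-in for the per-coin database queries in get_global_panel, and the coin names, features, and dates are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative values only; the real ones come from the config and the database.
features = ["close", "high", "low"]
coins = ["ETH", "LTC"]
time_index = pd.date_range("2020-01-01", periods=4, freq="30min")


def fetch_coin_frame(coin):
    """Hypothetical stand-in for the per-coin queries in get_global_panel.

    Returns a (time x feature) frame that may not cover the whole time_index.
    """
    partial_index = time_index[1:]  # pretend the first period is missing
    data = np.ones((len(partial_index), len(features)))
    return pd.DataFrame(data, index=partial_index, columns=features)


# Align every coin to the full time axis (missing periods become NaN),
# then stack the frames once instead of assigning into a big frame repeatedly.
per_coin = {coin: fetch_coin_frame(coin).reindex(time_index) for coin in coins}
new_panel = pd.concat(per_coin, names=["coin", "time"]).astype(np.float32)
```

The result has the same (coin, time) MultiIndex and feature columns as the frame built with `pd.MultiIndex.from_product`, but it is filled in a single concatenation.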

After this, 'new_panel' is returned to datamatrices, L48:

     self.__global_data = self.__history_manager.get_global_panel(start, self.__end, period=period, features = type_list)

L58:

    self.__PVM = pd.DataFrame(index=self.__global_data.minor_axis, columns=self.__global_data.major_axis)

to

    self.__PVM = self.__global_data

Is this equivalent? I'm not sure of the structure of the PVM; is it the same as the structure of 'new_panel'?
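Going by the original line above, the PVM is indexed by the panel's minor axis (time) with one column per major-axis entry (coin), so it is a plain time-by-coin table rather than a copy of 'new_panel'. A minimal sketch of rebuilding that shape from a multi-index frame, assuming the index levels are named "coin" and "time" (names chosen here for illustration):

```python
import pandas as pd

# Illustrative values only.
features = ["close", "high", "low"]
coins = ["ETH", "LTC"]
time_index = pd.date_range("2020-01-01", periods=4, freq="30min")

global_data = pd.DataFrame(
    index=pd.MultiIndex.from_product([coins, time_index], names=["coin", "time"]),
    columns=features,
)

# Recover the two axes the Panel used to expose as minor_axis / major_axis.
time_axis = global_data.index.get_level_values("time").unique()
coin_axis = global_data.index.get_level_values("coin").unique()

# Same layout as the original line: one row per period, one column per coin.
pvm = pd.DataFrame(index=time_axis, columns=coin_axis)
```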

L64:

    self._num_periods = len(self.__global_data.minor_axis)

to

    self._num_periods = len(self.__global_data.index)

Is that correct? My assumption here is that we want the number of 30-minute periods, not the number of rows.
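To illustrate the difference: with a (coin, time) MultiIndex, `len(index)` counts one row per coin per period, so it overstates the number of periods by a factor of the number of coins. A small sketch with made-up coins and dates, using the unnamed levels produced by `from_product` as in the snippet above:

```python
import pandas as pd

# Illustrative values only.
features = ["close", "high", "low"]
coins = ["ETH", "LTC"]
time_index = pd.date_range("2020-01-01", periods=4, freq="30min")

global_data = pd.DataFrame(
    index=pd.MultiIndex.from_product([coins, time_index]),
    columns=features,
)

# Rows = coins x periods, which is not what minor_axis used to report.
num_rows = len(global_data.index)

# Level 1 is the time level in from_product([coins, time_index]).
num_periods = global_data.index.get_level_values(1).nunique()
```

Here `num_rows` is 8 while `num_periods` is 4, which matches `len(time_index)`.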

When I run the program with 'mode=download_data', it runs, but as mentioned it takes a long time.

In backtest mode:

I am confused by datamatrices::get_submatrix and datamatrices::__pack_samples. Because of the change of data structure, the indices are wrong (there are too many, for a start).

Can you provide some guidance as to how to approach this please?
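One way to approach this, sketched under the assumption that get_submatrix only needs to slice a (feature, coin, time) array along the time axis, as `Panel.values` allowed: convert the multi-index frame into a 3-D NumPy array once, then slice that. The function body below is an illustration, not the repository's code, and the data is synthetic.

```python
import numpy as np
import pandas as pd

# Illustrative values only.
features = ["close", "high", "low"]
coins = ["ETH", "LTC"]
time_index = pd.date_range("2020-01-01", periods=5, freq="30min")

index = pd.MultiIndex.from_product([coins, time_index], names=["coin", "time"])
values = np.arange(len(index) * len(features), dtype=np.float32)
new_panel = pd.DataFrame(
    values.reshape(len(index), len(features)), index=index, columns=features
)

# Reindex to guarantee coin-major, time-minor row order; the reshape relies on it.
ordered = new_panel.reindex(index)

# (coin*time, feature) -> (coin, time, feature) -> (feature, coin, time)
cube = (
    ordered.to_numpy(dtype=np.float32)
    .reshape(len(coins), len(time_index), len(features))
    .transpose(2, 0, 1)
)


def get_submatrix(cube, ind, window_size):
    """Slice window_size + 1 consecutive periods starting at position ind."""
    return cube[:, :, ind:ind + window_size + 1]


window = get_submatrix(cube, ind=1, window_size=2)
```

With this layout, `cube[f, c, t]` corresponds to `new_panel.loc[(coins[c], time_index[t]), features[f]]`, so the time axis again has one entry per period.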

I appreciate that this is an old repo and maybe you aren't supporting it any more but any advice would be welcome! Thanks!

aushaff commented 3 years ago

To update:

I have everything working now, on a different machine, with pandas version 0.24, so I will be able to compare the data structures. Hopefully that will be enough.

bjrnfrdnnd commented 3 years ago

I am using xarray.DataArray for that. Relatively minor code changes.

aushaff commented 3 years ago

Thanks. It's good to hear that someone has already done it.

If you have time, could you be a bit more specific about what you needed to change, please?

Nice-Zhang66 commented 1 year ago

> I am using xarray.DataArray for that. Relatively minor code changes.

Hello, could you explain how to replace the Panel with xarray.DataArray? I have been having problems after making the change. Thank you very much!
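For reference, a minimal sketch of what an xarray-based replacement might look like. The thread does not show the actual changes, so everything here is an assumption: the dimension names ("feature", "coin", "time"), the sample coins, and the synthetic series are all chosen for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Illustrative values only.
features = ["close", "high", "low"]
coins = ["ETH", "LTC"]
time_index = pd.date_range("2020-01-01", periods=4, freq="30min")

# A labelled 3-D array with the same axis order the Panel had:
# items -> feature, major_axis -> coin, minor_axis -> time.
panel = xr.DataArray(
    np.full((len(features), len(coins), len(time_index)), np.nan, dtype=np.float32),
    coords={"feature": features, "coin": coins, "time": time_index},
    dims=("feature", "coin", "time"),
)

# Label-based assignment, analogous to panel.loc[feature, coin, index] = data.
serial_data = pd.Series([1.0, 2.0], index=time_index[:2])
panel.loc["close", "ETH", serial_data.index] = serial_data.to_numpy()

# Rough counterparts of the old Panel attributes.
minor_axis = panel.coords["time"].to_index()
major_axis = panel.coords["coin"].to_index()
num_periods = panel.sizes["time"]
raw = panel.values  # plain (feature, coin, time) ndarray for positional slicing
```

Because `panel.values` is still a (feature, coin, time) array, code that slices along the last axis can stay largely as it was, which may be why the change was described as relatively minor.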