PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf).
I'm trying to understand the nonlinearity in your network (apart from the final softmax). In the current version of net_config.json, you list the three layers of the network as one ConvLayer, one EIIEDense, and one EIIE_Output_WithW. None of them has an entry specifying the activation function, and the default activation of conv_2d in TFLearn is 'linear'. So are you using a nonlinearity between the layers?
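To make the concern concrete: if no activation is applied between layers, stacked linear layers are mathematically equivalent to a single linear layer. The following is a minimal pure-Python sketch of that point, not code from PGPortfolio or TFLearn. The weights `W1`, `W2`, the input `x`, and the helpers `matmul` and `relu` are hypothetical and chosen only for illustration.

```python
# Sketch: two linear layers with no activation in between collapse into one
# linear map, while inserting a ReLU between them does not.
# All names and values here are illustrative assumptions, not PGPortfolio code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def relu(m):
    """Element-wise ReLU on a matrix given as a list of rows."""
    return [[max(0.0, v) for v in row] for row in m]

W1 = [[1.0, -2.0], [0.5, 1.0]]   # hypothetical first-layer weights
W2 = [[2.0, 1.0], [-1.0, 3.0]]   # hypothetical second-layer weights
x = [[1.0], [2.0]]               # hypothetical input column vector

# Two layers applied in sequence with 'linear' (identity) activation.
stacked_linear = matmul(W2, matmul(W1, x))

# The same result from a single layer whose weights are W2 @ W1.
collapsed = matmul(matmul(W2, W1), x)

# With a ReLU between the layers, the output is no longer a linear map of x.
with_relu = matmul(W2, relu(matmul(W1, x)))

print(stacked_linear, collapsed, with_relu)
```

If the layers really are used with the 'linear' default, the first two results coincide, which is why it matters whether an activation is set somewhere outside net_config.json.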