hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License

[question] How to implement custom policy for TRPO #1119

Closed. jeferal closed this issue 3 years ago.

jeferal commented 3 years ago

Hello,

I am training a TRPO agent with MlpPolicy on a custom environment, and now I would like to implement a custom policy. However, I cannot find any examples of using this algorithm with a custom policy. So far I have tried this policy:

from stable_baselines.common.policies import FeedForwardPolicy

class CustomPolicyPR(FeedForwardPolicy):
    def __init__(self, *args, **kwargs):
        super(CustomPolicyPR, self).__init__(*args, **kwargs,
                                             net_arch=[128, dict(pi=[256, 128, 64],
                                                                 vf=[256, 128, 64])],
                                             feature_extraction="mlp")
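
For reference, this is roughly how I then pass the class to the algorithm (a sketch, not my exact code; env stands in for my custom Gym environment instance):

    from stable_baselines import TRPO

    # env is assumed to be an already-created custom Gym environment instance
    model = TRPO(CustomPolicyPR, env, verbose=1)
    model.learn(total_timesteps=100000)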

The problem is that when I look at the model graph in TensorBoard, some of the layers appear to be disconnected. Can anyone provide an example of how to do this, or tell me what I am doing wrong?

My intention is to create a policy similar to MlpPolicy, but with a different number of layers and neurons.

I truly appreciate your help and time. I also apologize if I have not followed the documentation correctly.

Miffyli commented 3 years ago

You do not need to write a full custom policy class; you can pass the architecture to the algorithm through the policy_kwargs argument. See the first example here.
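
For example, a minimal sketch of that approach with TRPO, assuming a standard Gym environment id such as CartPole-v1 in place of your custom env:

    from stable_baselines import TRPO

    # Same architecture as your CustomPolicyPR, but passed through policy_kwargs
    # instead of subclassing FeedForwardPolicy.
    policy_kwargs = dict(net_arch=[128, dict(pi=[256, 128, 64],
                                             vf=[256, 128, 64])])

    model = TRPO("MlpPolicy", "CartPole-v1", policy_kwargs=policy_kwargs, verbose=1)
    model.learn(total_timesteps=10000)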

PS: TRPO is not included in stable-baselines3, but we highly recommend moving to stable-baselines3 anyway, as it is more actively supported.