rickstaa / LAC-TF2-TORCH-translation

Temporary repository to debug what goes wrong during the translation of the LAC algorithm from TF1 to Torch.

Compare Networks (PyTorch vs TensorFlow) #2

Closed: rickstaa closed this issue 4 years ago

rickstaa commented 4 years ago

Compare the weight matrices (TensorFlow vs PyTorch)

Set the random seeds to get the same initial states and random numbers:

import numpy as np
import torch

torch.manual_seed(0)
torch.cuda.manual_seed(0)
np.random.seed(0)
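Re-seeding replays the exact same sequence of draws, which is the mechanism the whole comparison below relies on. A quick sanity check (a minimal sketch, not from the original thread):

import torch

torch.manual_seed(0)
a = torch.randn(3)
torch.manual_seed(0)
b = torch.randn(3)

print(torch.equal(a, b))  # True: re-seeding reproduces the exact same draws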

Layer shape comparison

Gaussian actor: (layer shape screenshot not recovered)

Lyapunov critic: (layer shape screenshot not recovered)

Important difference between Tensorflow and Pytorch

As can be seen above, the shape of some of the layers differs. This is because PyTorch stores the weight matrix transposed compared to TensorFlow (see this issue): TensorFlow's dense layer stores its kernel as (in_features, out_features) and computes y = xW + b, whereas PyTorch's nn.Linear stores its weight as (out_features, in_features) and computes y = xWᵀ + b. When initializing weights from the same random seed you therefore have to use the following for PyTorch:

torch.manual_seed(0)
# nn.Linear stores its weight as (out_features, in_features)
self.net[0][0].weight = nn.Parameter(torch.randn(self.net[0][0].weight.shape, requires_grad=True))
self.net[0][0].bias = nn.Parameter(torch.randn(self.net[0][0].bias.shape, requires_grad=True))

and for TensorFlow (torch is deliberately used here so the same random numbers are drawn):

torch.manual_seed(0)
# Draw in the same order as the PyTorch layer (weight first, then bias);
# transpose the (out, in) weight into TF's (in, out) kernel layout.
w_init_net_0 = tf.constant_initializer(torch.transpose(torch.randn((n1, s.shape[1].value)), 0, 1).numpy())
b_init_net_0 = tf.constant_initializer(torch.randn(n1).numpy())
net_0 = tf.layers.dense(s, n1, activation=tf.nn.relu, name='l1', bias_initializer=b_init_net_0, kernel_initializer=w_init_net_0, trainable=trainable)
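As a sanity check on the transpose convention (a minimal sketch, not from the original thread), an nn.Linear layer and a TF-style kernel produce identical outputs once the weight is transposed:

import torch
import torch.nn as nn

torch.manual_seed(0)
lin = nn.Linear(3, 4)   # weight shape: (4, 3) = (out_features, in_features)
x = torch.randn(1, 3)

y_torch = lin(x)        # PyTorch computes y = x @ W.T + b

# A TF dense layer stores its kernel as (in_features, out_features), i.e. W.T
kernel = lin.weight.T
y_tf_style = x @ kernel + lin.bias

print(torch.allclose(y_torch, y_tf_style))  # True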

When working with manually created layers, you have to use the following syntax in PyTorch:

torch.manual_seed(5) # FIXME: Remove random seed
self.w1_s = nn.Parameter(torch.randn((obs_dim, n1), requires_grad=True))
self.w1_a = nn.Parameter(torch.randn((act_dim, n1), requires_grad=True))
self.b1 = nn.Parameter(torch.randn((1, n1), requires_grad=True))
input_out = F.relu(torch.matmul(obs, self.w1_s) + torch.matmul(act, self.w1_a) + self.b1)

and in TensorFlow:

torch.manual_seed(5)
w1_s_init = tf.constant_initializer(torch.randn((self.s_dim, n1)).numpy())
w1_a_init = tf.constant_initializer(torch.randn((self.a_dim, n1)).numpy())
b1_init = tf.constant_initializer(torch.randn(n1).numpy())
layers = []
w1_s = tf.get_variable('w1_s', [self.s_dim, n1], initializer=w1_s_init, trainable=trainable)
w1_a = tf.get_variable('w1_a', [self.a_dim, n1], initializer=w1_a_init, trainable=trainable)
b1 = tf.get_variable('b1', [1, n1], initializer=b1_init, trainable=trainable)
net_0 = tf.nn.relu(tf.matmul(s, w1_s) + tf.matmul(a, w1_a) + b1)
layers.append(net_0)
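Note that the manual layers store their weights as (in_dim, out_dim) and apply matmul(obs, w), so unlike the nn.Linear case no transpose is needed before handing the arrays to TensorFlow. A minimal numerical sketch of the equivalence (the dimensions are hypothetical):

import numpy as np
import torch
import torch.nn.functional as F

obs_dim, act_dim, n1 = 8, 3, 64  # hypothetical dimensions

torch.manual_seed(5)
w1_s = torch.randn((obs_dim, n1))
w1_a = torch.randn((act_dim, n1))
b1 = torch.randn((1, n1))

obs = torch.randn(1, obs_dim)
act = torch.randn(1, act_dim)

# Torch forward pass of the manual layer
out_torch = F.relu(torch.matmul(obs, w1_s) + torch.matmul(act, w1_a) + b1)

# Equivalent numpy computation (what the TF graph evaluates when fed the
# same initializer arrays): identical, since the weight layout matches
out_np = np.maximum(
    obs.numpy() @ w1_s.numpy() + act.numpy() @ w1_a.numpy() + b1.numpy(), 0
)
print(np.allclose(out_torch.numpy(), out_np))  # True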

Check initial weights in TensorFlow

The following code can be used to check the network weights in TensorFlow (see this question):

vars = tf.trainable_variables()
print(vars)  # some info about the variables...
vars_vals = self.sess.run(vars)
for var, val in zip(vars, vars_vals):
    print("var: {}, value: {}".format(var.name, val))
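For the PyTorch side, the analogous inspection (a short sketch; `model` is hypothetical and stands for any nn.Module, e.g. the Gaussian actor or Lyapunov critic above) is:

# `model` is hypothetical: any torch.nn.Module instance
for name, param in model.named_parameters():
    print("var: {}, value: {}".format(name, param.data))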

In 877906f11a8d5fabb24929e8168ec6a278b0291f the weights, biases and seeds were set to be equal to aid the comparison between TensorFlow and PyTorch.

Compare networks

Gaussian Actor

PyTorch: (network graph image: Pytorch_ga_graph)

TensorFlow: (network graph image not recovered)

Lyapunov Critic

PyTorch: (network graph image: Pytorch_lc_graph)

TensorFlow: (network graph image not recovered)

rickstaa commented 4 years ago

Closed as the same behaviour is present when translating the TF1 code to TF2 eager mode. The issue is not present when eager execution is disabled in TF2. The debugging continues in issue #9.