Open tholmb opened 3 years ago
I found the answer to question 1 in the provided code. Convolutional kernels are initialized with Xavier initialization, and biases with a truncated normal initialization (mean 0.0 and std 1.0).
# Conv2D
weights = tf.get_variable(
    "weights",
    kernel_shape,
    initializer=tf.contrib.layers.xavier_initializer(),
    dtype=tf.float32,
)
biases = tf.get_variable(
    "biases",
    bias_shape,
    initializer=tf.truncated_normal_initializer(),
    dtype=tf.float32,
)
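For anyone reading along, the two initializers can be sketched in plain NumPy (function names here are my own, not from the repo): `tf.contrib.layers.xavier_initializer()` defaults to Glorot uniform, and `tf.truncated_normal_initializer()` defaults to mean 0.0, stddev 1.0 with samples outside two standard deviations re-drawn.

```python
import numpy as np

def xavier_uniform(shape, rng):
    """Glorot/Xavier uniform: U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out)).
    For a conv kernel (kh, kw, c_in, c_out): fan_in = kh*kw*c_in, fan_out = kh*kw*c_out."""
    receptive = int(np.prod(shape[:-2])) if len(shape) > 2 else 1
    fan_in = receptive * shape[-2] if len(shape) > 1 else shape[0]
    fan_out = receptive * shape[-1]
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=shape).astype(np.float32)

def truncated_normal(shape, rng, mean=0.0, stddev=1.0):
    """Normal(mean, stddev) with draws beyond 2 stddev from the mean re-sampled."""
    out = rng.normal(mean, stddev, size=shape)
    bad = np.abs(out - mean) > 2 * stddev
    while bad.any():
        out[bad] = rng.normal(mean, stddev, size=int(bad.sum()))
        bad = np.abs(out - mean) > 2 * stddev
    return out.astype(np.float32)

rng = np.random.default_rng(0)
w = xavier_uniform((3, 3, 32, 64), rng)  # conv kernel, bounded by the Glorot limit
b = truncated_normal((64,), rng)         # biases, all within [-2, 2]
```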
However, I have a new question about the network architecture.
Hi, how did your training go?
First of all, thanks for sharing this great project! I have tried to implement your mobilePydnet network but cannot quite reproduce the results of the pre-trained model. For that reason I have several questions about the model, the loss, the data, and the training itself.
Did you initialize weights and biases with a particular initialization strategy, or did you just use the default initialization of the convolution layers?
Did you use any data augmentation, such as flipping, rotating, random cropping, or blurring?
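For concreteness, the flip/crop part of such an augmentation pipeline could look like the sketch below (my own generic NumPy sketch, not the repo's actual pipeline; for depth training the same transform must be applied jointly to the image and the ground truth):

```python
import numpy as np

def augment(image, depth, rng, crop_hw=(192, 320)):
    """Random horizontal flip + random crop, applied jointly to image and depth.
    crop_hw is an assumed crop size, not a value from the paper."""
    if rng.random() < 0.5:  # horizontal flip with probability 0.5
        image = image[:, ::-1]
        depth = depth[:, ::-1]
    h, w = image.shape[:2]
    ch, cw = crop_hw
    top = int(rng.integers(0, h - ch + 1))   # random crop offsets
    left = int(rng.integers(0, w - cw + 1))
    image = image[top:top + ch, left:left + cw]
    depth = depth[top:top + ch, left:left + cw]
    return image, depth
```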
You mentioned here in the issues section that the range of your input and output images is [0, 255]. Does that mean that during training, when you load the input image and ground truth as float32, you don't normalize them, for example by dividing by 255 to get the range [0, 1]?
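To make the question concrete, these are the two preprocessing options being contrasted (illustrative only):

```python
import numpy as np

# Fake uint8 image standing in for a loaded input or ground-truth frame.
raw = np.random.default_rng(0).integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

x_255 = raw.astype(np.float32)          # option A: keep the [0, 255] range
x_01 = raw.astype(np.float32) / 255.0   # option B: normalize to [0, 1]
```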
The loss is described in the paper as a weighted sum over scales, where one coefficient is fixed to 1 and the per-scale weights go 0.5, 0.25, 0.125 (if I understood correctly, you used just 3 scales). Here is my Python code for calculating the loss, but I'm not sure whether I'm missing something:
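The snippet itself is missing from the thread; as a placeholder, here is a minimal sketch of such a multi-scale loss (my own reconstruction, assuming a per-scale L1 term with the weights quoted above; the actual loss in the paper may include additional terms):

```python
import numpy as np

def multiscale_l1_loss(preds, gt, alpha=1.0, scale_weights=(0.5, 0.25, 0.125)):
    """preds: predictions from fine to coarse, each half the size of the previous.
    alpha is the coefficient fixed to 1; scale_weights are the per-scale weights."""
    total = 0.0
    for pred, w in zip(preds, scale_weights):
        factor = gt.shape[0] // pred.shape[0]
        gt_s = gt[::factor, ::factor]  # crude nearest-neighbor downsample of the GT
        total += w * np.mean(np.abs(pred - gt_s))
    return alpha * total
```

With all-ones ground truth and all-zero predictions at three scales, this returns 0.5 + 0.25 + 0.125 = 0.875.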