astorfi / lip-reading-deeplearning

:unlock: Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
Apache License 2.0

Visual Network - Input Size #16

Closed ghost closed 6 years ago

ghost commented 6 years ago

How can I change the input size of the visual network, for example to a square 112x112 input?

ghost commented 6 years ago

I solved the problem by changing this line:

net = slim.repeat(net, 1, slim.conv2d, 256, [1, 2, 5], padding='VALID', scope='fc5')

to this:

net = slim.repeat(net, 1, slim.conv2d, 256, [1, 5, 5], padding='VALID', scope='fc5')

astorfi commented 6 years ago

@nooshin85 OK, sounds good. Please make sure to adjust the rest of the network architecture to match your input size.
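
For anyone adapting the network to another input size, a likely reason the kernel change works is that `fc5` uses `padding='VALID'`, so its kernel has to cover the whole spatial feature map that reaches it in order to collapse it to 1x1. The sketch below is plain Python, not code from this repository. It applies the standard VALID output-size formula, `(input - kernel) // stride + 1`, per axis. The 2x5 and 5x5 feature-map sizes are assumptions inferred from the kernel shapes in this thread, the temporal depth of 9 is a placeholder, and `valid_output_size` and `valid_output_shape` are hypothetical helper names.

```python
def valid_output_size(input_size, kernel_size, stride=1):
    """Output length along one axis for a convolution with 'VALID' (no) padding."""
    if kernel_size > input_size:
        raise ValueError("kernel is larger than the input under VALID padding")
    return (input_size - kernel_size) // stride + 1


def valid_output_shape(input_shape, kernel_shape):
    """Apply the VALID formula to each axis (stride 1 on every axis)."""
    return tuple(valid_output_size(i, k) for i, k in zip(input_shape, kernel_shape))


# Assumed feature-map shapes (depth, height, width) just before 'fc5'.
# The depth of 9 is a placeholder; only height and width matter here.
original_map = (9, 2, 5)  # assumption: original, non-square input
square_map = (9, 5, 5)    # assumption: 112x112 input

print(valid_output_shape(original_map, (1, 2, 5)))  # original kernel, original input
print(valid_output_shape(square_map, (1, 5, 5)))    # changed kernel, square input
print(valid_output_shape(square_map, (1, 2, 5)))    # original kernel, square input
```

Under these assumptions the first two cases reduce the spatial dimensions to 1x1, while the original `[1, 2, 5]` kernel on a 5x5 map leaves a 4x1 map. To adapt the network to a different input size, print the shape of `net` just before `fc5` and set the last two kernel dimensions to match it.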