michelole closed 5 years ago
This may not be enough to reduce the number of features and avoid overfitting, so we should prioritize #103.
Experiments showed that we can reach similar metrics with a smaller layer size, and the models converge faster, roughly linearly in the layer size.
Fixed by 6b541f8488ec5901988e132e27d9d6f60944ab00
Reduce the layer size from 256 to, e.g., 128 or 64.
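For a rough sense of what the smaller sizes buy, here is a minimal sketch that counts the trainable parameters of a single fully connected layer at each size. The layer type and the input width (`INPUT_DIM = 300`) are assumptions for illustration only and are not taken from this project; `dense_param_count` is a hypothetical helper.

```python
def dense_param_count(input_dim: int, units: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return input_dim * units + units


# Hypothetical input width, chosen only for illustration.
INPUT_DIM = 300

# Parameter count for each candidate layer size discussed in this issue.
counts = {units: dense_param_count(INPUT_DIM, units) for units in (256, 128, 64)}

for units, n_params in counts.items():
    print(f"{units:>3} units -> {n_params} parameters")
```

Under these assumptions the parameter count scales linearly with the layer size, so halving the layer halves the weights to train. A recurrent or convolutional layer would scale differently.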