The README states:
"For probing tasks, we used an MLP with a Sigmoid nonlinearity and tuned the nhid (in [50, 100, 200]) and dropout (in [0.0, 0.1, 0.2]) on the dev set."
However, in the code it looks like the parameters supplied by the user are always used as-is: no tuning takes place and no predefined hyperparameters are loaded. Have I missed something?
Should I do hyperparameter tuning to get results that are comparable to the literature?
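In case it's useful, this is the kind of manual grid search I had in mind. It's only a minimal sketch: it assumes the standard senteval.engine.SE interface, that the classifier config accepts a 'dropout' key next to 'nhid', and that probing results report dev/test accuracy under 'devacc'/'acc'. The data path and the encoder in batcher are placeholders.

```python
# Sketch of the grid over nhid and dropout described in the README, selecting
# the best configuration by dev accuracy on a single probing task.
import itertools
import numpy as np
import senteval

def prepare(params, samples):
    # Placeholder: build a vocab / load the encoder here if needed.
    return

def batcher(params, batch):
    # Placeholder encoder: random 300-d embeddings so the sketch runs end to
    # end; replace with the sentence encoder being probed.
    return np.random.rand(len(batch), 300)

task = 'Length'  # any probing task
best = None

for nhid, dropout in itertools.product([50, 100, 200], [0.0, 0.1, 0.2]):
    params = {
        'task_path': 'PATH_TO_DATA',  # placeholder: root of the SentEval data dir
        'usepytorch': True,
        'kfold': 10,
        # Passing 'dropout' alongside the usual classifier settings is my
        # assumption; the remaining keys follow the example configs.
        'classifier': {'nhid': nhid, 'dropout': dropout, 'optim': 'adam',
                       'batch_size': 64, 'tenacity': 5, 'epoch_size': 4},
    }
    se = senteval.engine.SE(params, batcher, prepare)
    result = se.eval([task])[task]
    if best is None or result['devacc'] > best[0]:
        best = (result['devacc'], nhid, dropout, result['acc'])

print('best devacc %.2f (nhid=%d, dropout=%.1f) -> test acc %.2f' % best)
```

If that is roughly what was done for the paper, I can run it per task, but confirmation (or pointers to the hyperparameters actually used) would help keep the numbers comparable.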