Hyperparameter families

Right now we have separate dropout probabilities and learning rates for the two parts of the model, which you might call the "encoder" and the "classifier". However, the latter is called "overall" in flag and doc names, which is misleading: those parameters are not applied overall, but only to the classifier layers. We should therefore do a search-and-replace that changes overall_ to classifier_.
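As a rough sketch of the rename (not a tested patch against this repo), something like the following could do it. It assumes the prefix only needs changing where it starts an identifier; the helper name rename_prefix and the option names in the sample string are made up for illustration and are not taken from the codebase.

```python
import re

# Match "overall_" only at the start of an identifier, so that a name
# merely containing it (e.g. "not_overall_x") is left alone. The
# lookbehind rejects a preceding letter, digit, or underscore.
OLD_PREFIX = re.compile(r"(?<![A-Za-z0-9_])overall_")


def rename_prefix(text: str, new_prefix: str = "classifier_") -> str:
    """Returns text with every identifier-initial "overall_" replaced."""
    return OLD_PREFIX.sub(new_prefix, text)


# Hypothetical sample; these option names are illustrative only.
sample = "--overall_dropout 0.3 --encoder_dropout 0.1 not_overall_x"
print(rename_prefix(sample))
```

Running this over the source and doc files and reviewing the resulting diff by hand is probably safer than a blind global replace, since "overall" may also appear in ordinary prose.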

CUNY-CL / udtube

Hyperparameter families #33