Currently, we use the FOBOS update in the gradient step, which seems efficient. However, it would be worth testing other gradient steps as well, since I am not sure that FOBOS will be the winner in every case.
To that end, we should reuse the old step function class. First, this would avoid copy-pasted code: the current version contains at least four separate copies of the FOBOS update. Second, the code would be more modular, making it easier to implement new update rules.
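To illustrate what a pluggable step function class could look like, here is a minimal sketch. The class and method names are hypothetical (the real step function class in our codebase may differ); the FOBOS update shown is the standard forward step plus L1 soft-thresholding proximal step:

```python
import numpy as np

class StepFunction:
    """Hypothetical base class for gradient-step update rules."""
    def update(self, weights, gradient):
        raise NotImplementedError

class FobosL1Step(StepFunction):
    """FOBOS with L1 regularization: a gradient (forward) step
    followed by soft-thresholding (the proximal/backward step)."""
    def __init__(self, learning_rate=0.1, l1_lambda=0.01):
        self.learning_rate = learning_rate
        self.l1_lambda = l1_lambda

    def update(self, weights, gradient):
        # Forward step: ordinary gradient descent
        w = weights - self.learning_rate * gradient
        # Backward step: soft-thresholding toward zero
        shrink = self.learning_rate * self.l1_lambda
        return np.sign(w) * np.maximum(np.abs(w) - shrink, 0.0)

class SgdStep(StepFunction):
    """A plain SGD step; new rules plug in without touching the learner."""
    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate

    def update(self, weights, gradient):
        return weights - self.learning_rate * gradient
```

With this structure the learner holds a single `StepFunction` reference, so swapping FOBOS for another rule is a one-line change and the update logic lives in exactly one place.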
In my opinion, the implementation of the update rule should support multi-label feature hashing. Since we expect to deal with a large number of features and labels, an implementation without feature hashing does not make much sense.
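To make the point concrete, here is a small sketch of multi-label feature hashing: each (feature, label) pair is hashed into a fixed-size weight array, so memory stays bounded no matter how many features or labels appear. The helper name and bucket count are illustrative assumptions, not part of our codebase:

```python
import hashlib
import numpy as np

def hashed_index(feature, label, num_buckets):
    """Map a (feature, label) pair to a bucket in a shared weight array.
    Uses a deterministic hash (Python's built-in str hash is salted
    per process, so it is unsuitable here)."""
    key = f"{feature}|{label}".encode()
    return int(hashlib.md5(key).hexdigest(), 16) % num_buckets

# Fixed memory footprint regardless of vocabulary or label-set size:
num_buckets = 2 ** 20
weights = np.zeros(num_buckets)

idx = hashed_index("token=great", "label=sports", num_buckets)
weights[idx] += 1.0  # e.g., an update touching this (feature, label) slot
```

The trade-off is that hash collisions conflate unrelated (feature, label) pairs, but with a large enough bucket count this is usually an acceptable price for the bounded memory.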
What is your opinion regarding this issue?