Rubinjo closed this pull request 1 year ago.

I'm working on a binary classification problem and therefore have a sigmoid activation instead of a softmax activation function for my output layer. I have adjusted the `model_wo_softmax` function to accept any kind of activation by passing the activation's name as an argument, so that binary classification problems are also covered. See the old vs. the new call in the sketch below. I have also edited all examples and documentation that cover this function, so everything should now use the new call.
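A minimal sketch of the old vs. the new call, assuming this is iNNvestigate's top-level `model_wo_softmax` helper and that the activation is named by a plain string; the exact signature in this PR may differ:

```python
import innvestigate
from tensorflow import keras

# Tiny binary classifier with a sigmoid output activation.
model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Old call: hard-wired to softmax, so it cannot strip this
# model's sigmoid output.
# model_wo_act = innvestigate.model_wo_softmax(model)

# New call proposed here: the activation to strip is named
# explicitly, which also covers binary classification models.
model_wo_act = innvestigate.model_wo_softmax(model, "sigmoid")
```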
Hi @Rubinjo, thanks for the contribution.
Adding a `model_wo_output_activation` function sounds OK to me, however this PR needs three changes:

1. Keep `model_wo_softmax` as it is, since this change would break our users' existing code. We recently had a major breaking release, and this minor change is not worth another one.
2. `model_wo_output_activation` is supposed to remove any activation function, so an implementation that doesn't require the name of the activation function as an argument would be more elegant.
3. Revert the changes to the README and the example notebooks.

Yeah, it is indeed a pretty drastic change. I mainly made it to avoid duplicating the `model_wo_softmax` function into multiple new, similar functions. I can also implement it by refactoring only the `pre_softmax_tensors` method; then your first point is still satisfied and no updates to the README and notebooks are needed.
I would leave `model_wo_softmax` as is and add a `model_wo_output_activation(model)` function that doesn't require a string. All other changes besides the added tests should be reverted.
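In code, the intended end state would look roughly like this (a sketch, assuming both helpers end up exposed at the package top level like the existing one; nothing here is final):

```python
import innvestigate
from tensorflow import keras

# Any Keras classifier works as input; a softmax model for illustration.
model = keras.Sequential([
    keras.layers.Dense(2, activation="softmax", input_shape=(4,)),
])

# Existing entry point stays untouched, so no user code breaks.
model_a = innvestigate.model_wo_softmax(model)

# New entry point: strips whatever the output activation is,
# without taking its name as a string argument.
model_b = innvestigate.model_wo_output_activation(model)
```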
Yeah, I agree. Preserving `model_wo_softmax` and adding a `model_wo_output_activation(model)` function should be the end result.
I have adjusted `pre_softmax_tensors` so that `model_wo_softmax` can still use it without any adjustment on the user's side, and I have added a `model_wo_output_activation(model)` function that also uses it. Now all three points from your earlier message should be satisfied.
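Not the actual diff, but a generic tf.keras sketch of the resulting shape: one activation-agnostic helper does the work and the old entry point delegates to it (the PR achieves this by refactoring `pre_softmax_tensors` instead, which this sketch doesn't reproduce):

```python
from tensorflow import keras

def model_wo_output_activation(model):
    """Return a copy of `model` whose output activation is linear."""
    clone = keras.models.clone_model(model)
    # Swap the output layer's activation for a linear one, then rebuild
    # the model from its config so the change actually takes effect.
    clone.layers[-1].activation = keras.activations.linear
    clone = keras.models.model_from_json(clone.to_json())
    # Restore the trained weights of the original model.
    clone.set_weights(model.get_weights())
    return clone

def model_wo_softmax(model):
    # Backwards-compatible entry point: existing callers keep working;
    # the heavy lifting now lives in the activation-agnostic helper.
    return model_wo_output_activation(model)
```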
Looks good to me. Sorry for the delay! :)