segalinc opened this issue 1 year ago

Hi there, I am trying to use your tool to create a Spearman R custom loss for `CatBoostRegressor`. However, I get an error regarding `calculate_derivatives`.

Code of the loss function:
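(The original snippet did not survive in this thread; below is a minimal illustrative reconstruction, assuming a differentiable `spearman` helper defined elsewhere. The `torch.tensor(...)` line is the one quoted verbatim later in the discussion.)

```python
import torch

def spearman_loss(ypred: torch.Tensor, ytrue: torch.Tensor) -> torch.Tensor:
    lenypred = ypred.shape[0]
    lenytrue = ytrue.shape[0]
    # this line turns out to be the culprit (see the discussion below):
    # torch.tensor() creates a brand-new tensor
    ypred_th = torch.tensor(ypred.reshape(1, lenypred), requires_grad=True)
    ytrue_th = ytrue.reshape(1, lenytrue)
    # `spearman` is assumed to be a differentiable Spearman-correlation
    # implementation defined elsewhere; it is not shown in this thread
    return spearman(ypred_th, ytrue_th)
```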
Hi Cristina,
`CatboostObjective` (and its colleagues `LightGbmObjective` and `XgboostObjective`) calculates the gradients for you - your loss function should return the scalar torch tensor loss, not numpy grads.
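For instance, a minimal sketch of a compatible loss (`absolute_error_loss` is the example mentioned from the blog post; exact details may differ):

```python
import torch

# take torch tensors in, return one scalar torch tensor out -
# the objective wrapper handles the differentiation itself
def absolute_error_loss(preds: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    return torch.abs(preds - targets).sum()
```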
However, it looks like you're using the package for multiregression, which is a use case that I didn't consider. I'll try to create a working example :)
Hi Tomer, thank you for the quick reply. I'm not sure where it says I'm using multiregression - my target is a simple regression. Maybe I misunderstood the need to compute the gradient; I just need to stop at the Spearman computation and return that coefficient.
I just tried that and I get the same error
I debugged it and got the same error. This is the line that causes it:
```python
ypred_th = torch.tensor(ypred.reshape(1, lenypred), requires_grad=True)
```
Your loss function receives `ypred` as a tensor with `requires_grad=True`. When you create a different tensor and use the new tensor to calculate the loss, the original `ypred` isn't involved in the computation graph anymore, and thus doesn't get gradients.
You can fix it by changing this line to `ypred_th = ypred.reshape(1, lenypred)`.
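A standalone way to see the difference (a quick sketch, independent of the package):

```python
import torch

ypred = torch.ones(3, requires_grad=True)

# mirrors the buggy line: torch.tensor() copies the data into a new
# leaf tensor, so gradients flow to the copy, never back to ypred
# (PyTorch even emits a UserWarning about this construction)
bad = torch.tensor(ypred.reshape(1, 3), requires_grad=True)
bad.sum().backward()
print(ypred.grad)  # None - ypred was never part of bad's graph

# the fix: reshape returns a view that stays in ypred's graph
good = ypred.reshape(1, 3)
good.sum().backward()
print(ypred.grad)  # tensor([1., 1., 1.])
```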
On a different note, if you're using correlation as the loss for tree booster regression, you may get unexpected results because of the "divide and conquer" mechanism of trees. If you print the length of `ypred` inside `spearman_loss`, you'll see that the number of examples isn't constant - this is because CatBoost calls the loss function for every node inside every tree, which means the loss function only operates on a subset of the dataset every time. I think that correlation can be a problematic measure of similarity for small sample sizes. Maybe it would work better if you use a large number of very shallow trees.
Also, you probably want to use `loss = -spearman(...)`, since in `treeboost_autograd` the loss is being minimized.
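Putting both points together, a hedged sketch (parameter values are illustrative rather than tested recommendations, the import path and `loss_function=` keyword are assumptions about the package, and `spearman` again stands in for a differentiable implementation):

```python
import torch
from catboost import CatBoostRegressor
from treeboost_autograd.booster_objectives import CatboostObjective

def spearman_loss(preds: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # negate the correlation: the wrapper minimizes the loss,
    # so higher correlation must mean lower loss
    return -spearman(preds, targets)

# many very shallow trees, so each node still sees a large subset
model = CatBoostRegressor(
    iterations=2000,
    depth=2,
    loss_function=CatboostObjective(loss_function=spearman_loss),
    eval_metric="RMSE",
)
```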
Hope this helps!
Hi Tomer,
Thanks for the hint, really appreciated.
My dataset is very big, so no big issue on that side.
You're right - I forgot to use `1 - coeff` as the actual loss; I noticed that while re-checking what I posted here.
I'll try the fix and see if it works better, and close the issue if it does.
Hi Tomer,
I was trying your example in the repo, and even with the fix I get a new error. The same error also happens if I use the `absolute_error_loss` from your blog post:
File "_catboost.pyx", line 1399, in _catboost._ObjectiveCalcDersRange
File "/apps/python3/lib/python3.7/site-packages/treeboost_autograd/booster_objectives.py", line 41, in calc_ders_range
deriv1, deriv2 = self.calculate_derivatives(preds, targets, weights)
File "/apps/python3/lib/python3.7/site-packages/treeboost_autograd/pytorch_objective.py", line 25, in calculate_derivatives
deriv1, deriv2 = self._calculate_derivatives(objective, preds)
File "/apps/python3/lib/python3.7/site-packages/treeboost_autograd/pytorch_objective.py", line 35, in _calculate_derivatives
deriv1, = torch.autograd.grad(objective, preds, create_graph=True)
File "/apps/python3/lib/python3.7/site-packages/torch/autograd/__init__.py", line 228, in grad
inputs, allow_unused, accumulate_grad=False)
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn```
Seems like the custom loss function wants `(preds: Tensor, target: Tensor)` as arguments to work.
That's true - the custom loss should expect its inputs to be torch tensors. This is what the implementation of `CatboostObjective` does: convert `numpy.ndarray` to `torch.Tensor`, call the loss, calculate 1st- and 2nd-order grads, convert back from `torch.Tensor` to `numpy.ndarray`.
Tell me if it still doesn't work, and I'll try to create a full working example with your loss.
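In pseudo-form, the flow is roughly this (a sketch of the idea, not the package's actual internals; only the first `torch.autograd.grad` call appears verbatim in the traceback above):

```python
import numpy as np
import torch

def calc_derivatives(loss_fn, preds_np: np.ndarray, targets_np: np.ndarray):
    # numpy -> torch, tracking gradients on the predictions
    preds = torch.tensor(preds_np, dtype=torch.float, requires_grad=True)
    targets = torch.tensor(targets_np, dtype=torch.float)

    # the user's loss must return a single scalar torch tensor
    objective = loss_fn(preds, targets)

    # 1st-order grads (this call shows up in the traceback)
    deriv1, = torch.autograd.grad(objective, preds, create_graph=True)
    # 2nd-order grads: for an element-wise loss, grad of the summed
    # 1st derivatives gives the Hessian diagonal (an assumption here)
    deriv2, = torch.autograd.grad(deriv1.sum(), preds)

    # torch -> numpy, as the booster expects
    return deriv1.detach().numpy(), deriv2.detach().numpy()
```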
I worked a bit more and yeah still not working unfortunately. I might be missing something in your repo. I appreciate the help :)