shivaraj1994 closed this issue 3 years ago
Did you find a workaround for this? I am having the same problem as well ...
Hello,
I don't have the full picture on what scores are included in the dataset; however, I'm guessing one of two possibilities:
(... gain_type='identity' in the NDCG constructor)
One could restructure the problem to be in log space, avoiding the exponential. This is how many numerical algorithms deal with exponentials (and potential overflow).
That's indeed what the above bit of code does :)
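For anyone landing here later, this is a rough, generic sketch of the log-space idea discussed above. It is not pyltr's code; the helper names (log_gain, logsumexp, log_dcg, ndcg_log_space) are made up for illustration, and it assumes the usual 2**rel - 1 gain with a log2 position discount. The point is only that the large exponential is never evaluated directly, so very large relevance labels do not raise OverflowError.

```python
import math

NEG_INF = float("-inf")

def log_gain(rel):
    """Return log(2**rel - 1) without ever evaluating 2**rel."""
    if rel <= 0:
        return NEG_INF  # gain is 0, so its log is -inf
    ln2 = math.log(2.0)
    # 2**rel - 1 == 2**rel * (1 - 2**-rel); the second factor cannot overflow
    return rel * ln2 + math.log1p(-math.exp(-rel * ln2))

def logsumexp(values):
    """Return log(sum(exp(v))) by factoring out the largest term first."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_dcg(rels, k=10):
    """Log of DCG@k, with discount log2(position + 1) for 1-based positions."""
    terms = [log_gain(r) - math.log(math.log2(i + 2))
             for i, r in enumerate(rels[:k])]
    return logsumexp(terms)

def ndcg_log_space(rels, k=10):
    """NDCG@k computed as exp(log DCG - log ideal DCG)."""
    ideal = log_dcg(sorted(rels, reverse=True), k)
    if ideal == NEG_INF:
        return 0.0  # no relevant documents at all
    return math.exp(log_dcg(rels, k) - ideal)

if __name__ == "__main__":
    print(ndcg_log_space([3, 2, 0, 1]))
    # Labels this large would overflow a float if 2.0 ** rel were computed directly
    print(ndcg_log_space([2000, 1500, 1800]))
```

The final ratio is taken as a difference of logs, so only the last step leaves log space, and by then the value is at most 1.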
I am trying to train using the MQ2007-list dataset.
import pyltr

with open('/home/shivaraj/Downloads/MQ2007-list/Fold1/train.txt') as trainfile, \
        open('/home/shivaraj/Downloads/MQ2007-list/Fold1/vali.txt') as valifile, \
        open('/home/shivaraj/Downloads/MQ2007-list/Fold1/test.txt') as evalfile:
    TX, Ty, Tqids, T_ = pyltr.data.letor.read_dataset(trainfile)
    VX, Vy, Vqids, V_ = pyltr.data.letor.read_dataset(valifile)
    EX, Ey, Eqids, E_ = pyltr.data.letor.read_dataset(evalfile)
metric = pyltr.metrics.NDCG(k=10)
# Only needed if you want to perform validation (early stopping & trimming)
monitor = pyltr.models.monitors.ValidationMonitor(
    VX, Vy, Vqids, metric=metric, stop_after=250)

model = pyltr.models.LambdaMART(
    metric=metric,
    n_estimators=1000,
    learning_rate=0.02,
    max_features=0.5,
    query_subsample=0.5,
    max_leaf_nodes=10,
    min_samples_leaf=64,
    verbose=1,
)
model.fit(TX, Ty, Tqids, monitor=monitor)
This is the error log:
OverflowError Traceback (most recent call last)