microsoft / SynapseML

Simple and Distributed Machine Learning
http://aka.ms/spark
MIT License

[Fatal] Tree model should contain cat_threshold field for lgbm #824

Open arijeetm1 opened 4 years ago

arijeetm1 commented 4 years ago

Describe the bug

I am seeing the following error while calculating the NDCG scores: [Fatal] Tree model should contain cat_threshold field

I'm not sure how mmlspark infers the value expected here: https://github.com/microsoft/LightGBM/blob/master/src/io/tree.cpp#L622 Is there some default threshold?

It would be great if someone could point me to some documentation as well.
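For context, the message appears to come from the LightGBM code that parses a saved model's text: when a tree declares categorical splits (`num_cat` greater than zero), the parser expects that tree to also carry a `cat_threshold` line, so no default is substituted. Below is a minimal diagnostic sketch, assuming the usual text layout of `Tree=N` blocks made of `key=value` lines. The helper name `find_trees_missing_cat_threshold` is hypothetical and is not part of mmlspark or LightGBM.

```python
def find_trees_missing_cat_threshold(model_text):
    """Return indexes of trees with num_cat > 0 but no cat_threshold line.

    Hypothetical helper: assumes each tree starts with a 'Tree=N' line,
    is described by 'key=value' lines, and that the tree section ends
    with an 'end of trees' line.
    """
    missing = []
    tree_index = None
    fields = {}

    def check(index, tree_fields):
        # A tree is suspicious if it declares categorical splits
        # but has no cat_threshold entry.
        if index is None:
            return
        if int(tree_fields.get("num_cat", "0")) > 0 and "cat_threshold" not in tree_fields:
            missing.append(index)

    for raw_line in model_text.splitlines():
        line = raw_line.strip()
        if line == "end of trees":
            break
        if line.startswith("Tree="):
            check(tree_index, fields)  # close out the previous tree
            tree_index = int(line.split("=", 1)[1])
            fields = {}
        elif tree_index is not None and "=" in line:
            key, value = line.split("=", 1)
            fields[key] = value

    check(tree_index, fields)  # close out the last tree
    return missing
```

Running this over the exported model text before scoring would show whether the model string itself is missing the field, or whether it is being lost somewhere between training and inference.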

To Reproduce

These are my model params:

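# Assumed import path for the mmlspark Python package (not shown in the original report)
from mmlspark.lightgbm import LightGBMRegressor

model = LightGBMRegressor(
    boostingType='gbdt',
    isProvideTrainingMetric=True,
    maxBin=255,
    numIterations=500,
    learningRate=0.3,
    numLeaves=127,
    earlyStoppingRound=20,
    # parallelism='serial',
    # num_threads=8,
    featureFraction=0.5,
    baggingFreq=1,
    baggingFraction=0.8,
    # min_data_in_leaf=20,
    minSumHessianInLeaf=0.001,
    # Slots 4 through 16 of the feature vector are treated as categorical
    categoricalSlotIndexes=[4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
)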

seasidemym commented 3 years ago

I'm hitting the same problem. Are there any suggestions?