dmlc / treelite

Universal model exchange and serialization format for decision tree forests
https://treelite.readthedocs.io/en/latest/
Apache License 2.0

What do the three parameters in the predict function stand for? #128

Closed. karimkhanvi closed this issue 5 years ago

karimkhanvi commented 5 years ago

I converted my Python model to C using Treelite. In my Python code I only need to pass one argument, the data, to the model.

But in the generated code I see:

float predict_multiclass(union Entry* data, int pred_margin, float* result) {
...
}

It requires three parameters. (I am not good with C, so I would appreciate a layman's explanation.)

Also, there is no main function in the generated code, so I created one as below.

int main() 
{ 
    int num1;
    int num2;
    float f;
    scanf("%d", &num1);
    scanf("%d", &num2);
    scanf("%f", &f);

    union Entry t; 

    t.missing=num1;
    t.fvalue=f;
    t.qvalue=num2; 
    float output;  

    output=predict_multiclass(&t,10,0);
    printf("%f",output);
    return 0;
} 

Also, what values should I give for the parameters below?

    t.missing=num1;
    t.fvalue=f;
    t.qvalue=num2;

hcho3 commented 5 years ago

See https://treelite.readthedocs.io/en/latest/tutorials/deploy.html#id7.

Your model has an extra parameter, “result”, because it is a multiclass classifier and needs to output multiple probabilities, one per class.
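To make the three parameters concrete, here is a minimal sketch of a caller. The `predict_multiclass` body below is a hypothetical stand-in that writes fixed scores, included only so the snippet compiles on its own; in a real build you would compile against the generated file instead. `NUM_FEATURE`, `NUM_CLASS`, and the helper `classify` are assumptions for illustration, and the signature mirrors the one quoted in this issue. As I understand it, `pred_margin` is a flag: 0 asks for transformed outputs (probabilities), nonzero asks for raw margin scores.

```c
#include <assert.h>
#include <stddef.h>

/* One slot per feature, with the members named in this thread. */
union Entry {
  int missing;   /* marker for an absent feature */
  float fvalue;  /* the feature value, when present */
  int qvalue;    /* internal to the compiled model; not set by callers */
};

#define NUM_FEATURE 3  /* assumed; use your model's feature count */
#define NUM_CLASS 3    /* assumed; use your model's class count */

/* HYPOTHETICAL stand-in for the generated function so this compiles alone.
 * It ignores its input and writes fixed per-class scores. */
float predict_multiclass(union Entry* data, int pred_margin, float* result) {
  (void)data;
  (void)pred_margin;
  result[0] = 0.2f;
  result[1] = 0.7f;
  result[2] = 0.1f;
  return 0.0f;  /* return value is unused in this sketch */
}

/* Hypothetical helper: runs one row through the model and returns the
 * index of the class with the highest score. */
size_t classify(const float* features) {
  union Entry data[NUM_FEATURE];  /* parameter 1: an array, one Entry per feature */
  float result[NUM_CLASS];        /* parameter 3: buffer the model writes into */
  size_t best = 0;

  for (size_t i = 0; i < NUM_FEATURE; ++i) {
    data[i].fvalue = features[i];
  }
  predict_multiclass(data, 0, result);  /* parameter 2: pred_margin = 0 */

  for (size_t k = 1; k < NUM_CLASS; ++k) {
    if (result[k] > result[best]) best = k;
  }
  return best;
}
```

Note that the third argument must point to real storage with room for one float per class; passing 0 there, as in the `main` above, gives the function nowhere to write.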

karimkhanvi commented 5 years ago

Thanks @hcho3, the link above helped a lot. There is no mention of qvalue in the tutorial.

Can you please tell me what it is?

hcho3 commented 5 years ago

@karimkhanvi qvalue is used internally by the compiled lib and not meant to be used by your application.
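A minimal sketch of what that means for application code: only `missing` and `fvalue` are ever written, and `qvalue` is left alone. This assumes the convention mentioned in this thread, where -1 in `missing` marks an absent feature. The helper name `fill_entries` and the `present` mask are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

union Entry {
  int missing;
  float fvalue;
  int qvalue;  /* never touched by the application */
};

/* Hypothetical helper: copies n feature values into data.
 * present[i] == 0 means feature i has no value for this row. */
void fill_entries(union Entry* data, const float* values,
                  const int* present, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    data[i].missing = -1;  /* start every slot as "missing" */
    if (present[i]) {
      /* Union members share storage, so this overwrites the marker. */
      data[i].fvalue = values[i];
    }
  }
}
```

Because `Entry` is a union, each slot holds either the missing marker or a value, never both, which is why setting all three members in turn (as in the `main` earlier in the thread) leaves only the last assignment in effect.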

karimkhanvi commented 5 years ago

Thanks @hcho3. Sorry, this may look like a silly question, but I am very new to this!

I understand the code a bit better now. So I assigned -1 to the missing member, qvalue is not in use, and fvalue is 44 by default.

As the C program is not able to generate the features, I generated them by running a Python script. Here they are:

[6.264433270291558e-05, 2.176748170790401, 10.26869444705082, -9882.732180966965, -9001.28169022952, -11777.410784241983, -15504.294610698811, 0.12349022572211187, 0.9770779125345253, 0.20953658704264688, 0.8384903619153062, -88.29940587560249, 501.5713752989375, -134.57046479899873, 22.23407175578494, 15.535714285714286, 37.0, 15.232142857142858, 30.0, 0.40044642857142854, 0.55, 0.5351190476190476, 0.7333333333333333, 0.28317973739770985, 0.6264075397758475, 0.371472435944516, 0.47473190264229803, 0.13709272941939987, 0.010747832919949357, 0.00012289230107976275, 0.00023095432920910807, 0.0007216495221745678, 14.807711658248234, 1.1449448983946187, 2.0882342936278144, 1.6326947745983675, 4.692736022962345, 4.360735418549686, 0.13218915294781566, 11.26666244222763, 0.0035081125101797906, 3969.7239666170185, 0.2870397542852062, 0.14329799356286124]

Now my question is: how can I assign these values to the data parameter?

Also, the array above contains values in exponential notation with many digits of precision. Will the C code be able to understand them, or do I need to change the data type as well?

Here is the generated C code - https://gist.github.com/karimkhanvi/f48c4f59b96e4e707b74828c5414efe6
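For readers with the same two questions, a hedged sketch: C accepts exponential notation directly in float literals (with an `f` suffix), and `strtof` from the standard library parses the same notation from text. The sketch assumes the generated code stores features as `float`, as the `fvalue` member suggests; a `float` keeps roughly 7 significant decimal digits, so the extra digits are rounded on assignment. Only the first four values from the list above are shown, and the helper names `load_features` and `parse_feature` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

union Entry {
  int missing;
  float fvalue;
  int qvalue;
};

/* First four values from the list above, written as C float literals. */
static const float kFeatures[] = {
  6.264433270291558e-05f, 2.176748170790401f,
  10.26869444705082f, -9882.732180966965f,
};
#define NUM_SHOWN (sizeof kFeatures / sizeof kFeatures[0])

/* Hypothetical helper: copies the literals into the data array, one
 * Entry per feature. Returns how many slots were filled. */
size_t load_features(union Entry* data, size_t capacity) {
  size_t n = NUM_SHOWN < capacity ? NUM_SHOWN : capacity;
  for (size_t i = 0; i < n; ++i) {
    data[i].fvalue = kFeatures[i];
  }
  return n;
}

/* Hypothetical helper: parses one number as Python prints it,
 * including exponential notation such as "6.26e-05". */
float parse_feature(const char* text) {
  return strtof(text, NULL);
}
```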

hcho3 commented 5 years ago

If you are using Python, can you use the Treelite runtime?

karimkhanvi commented 5 years ago

Yes, I can use the Treelite runtime. But I see that Treelite takes time to run.

What is the benefit of using the Treelite runtime?

hcho3 commented 5 years ago

Mainly the ease of use.

karimkhanvi commented 5 years ago

Okay. Can you please answer the question in my previous comment, https://github.com/dmlc/treelite/issues/128#issuecomment-543567063?

karimkhanvi commented 5 years ago

Sorry to bug you. I managed to create a main function that reads the Python-generated features from a text file and predicts the result.

But I still have to generate the features in C code. Thanks @hcho3
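One way the text-file step could look, as a sketch only: the function below parses a Python-style list such as the one posted earlier into a float array. The name `parse_feature_list` is hypothetical, and it assumes the file contents have already been read into a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parses text such as "[1.5, -2e-3, 7]" into out.
 * Returns the number of values stored (at most capacity). */
size_t parse_feature_list(const char* text, float* out, size_t capacity) {
  size_t count = 0;
  const char* p = text;

  while (*p != '\0' && count < capacity) {
    char* end;
    float value = strtof(p, &end);  /* skips leading whitespace itself */
    if (end == p) {
      ++p;  /* no number here: step over '[', ',', ']' and similar */
    } else {
      out[count++] = value;
      p = end;
    }
  }
  return count;
}
```

The parsed values can then be copied into the `fvalue` member of each `union Entry` slot before calling the predict function. When reading from a file, the buffer needs to be large enough for the whole line; the 44-value list above is roughly a thousand characters long.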

hcho3 commented 5 years ago

I was about to suggest that as well. Yes, for best performance, you’d want to do feature generation in C too. The communication between C and Python code adds some overhead.
