MAIF / melusine

📧 Melusine: Use python to automatize your email processing workflow
https://maif.github.io/melusine

_get_meta returns no columns in tutorial_7 #143

Closed benoitLebreton-perso closed 10 months ago

benoitLebreton-perso commented 2 years ago

Hello Melusine,

I am facing a problem with the _get_meta method, but I think I have a solution.

The _get_meta() function does not work because no columns are ever selected at the following lines: https://github.com/MAIF/melusine/blob/10424aaad749327c9ec405802adcb967c41a168f/melusine/models/train.py#L571-L573

Indeed, each element of column_list has '__' appended at the end because of the following line (meant to select only the dummified columns): https://github.com/MAIF/melusine/blob/10424aaad749327c9ec405802adcb967c41a168f/melusine/models/train.py#L564
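To make the mechanism concrete, here is a minimal sketch of that kind of prefix-based selection. It is not the melusine source: `select_meta_columns` and the column names in the two lists are hypothetical and only illustrate why a dataset holding the raw meta columns matches nothing.

```python
# Hypothetical stand-in for the selection described above (not melusine's code).
def select_meta_columns(columns, meta_input_list):
    # Each requested meta name gets '__' appended, so that only
    # dummified columns such as 'extension__gmail' can match.
    prefixes = tuple(name + "__" for name in meta_input_list)
    return [col for col in columns if col.startswith(prefixes)]


meta_input_list = ["extension", "attachment_type", "dayofweek", "hour", "min"]

# Dataset with the original (non-dummified) meta columns.
raw_columns = ["body", "extension", "attachment_type", "dayofweek", "hour", "min"]

# Dataset where the meta columns have been dummified (made-up names).
dummified_columns = ["body", "extension__gmail", "extension__other", "dayofweek__0", "hour__9"]

raw_selection = select_meta_columns(raw_columns, meta_input_list)
dummified_selection = select_meta_columns(dummified_columns, meta_input_list)

print(raw_selection)        # nothing starts with 'extension__', 'hour__', ...
print(dummified_selection)  # only the dummified meta columns are kept
```

With the raw columns the selection is empty, which matches the symptom reported in this issue.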

In the following image, I am using the debugger in tutorial_7_models.ipynb with a breakpoint in the _get_meta method. We can see that even though we want to use meta_input_list=['extension', 'attachment_type', 'dayofweek', 'hour', 'min'], the meta_input_list shown in the debugger is empty, because the dataset contains only the original columns and not the dummified ones.

image

This is all because the columns are not dummified at this step, although they should be (so as to have [extension__1, extension__2, ...] columns). In other words, this tutorial is missing an important step: the encoding of the meta.
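As an illustration of the missing encoding step, the sketch below one-hot encodes two meta columns so that their names carry the '__' separator. It uses `pandas.get_dummies` as a stand-in for whatever encoder the melusine pipeline applies (an assumption), and the toy DataFrame is made up for the example.

```python
import pandas as pd

# Toy dataset with raw (non-dummified) meta columns; values are invented.
df = pd.DataFrame(
    {
        "body": ["hello", "bonjour", "hi"],
        "extension": ["gmail", "maif", "gmail"],
        "dayofweek": [0, 3, 0],
    }
)

meta_input_list = ["extension", "dayofweek"]

# One-hot encode the meta columns; prefix_sep='__' yields names
# of the form '<column>__<value>'.
encoded = pd.get_dummies(df, columns=meta_input_list, prefix_sep="__")

# The same kind of prefix-based selection now finds the meta columns.
prefixes = tuple(name + "__" for name in meta_input_list)
meta_columns = [col for col in encoded.columns if col.startswith(prefixes)]

print(list(encoded.columns))
print(meta_columns)
```

After encoding, the original 'extension' and 'dayofweek' columns are replaced by one column per observed value, so a selection based on the '__' suffix is no longer empty.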

In fact, there is no problem in tutorial09_full_pipeline_quick.ipynb, because it dummifies the meta.

So I suggest one of the following:
A/ either we assume that tutorial07 does not use meta and we set `meta_input_list=[]`,
B/ or we add the meta columns to the dataset for this tutorial.

Mutualist regards ;)

HugoPerrier commented 10 months ago

New melusine v3.0.0 is out. Closing legacy issues