remove future leak - Githubissues

daxiongshu commented 9 years ago

Hi, when you said that you removed the future leak from the scaler, could you point me to the code?

In lines 68 and 82 of vali1_new_sub.py, it seems the scaler still calls fit_transform on the entire series, doesn't it?

SudalaiRajkumar commented 9 years ago

Hi Carl,

In line 68, we use fit_transform, so we fit the scaler on the whole training sample to get the mean and SD, and then transform that sample.

In line 82, however, we only call transform, so we reuse the mean and SD already computed from the training sample to transform the test set. If we used fit_transform on the test set here, it would be future leakage, since the mean and SD would be computed from the whole test set (including future values). Because we are only transforming, we are not using any future values, and hence it is legitimate.
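The difference between the two calls can be illustrated with a minimal sketch. It is not the code from vali1_new_sub.py: `SimpleStandardScaler` is a hypothetical, dependency-free stand-in that only mimics the fit / transform / fit_transform pattern of a standard scaler, and the `train` and `test` series are made-up numbers.

```python
from statistics import fmean, pstdev


class SimpleStandardScaler:
    """Hypothetical stand-in for a standard scaler with fit/transform methods."""

    def fit(self, values):
        # Learn the mean and population standard deviation from `values` only.
        self.mean_ = fmean(values)
        self.scale_ = pstdev(values) or 1.0  # fall back to 1.0 on constant data
        return self

    def transform(self, values):
        # Apply the stored statistics; nothing is re-estimated here.
        return [(v - self.mean_) / self.scale_ for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)


train = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up training series
test = [10.0, 20.0, 30.0]          # made-up test series

scaler = SimpleStandardScaler()
train_scaled = scaler.fit_transform(train)  # statistics come from train
test_scaled = scaler.transform(test)        # reuses the train mean and SD

# Leaky variant: re-fitting on the test series uses all of its values,
# including ones that lie in the future of any given test point.
leaky_test_scaled = SimpleStandardScaler().fit_transform(test)
```

In the leaky variant each scaled test value depends on later test values through the mean and SD, which is exactly what the transform-only call avoids.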

Please let me know if something is wrong in my reasoning. Then we can go and correct it.

Thank you.

SudalaiRajkumar commented 9 years ago

Carl, is the reasoning fine?

I have made a submission after adding electrode 30 to my previous version and changing the xgb params. I have updated the scripts on GitHub. It improved the LB score by 0.004.

daxiongshu commented 9 years ago

Hi, thank you. Yeah, it is definitely correct and very smart. I should have noticed it earlier. I'm running 20 base models now and trying to push the idea of reinforced stacking to its extreme. I'll let you know later.

Best

SudalaiRajkumar commented 9 years ago

That is great! Thank you.

daxiongshu commented 9 years ago

Hey, no luck so far. I'm trying to analyze each subject now to see if we need to calibrate their predictions.
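One way to look at each subject separately is to score the predictions per subject. The sketch below assumes AUC as the metric (suggested only by the name of the plot posted later in this thread) and uses made-up subject IDs, labels, and scores; `auc` and `per_subject_auc` are hypothetical helper names, not functions from the repository.

```python
from collections import defaultdict


def auc(labels, scores):
    """Rank-based AUC: share of (positive, negative) pairs ordered correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when one class is missing
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def per_subject_auc(subjects, labels, scores):
    """Group rows by subject ID and compute one AUC per subject."""
    groups = defaultdict(lambda: ([], []))
    for subj, y, s in zip(subjects, labels, scores):
        groups[subj][0].append(y)
        groups[subj][1].append(s)
    return {subj: auc(ys, ss) for subj, (ys, ss) in groups.items()}


# Made-up example: two subjects, four predictions each.
subjects = [1, 1, 1, 1, 2, 2, 2, 2]
labels = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.3, 0.6]
result = per_subject_auc(subjects, labels, scores)
```

A subject whose AUC sits well below the others would be a natural candidate for a closer look at how its predictions are scaled.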

SudalaiRajkumar commented 9 years ago

Please let me know what I could do next.

daxiongshu commented 9 years ago

Hey, I don't really have any good ideas for now. Maybe you can spend more time on other contests. I'll also start with coupon now.

Best regards,

daxiongshu commented 9 years ago

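[image: per_subject_auc] https://cloud.githubusercontent.com/assets/3236827/9290061/fa30007c-4351-11e5-9487-837428474250.png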

SudalaiRajkumar commented 9 years ago

Oh okay. I will try to do something.

Can you please share the CV results file of the neural networks, and also the submission file?

daxiongshu / Grasp-and-Lift

remove future leak #4