Open daxiongshu opened 9 years ago
Hi Carl,
On line 68 we call fit_transform, so the scaler is fitted on the whole training sample: it computes the mean and SD from the training data and then transforms it.
On line 82 we call only transform, so the test set is transformed with the mean and SD already computed from the training data. If we called fit_transform on the test set instead, that would be future leakage, because the mean and SD would be computed from the whole test set (including future values). Since we only transform, no future values are used, so it is legitimate.
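To make the difference concrete, here is a minimal sketch of the reasoning above. It is not the code from vali1_new_sub.py: `SimpleScaler` and the toy `train` / `test` series are hypothetical stand-ins, assuming the script's scaler follows the usual scikit-learn-style fit / transform / fit_transform interface.

```python
from statistics import mean, pstdev


class SimpleScaler:
    """Hypothetical stand-in for a scikit-learn-style standard scaler."""

    def fit(self, values):
        # Learn the statistics from the data passed in, and only from it.
        self.mean_ = mean(values)
        self.scale_ = pstdev(values) or 1.0  # fall back to 1.0 for constant input
        return self

    def transform(self, values):
        # Reuse the stored statistics; nothing is re-estimated here.
        return [(v - self.mean_) / self.scale_ for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)


train = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy training series (assumed data)
test = [10.0, 20.0, 30.0]          # toy test series, later samples are "future"

scaler = SimpleScaler()
train_scaled = scaler.fit_transform(train)  # mean and SD come from train only
test_scaled = scaler.transform(test)        # test is scaled with train statistics

# The leaky variant: statistics are re-estimated from the whole test series,
# so every scaled test value depends on samples that come after it.
leaky_test_scaled = SimpleScaler().fit_transform(test)
```

In the non-leaky path, `test_scaled` depends only on statistics learned from `train`. In the leaky path, each scaled value depends on the mean and SD of the full test series, including samples that come later in time.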
Please let me know if something is wrong in my reasoning. If so, we can go and correct it.
Thank you.
On Friday, August 14, 2015, Jiwei Liu notifications@github.com wrote:
Hi, when you said that you removed the future leak from the scaler, could you point me to the code?
On lines 68 and 82 of vali1_new_sub.py, it seems the scaler still calls fit_transform on the entire series, doesn't it?
— Reply to this email directly or view it on GitHub https://github.com/daxiongshu/Grasp-and-Lift/issues/4.
Carl, is the reasoning fine?
I have made a submission after adding electrode 30 to my previous version and changing the xgb params. I have updated the scripts on GitHub. It improved the LB score by 0.004.
Hi, thank you. Yeah, it is definitely correct and very smart. I should have noticed it earlier. I'm running 20 base models now and trying to push the idea of reinforced stacking to its extreme. I'll let you know later.
Best
That is great! Thank you.
Hey, no luck so far. I'm trying to analyze each subject now to see if we need to calibrate their predictions.
Please let me know what I could do next.
Hey, I don't really have any good ideas for now. Maybe you can spend more time on other contests. I'll also start with coupon now.
Best regards,
Oh okay. I will try to do something.
Could you please share the CV results file for the neural networks, and also the submission file?
On Saturday, August 15, 2015, carl notifications@github.com wrote:
[image: per_subject_auc] https://cloud.githubusercontent.com/assets/3236827/9290061/fa30007c-4351-11e5-9487-837428474250.png