alegonz / baikal

A graph-based functional API for building complex scikit-learn pipelines.
https://baikal.readthedocs.io
BSD 3-Clause "New" or "Revised" License
592 stars 30 forks

Operation on predictions in a stack #14

Closed AntonBiryukovUofC closed 3 years ago

AntonBiryukovUofC commented 4 years ago

Hello @alegonz, thanks for such a cool tool!

I was wondering if the following is possible in the current implementation of baikal, or maybe there's a short hack to make it happen.

I am trying to build a stacked classifier model on sequences (think of time series where I can look into the future as well), with an LGBM and an RF as first-level predictors. My series are grouped (N groups), and all series within a group have equal length (M points). The labels are provided on a per-point basis.

What I would like to do is calculate the OOF first-level probabilities, compute lagged features of those probabilities, concatenate them, and then build a second-level meta classifier on that data.

Do you think this is possible as of now?
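
For illustration, the lagging step described above could be sketched with pandas as follows. This is a minimal sketch, not part of baikal: the helper `add_group_lags`, the column names `group` and `p_lgbm`, and the choice of one step back and one step forward are all assumptions made for the example. It assumes the rows are already ordered in time within each group.

```python
import numpy as np
import pandas as pd


def add_group_lags(df, prob_cols, group_col="group", lags=(-1, 1)):
    """Append shifted copies of each probability column, computed per group.

    Hypothetical helper. Positive lags look at earlier points, negative
    lags at later points. Rows without a neighbour inside their own group
    get NaN, so values never leak across group boundaries.
    """
    out = df.copy()
    grouped = df.groupby(group_col, sort=False)
    for col in prob_cols:
        for lag in lags:
            suffix = f"lag{lag}" if lag > 0 else f"lead{-lag}"
            out[f"{col}_{suffix}"] = grouped[col].shift(lag)
    return out


# Two groups of three points each, already ordered in time within each group.
toy = pd.DataFrame({
    "group": [0, 0, 0, 1, 1, 1],
    "p_lgbm": [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],
})
lagged = add_group_lags(toy, ["p_lgbm"])
print(lagged)
```

The NaN rows at group edges would need to be filled or dropped before fitting a meta classifier that cannot handle missing values.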

alegonz commented 4 years ago

@AntonBiryukovUofC Sorry for the late reply. Haven't been able to follow up on this project for the past week.

Thank you for your interest in this project! :)

I don't understand 100% the data structure you're using, but since you mention calculating OOF probabilities in the first level, it looks like you're using a protocol similar to the one described in this issue. Does the discussion there answer your question, by any chance? To summarize that thread: at present, these kinds of protocols that use OOF predictions on the first level can be implemented, albeit in a fairly manual way, but I'm considering extending the API to handle them in a simpler, shorter way.
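
As a rough idea of what "manual" can mean here, the sketch below computes group-aware OOF probabilities and fits a meta classifier using plain scikit-learn only. It does not use the baikal API and is not the approach from the linked issue. The synthetic data, the `GradientBoostingClassifier` stand-in for LGBM, the `LogisticRegression` meta classifier, and the number of folds are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupKFold, cross_val_predict

# Synthetic stand-in data: N groups of M points, one binary label per point.
rng = np.random.default_rng(0)
n_groups, n_points = 6, 20
groups = np.repeat(np.arange(n_groups), n_points)
X = rng.normal(size=(n_groups * n_points, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=len(X)) > 0).astype(int)

# GroupKFold keeps every group entirely inside one fold, so each OOF
# prediction comes from models that never saw that point's group.
cv = GroupKFold(n_splits=3)
first_level = [
    GradientBoostingClassifier(n_estimators=20, random_state=0),  # stand-in for LGBM
    RandomForestClassifier(n_estimators=20, random_state=0),
]

# One column of OOF positive-class probabilities per first-level model.
oof = [
    cross_val_predict(model, X, y, groups=groups, cv=cv, method="predict_proba")[:, 1]
    for model in first_level
]
meta_X = np.column_stack(oof)

# Second level: lagged versions of the OOF columns could be appended to
# meta_X here before fitting.
meta_clf = LogisticRegression().fit(meta_X, y)
print(meta_X.shape)
```

To predict on new data, the first-level models would still have to be refit on the full training set, since `cross_val_predict` only returns the predictions and discards the fitted fold models.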

alegonz commented 3 years ago

Closing due to inactivity. Feel free to re-open if you have further questions.