model structure differences between ensemble/stacked/ensemble_stacked

Tonywhitemin commented 2 years ago

Good day! I am reading your manual now but can't tell the model structure differences between ensemble/stacked/ensemble_stacked...

The following pictures show JSON files from the example code. My questions are listed below; could you please help answer them?

What is the meaning of "repeat" here?
How can I understand the model structure for these three pictures?

ensemble.json

Optuna_extratrees_stacked/framework.json

Ensemble_stacked/ensemble.json

Best regards

pplonski commented 2 years ago

An Ensemble is a set of models that are not stacked. A model is stacked if it is trained with predictions from previous (not stacked) models. A Stacked ensemble is a set of both stacked and not stacked models (all available models).

The repeat is the weight of the model in the ensemble.

Did you get good results with AutoML?

Tonywhitemin commented 2 years ago

Hi pplonski, thanks for your quick reply!

For stacked models, I would still like to check how the stacking is done. For example, the following picture shows that five models are used for stacking: original_LightGBM, original_Xgboost, original_Neural Network, original_Random Forest and original_Extra Trees.

Are they stacked sequentially, as in the picture below?

Another question is about the weights: the graph below shows that the weights sum to 23+21+1+7+1+10+40=103. Is it normal for the sum to exceed 100%, or am I misunderstanding the rules?

So far I have only used the sample code to practice with this AutoML tool, and the results look good to me. I will use it on a medical dataset once I understand the tool better. Thanks again for your kindness!

pplonski commented 2 years ago

Hi @Tonywhitemin!

In the first picture you selected models that were trained with the Optuna framework.
Here is a definition of stacking: https://en.wikipedia.org/wiki/Ensemble_learning#Stacking. In MLJAR AutoML, the best models are selected and their predictions are added to the original data. This data is then used to train the stacked models.
Each model in the ensemble has a weight. After all predictions in the ensemble are summed, the result is normalized by the total sum of the weights, so the weights can sum to any value.
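
The normalization described above can be sketched as follows. This is only an illustration, not the library's actual code: the function name `weighted_ensemble` and the model names are hypothetical, and it assumes each `repeat` value simply multiplies that model's predictions before the sum is divided by the total of all `repeat` values.

```python
import numpy as np


def weighted_ensemble(predictions, repeats):
    """Combine per-model predictions using `repeat` values as weights.

    predictions: dict mapping a model name to its predictions (array-like)
    repeats: dict mapping the same model names to their `repeat` weights
    """
    total = float(sum(repeats.values()))
    # Sum the predictions, counting each model `repeat` times.
    summed = sum(
        repeats[name] * np.asarray(predictions[name], dtype=float)
        for name in repeats
    )
    # Normalize by the total weight, so the weights need not sum to 100.
    return summed / total


# Weights taken from the question above; they sum to 103.
repeats = {f"model_{i}": w for i, w in enumerate([23, 21, 1, 7, 1, 10, 40])}
predictions = {name: [0.2, 0.8] for name in repeats}
print(weighted_ensemble(predictions, repeats))
```

Because of the division by the total, only the ratio between the weights matters, which is why a sum of 103 is not a problem.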

I'm happy to help. Good luck with your data.

Tonywhitemin commented 2 years ago

Hi @pplonski, thanks for your help! As you said, "in MLJAR AutoML the best models are selected and their predictions are added to the original data. Such data is used to train stacked models." So the five models listed in the picture below are "the best models" you mentioned, is that correct? If so, which algorithms will the final "stacked models" use?

pplonski commented 2 years ago

Can you post the full framework.json file? There might be more models stacked, for example with golden features.

There should be information in the framework.json file about which algorithms are used for stacking.

Tonywhitemin commented 2 years ago

Hi @pplonski, please refer to the attachment framework.txt.txt, taken from the path "Optuna_extratrees_stacked/framework.json". If this file contains information that I missed, please tell me. Thanks for your time!

pplonski commented 2 years ago

There should be a params.json or framework.json file in the main directory. There should be a separate file with info about golden features as well. Could you send them? Thanks!

Tonywhitemin commented 2 years ago

Hi @pplonski, the attached params.txt.txt is the params.json file shown in the picture below, for your reference. (BTW, in Optuna mode there is no golden_feature.json file.)

Part of the file information is shown below, but I still can't understand the structure of the final stacked model... Could you help with that? Thanks!

pplonski commented 2 years ago

Here is what the data for stacked models looks like (based on your params):

Original input data plus predictions from previous models are concatenated and form a new input vector.
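
As a rough sketch of that concatenation (not MLJAR's internal implementation): the helper name `build_stacked_input` and the model names below are hypothetical, and the example assumes each previous model contributes one or more prediction columns with one row per sample.

```python
import numpy as np


def build_stacked_input(X, base_predictions):
    """Append predictions from previous models to the original features.

    X: original input data, shape (n_samples, n_features)
    base_predictions: dict mapping a model name to its predictions,
        shape (n_samples,) or (n_samples, n_outputs)
    """
    X = np.asarray(X, dtype=float)
    columns = [X]
    # Sort the names so the column order is deterministic.
    for name in sorted(base_predictions):
        p = np.asarray(base_predictions[name], dtype=float)
        if p.ndim == 1:
            p = p.reshape(-1, 1)
        if p.shape[0] != X.shape[0]:
            raise ValueError(f"{name}: expected {X.shape[0]} rows, got {p.shape[0]}")
        columns.append(p)
    # New input vector = original features + prediction columns.
    return np.hstack(columns)


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
base_predictions = {
    "lightgbm": [0.1, 0.7, 0.9],  # hypothetical model names and values
    "xgboost": [0.2, 0.6, 0.8],
}
print(build_stacked_input(X, base_predictions).shape)
```

In stacking generally, the predictions added for training rows are usually out-of-fold predictions, so that a stacked model does not see predictions made by a model that was fitted on the same rows.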

Tonywhitemin commented 2 years ago

Thanks for your picture, @pplonski! I would like to check whether the stacked model is explainable. If so, what algorithm will the final stacked model select?

pplonski commented 2 years ago

The stacked model can be explainable, but this is not implemented in MLJAR AutoML.

I don't understand the second question.

Tonywhitemin commented 2 years ago

Sorry for the confusion... Here is the second question, with an example. The following picture is from a journal article: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0205872 The authors also used a level-1 stacked model, and they chose a support vector machine as the level-1 algorithm. Can we find out which algorithm the level-1 stacked model uses in MLJAR AutoML?

pplonski commented 2 years ago

There can be several different algorithms at Level-1 (from the image above). For example:

1_Optuna_LightGBM_Stacked - means that the LightGBM algorithm was trained with stacked data,
5_Optuna_ExtraTrees_Stacked - means that the Extra Trees algorithm was trained with stacked data, so you have several models at Level-1.

Then you have the next level, Level-2 (not in the image): it is Ensemble_Stacked, which ensembles all available models from Level-0 and Level-1.
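
The level structure above can be read off the model names. The helper below is a hypothetical illustration, not part of MLJAR AutoML; it assumes that the `_Stacked` suffix reliably marks Level-1 models, and it leaves the plain `Ensemble` unassigned because this thread does not give it a level.

```python
def stacking_level(model_name):
    """Return the stacking level implied by a model name, or None if unknown."""
    if model_name == "Ensemble_Stacked":
        return 2  # ensembles all models from Level-0 and Level-1
    if model_name == "Ensemble":
        return None  # not stacked; no level is assigned to it in this thread
    if model_name.endswith("_Stacked"):
        return 1  # trained on original data plus predictions
    return 0  # trained on original data only


# "1_Optuna_LightGBM" is an illustrative Level-0 name, not taken from the thread.
for name in ["1_Optuna_LightGBM", "1_Optuna_LightGBM_Stacked",
             "5_Optuna_ExtraTrees_Stacked", "Ensemble_Stacked"]:
    print(name, "-> level", stacking_level(name))
```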

Tonywhitemin commented 2 years ago

I really appreciate your help, @pplonski ! Now I understand! Thanks for your time!

pplonski commented 2 years ago

@Tonywhitemin would you like to help improve MLJAR AutoML docs? The docs are here https://github.com/mljar/docs and are written in Markdown. There might be a separate page about stacking, what do you think? :)

Tonywhitemin commented 2 years ago

If you don't mind my English, I would like to give it a try! Could you tell me how I can help? :)

pplonski commented 2 years ago

Let's create a page in the docs titled "How does ensemble stacking work?". There you can describe how models at all levels are trained. You can start by forking the docs and working on your local copy. When you are ready, open a PR (pull request) and I will review your work (and maybe add something). If all is good, I will deploy a new version of the docs to the server.

Tonywhitemin commented 2 years ago

Got it! I will try, thanks!

Tonywhitemin commented 2 years ago

Hi @pplonski, I uploaded the file here: https://github.com/Tonywhitemin/docs Please check whether there are any mistakes, thanks!

pplonski commented 2 years ago

Thank you @Tonywhitemin! The description is good! I have two coding comments:

Let's name the file ensemble-stacking.md
Paths for images should point to a directory in the repository. Please place all images in docs/images/ and reference them as:
```
![image description](/docs/images/clip_image002.gif)
```
Are you the author of the images?

Tonywhitemin commented 2 years ago

Thanks @pplonski! I modified the image paths and re-uploaded the Markdown file as you suggested. Please check whether it has been fixed, thanks! By the way, the images in the images folder come from results I ran myself, and I made the process flow image myself.

pplonski commented 2 years ago

Thank you @Tonywhitemin!

I've made small fixes in your docs (you can check them here https://github.com/mljar/docs/commit/2ca0391fde8af5ce39ad21ac19a85c8eb9f7ec15 and https://github.com/mljar/docs/commit/e19d68ae5157f88ca9af5435fe51669a18a0e1f6)

Your docs are already on the server: https://supervised.mljar.com/features/stacking-ensemble/

Tonywhitemin commented 2 years ago

I'm glad I can put some effort into this nice tool, thank you so much @pplonski!

Tonywhitemin commented 2 years ago

Hi @pplonski, I read the section on modes at this link: https://supervised.mljar.com/features/modes/ The "total models tuned for each algorithm" is shown in the image below.

The numbers here seem reasonable. But the "Custom modes" section shows the number of unstacked models as 10+3x3x2=28... May I ask why it needs to be multiplied by 2? Thank you!

pplonski commented 2 years ago

In each hill climbing step, it tries to train 2 new models from each selected previous model.

Hill climbing algorithm:

select the n top algorithms according to the metric value,
for each selected algorithm, try to train 2 new models,
it is not guaranteed that 2 new models will be trained, because hill climbing might fail to create a new set of hyper-parameters (but it will try).
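
The arithmetic behind 10+3x3x2=28 can be written out as a small sketch. The function and parameter names are hypothetical, and the example assumes the factors stand for 10 initial models, 3 hill climbing steps, and 3 top algorithms selected per step; only the factor 2 (new models tried per selected algorithm) is stated in this thread.

```python
def max_unstacked_models(initial_models, steps, top_models, new_per_model=2):
    """Upper bound on unstacked models tuned for one algorithm.

    Each hill climbing step selects `top_models` algorithms and tries to
    train `new_per_model` new models from each of them.
    """
    return initial_models + steps * top_models * new_per_model


# 10 + 3 * 3 * 2 = 28
print(max_unstacked_models(initial_models=10, steps=3, top_models=3))
```

This is an upper bound rather than an exact count, because a hill climbing step can fail to produce a new set of hyper-parameters.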

Tonywhitemin commented 2 years ago

Got it! But based on this calculation method, should the numbers shown below be modified?

pplonski commented 2 years ago

You are right! Would you fix this in the docs?

Tonywhitemin commented 2 years ago

Hi @pplonski, I edited the doc and opened a pull request, shown below. Could you check if it is OK? Thanks!

pplonski commented 2 years ago

it's ok @Tonywhitemin - thank you!

mljar / mljar-supervised

model structure differences between ensemble/stacked/ensemble_stacked #544