biomodhub / biomod2

BIOMOD is a computer platform for ensemble forecasting of species distributions, enabling the treatment of a range of methodological uncertainties in models and the examination of species-environment relationships.

GBM overfitting maybe? #488

Open ShreePoudel0 opened 2 months ago

ShreePoudel0 commented 2 months ago

My GBM score is consistently higher than those of the other models. I think this may be a case of overfitting.

bm_PlotEvalMean(myBiomodModelOut)
$tab
     name  mean1  mean2        sd1        sd2
1     ANN 0.8406 0.5584 0.05359561 0.07750011
2     CTA 0.8279 0.6146 0.14204495 0.26805439
3     FDA 0.8392 0.5544 0.01896664 0.04733850
4     GAM 0.8282 0.5124 0.01815244 0.04083354
5     GBM 0.9675 0.9055 0.07209908 0.15649228
6     GLM 0.8753 0.6172 0.02329545 0.06698225
7    MARS 0.9120 0.7050 0.02977322 0.08474275
8  MAXENT 0.8929 0.6450 0.01960414 0.04600966
9      RF 0.9644 0.8134 0.01585490 0.05394895
10    SRE 0.7014 0.4029 0.02367934 0.04758489

    full.name                   PA   run     algo    metric.eval  cutoff  sensitivity  specificity  calibration  validation  evaluation
 1  Pangolin_PA1_RUN1_GBM       PA1  RUN1    GBM     TSS           523.5      100.000         97.8        0.978       0.328          NA
 2  Pangolin_PA1_RUN1_GBM       PA1  RUN1    GBM     ROC           522.5      100.000         97.8        0.997       0.769          NA
 3  Pangolin_PA1_RUN1_RF        PA1  RUN1    RF      TSS            36.0       94.595         90.4        0.852       0.377          NA
 4  Pangolin_PA1_RUN1_RF        PA1  RUN1    RF      ROC            39.0       94.595         91.4        0.981       0.767          NA
 5  Pangolin_PA1_RUN1_GLM       PA1  RUN1    GLM     TSS           364.0       97.297         73.8        0.711       0.443          NA
 6  Pangolin_PA1_RUN1_GLM       PA1  RUN1    GLM     ROC           367.0       97.297         73.8        0.913       0.761          NA
 7  Pangolin_PA1_RUN1_GAM       PA1  RUN1    GAM     TSS           493.0       81.081         76.6        0.577       0.380          NA
 8  Pangolin_PA1_RUN1_GAM       PA1  RUN1    GAM     ROC           491.0       81.081         76.6        0.863       0.720          NA
 9  Pangolin_PA1_RUN1_MAXENT    PA1  RUN1    MAXENT  TSS           378.0       72.973         94.0        0.670       0.384          NA
10  Pangolin_PA1_RUN1_MAXENT    PA1  RUN1    MAXENT  ROC           376.5       72.973         94.0        0.918       0.718          NA
11  Pangolin_PA1_RUN1_SRE       PA1  RUN1    SRE     TSS           495.0       64.865         84.8        0.497       0.324          NA
12  Pangolin_PA1_RUN1_SRE       PA1  RUN1    SRE     ROC           500.0       64.865         84.8        0.748       0.662          NA
13  Pangolin_PA1_RUN1_ANN       PA1  RUN1    ANN     TSS           431.0       86.486         72.6        0.593       0.396          NA
14  Pangolin_PA1_RUN1_ANN       PA1  RUN1    ANN     ROC           432.5       86.486         73.0        0.847       0.733          NA
15  Pangolin_PA1_RUN1_CTA       PA1  RUN1    CTA     TSS           483.5       83.784         83.2        0.670       0.328          NA
16  Pangolin_PA1_RUN1_CTA       PA1  RUN1    CTA     ROC           488.0       83.784         83.2        0.878       0.682          NA
17  Pangolin_PA1_RUN1_FDA       PA1  RUN1    FDA     TSS           483.0       67.568         90.4        0.580       0.378          NA
18  Pangolin_PA1_RUN1_FDA       PA1  RUN1    FDA     ROC           486.0       67.568         90.6        0.873       0.760          NA
19  Pangolin_PA1_RUN1_MARS      PA1  RUN1    MARS    TSS           515.0       94.595         86.8        0.814       0.447          NA
20  Pangolin_PA1_RUN1_MARS      PA1  RUN1    MARS    ROC           515.0       94.595         86.8        0.949       0.801          NA
21  Pangolin_PA1_RUN2_GBM       PA1  RUN2    GBM     TSS           479.0      100.000         98.0        0.980       0.394          NA
22  Pangolin_PA1_RUN2_GBM       PA1  RUN2    GBM     ROC           481.0      100.000         98.0        0.996       0.790          NA
23  Pangolin_PA1_RUN2_RF        PA1  RUN2    RF      TSS            53.0       92.105         92.0        0.841       0.463          NA
24  Pangolin_PA1_RUN2_RF        PA1  RUN2    RF      ROC            55.0       92.105         92.4        0.974       0.788          NA
25  Pangolin_PA1_RUN2_GLM       PA1  RUN2    GLM     TSS           696.0       73.684         88.0        0.617       0.378          NA
26  Pangolin_PA1_RUN2_GLM       PA1  RUN2    GLM     ROC           701.5       73.684         88.2        0.871       0.779          NA
27  Pangolin_PA1_RUN2_GAM       PA1  RUN2    GAM     TSS           408.0       86.842         61.8        0.488       0.341          NA
28  Pangolin_PA1_RUN2_GAM       PA1  RUN2    GAM     ROC           412.0       86.842         62.6        0.809       0.759          NA
29  Pangolin_PA1_RUN2_MAXENT    PA1  RUN2    MAXENT  TSS           461.0       73.684         93.4        0.671       0.425          NA
30  Pangolin_PA1_RUN2_MAXENT    PA1  RUN2    MAXENT  ROC           464.5       73.684         93.6        0.888       0.796          NA
31  Pangolin_PA1_RUN2_SRE       PA1  RUN2    SRE     TSS           495.0       65.789         76.4        0.422       0.190          NA
32  Pangolin_PA1_RUN2_SRE       PA1  RUN2    SRE     ROC           500.0       65.789         76.4        0.711       0.595          NA
33  Pangolin_PA1_RUN2_ANN       PA1  RUN2    ANN     TSS           510.0       78.947         80.6        0.595       0.350          NA
34  Pangolin_PA1_RUN2_ANN       PA1  RUN2    ANN     ROC           508.5       78.947         80.6        0.828       0.705          NA
35  Pangolin_PA1_RUN2_CTA       PA1  RUN2    CTA     TSS           490.0       86.842         74.4        0.612       0.240          NA
36  Pangolin_PA1_RUN2_CTA       PA1  RUN2    CTA     ROC           491.5       86.842         74.4        0.818       0.652          NA
37  Pangolin_PA1_RUN2_FDA       PA1  RUN2    FDA     TSS           490.0       76.316         85.6        0.619       0.493          NA
38  Pangolin_PA1_RUN2_FDA       PA1  RUN2    FDA     ROC           490.0       76.316         85.6        0.838       0.775          NA
39  Pangolin_PA1_RUN2_MARS      PA1  RUN2    MARS    TSS           616.0       86.842         86.6        0.736       0.424          NA
40  Pangolin_PA1_RUN2_MARS      PA1  RUN2    MARS    ROC           616.5       86.842         86.8        0.923       0.765          NA
41  Pangolin_PA1_allRun_GBM     PA1  allRun  GBM     TSS           439.0       98.667         91.8        0.905          NA          NA
42  Pangolin_PA1_allRun_GBM     PA1  allRun  GBM     ROC           421.5      100.000         91.1        0.983          NA          NA
43  Pangolin_PA1_allRun_RF      PA1  allRun  RF      TSS            28.0       82.667         91.1        0.753          NA          NA
44  Pangolin_PA1_allRun_RF      PA1  allRun  RF      ROC            29.0       82.667         92.6        0.946          NA          NA
45  Pangolin_PA1_allRun_GLM     PA1  allRun  GLM     TSS           538.0       74.667         78.6        0.534          NA          NA
46  Pangolin_PA1_allRun_GLM     PA1  allRun  GLM     ROC           544.0       74.667         79.3        0.852          NA          NA
47  Pangolin_PA1_allRun_GAM     PA1  allRun  GAM     TSS           567.0       66.667         82.7        0.495          NA          NA
48  Pangolin_PA1_allRun_GAM     PA1  allRun  GAM     ROC           567.5       66.667         82.8        0.813          NA          NA
49  Pangolin_PA1_allRun_MAXENT  PA1  allRun  MAXENT  TSS           465.0       66.667         93.0        0.597          NA          NA
50  Pangolin_PA1_allRun_MAXENT  PA1  allRun  MAXENT  ROC           473.0       66.667         93.4        0.872          NA          NA
51  Pangolin_PA1_allRun_SRE     PA1  allRun  SRE     TSS           495.0       64.000         73.4        0.374          NA          NA
52  Pangolin_PA1_allRun_SRE     PA1  allRun  SRE     ROC           500.0       64.000         73.4        0.687          NA          NA
53  Pangolin_PA1_allRun_ANN     PA1  allRun  ANN     TSS           561.0       65.333         88.5        0.539          NA          NA
54  Pangolin_PA1_allRun_ANN     PA1  allRun  ANN     ROC           434.5       76.000         79.4        0.857          NA          NA
55  Pangolin_PA1_allRun_CTA     PA1  allRun  CTA     TSS           606.0       93.333         88.8        0.821          NA          NA
56  Pangolin_PA1_allRun_CTA     PA1  allRun  CTA     ROC           608.5       93.333         88.8        0.951          NA          NA
57  Pangolin_PA1_allRun_FDA     PA1  allRun  FDA     TSS           516.0       65.333         90.2        0.555          NA          NA
58  Pangolin_PA1_allRun_FDA     PA1  allRun  FDA     ROC           498.5       66.667         89.5        0.831          NA          NA
59  Pangolin_PA1_allRun_MARS    PA1  allRun  MARS    TSS           442.0       89.333         74.9        0.644          NA          NA
60  Pangolin_PA1_allRun_MARS    PA1  allRun  MARS    ROC           446.0       89.333         75.3        0.896          NA          NA
61  Pangolin_PA2_RUN1_GBM       PA2  RUN1    GBM     TSS           453.0      100.000         97.4        0.974       0.326          NA
62  Pangolin_PA2_RUN1_GBM       PA2  RUN1    GBM     ROC           468.0      100.000         97.6        0.994       0.742          NA
63  Pangolin_PA2_RUN1_RF        PA2  RUN1    RF      TSS            59.0       89.189         94.0        0.832       0.376          NA
64  Pangolin_PA2_RUN1_RF        PA2  RUN1    RF      ROC            59.0       89.189         94.0        0.965       0.796          NA
65  Pangolin_PA2_RUN1_GLM       PA2  RUN1    GLM     TSS           589.0       83.784         85.4        0.692       0.247          NA
66  Pangolin_PA2_RUN1_GLM       PA2  RUN1    GLM     ROC           596.5       83.784         86.0        0.903       0.696          NA
67  Pangolin_PA2_RUN1_GAM       PA2  RUN1    GAM     TSS           344.0       91.892         58.6        0.505       0.218          NA
68  Pangolin_PA2_RUN1_GAM       PA2  RUN1    GAM     ROC           344.0       91.892         58.6        0.834       0.700          NA
69  Pangolin_PA2_RUN1_MAXENT    PA2  RUN1    MAXENT  TSS           330.0       72.973         91.2        0.642       0.408          NA
70  Pangolin_PA2_RUN1_MAXENT    PA2  RUN1    MAXENT  ROC           335.5       72.973         91.4        0.897       0.805          NA
71  Pangolin_PA2_RUN1_SRE       PA2  RUN1    SRE     TSS           495.0       62.162         82.2        0.444       0.143          NA
72  Pangolin_PA2_RUN1_SRE       PA2  RUN1    SRE     ROC           500.0       62.162         82.2        0.722       0.572          NA
73  Pangolin_PA2_RUN1_ANN       PA2  RUN1    ANN     TSS           629.0       75.676         84.8        0.605       0.152          NA
74  Pangolin_PA2_RUN1_ANN       PA2  RUN1    ANN     ROC           535.5       81.081         79.6        0.868       0.681          NA
75  Pangolin_PA2_RUN1_CTA       PA2  RUN1    CTA     TSS           550.0       72.973         86.4        0.594       0.300          NA
76  Pangolin_PA2_RUN1_CTA       PA2  RUN1    CTA     ROC           553.5       72.973         86.4        0.818       0.660          NA
77  Pangolin_PA2_RUN1_FDA       PA2  RUN1    FDA     TSS           436.0       75.676         81.6        0.573       0.357          NA
78  Pangolin_PA2_RUN1_FDA       PA2  RUN1    FDA     ROC           442.0       75.676         82.6        0.842       0.761          NA
79  Pangolin_PA2_RUN1_MARS      PA2  RUN1    MARS    TSS           292.0      100.000         77.0        0.770       0.372          NA
80  Pangolin_PA2_RUN1_MARS      PA2  RUN1    MARS    ROC           296.0      100.000         77.0        0.933       0.763          NA
81  Pangolin_PA2_RUN2_GBM       PA2  RUN2    GBM     TSS           486.5      100.000         97.2        0.972       0.508          NA
82  Pangolin_PA2_RUN2_GBM       PA2  RUN2    GBM     ROC           484.5      100.000         97.2        0.992       0.829          NA
83  Pangolin_PA2_RUN2_RF        PA2  RUN2    RF      TSS            26.0       97.368         87.6        0.858       0.496          NA
84  Pangolin_PA2_RUN2_RF        PA2  RUN2    RF      ROC            27.0       97.368         88.4        0.979       0.785          NA
85  Pangolin_PA2_RUN2_GLM       PA2  RUN2    GLM     TSS           322.0      100.000         64.8        0.648       0.454          NA
86  Pangolin_PA2_RUN2_GLM       PA2  RUN2    GLM     ROC           326.0      100.000         65.8        0.886       0.798          NA
87  Pangolin_PA2_RUN2_GAM       PA2  RUN2    GAM     TSS           472.0       81.579         70.4        0.520       0.349          NA
88  Pangolin_PA2_RUN2_GAM       PA2  RUN2    GAM     ROC           478.5       81.579         71.8        0.835       0.771          NA
89  Pangolin_PA2_RUN2_MAXENT    PA2  RUN2    MAXENT  TSS           201.0       89.474         73.2        0.627       0.453          NA
90  Pangolin_PA2_RUN2_MAXENT    PA2  RUN2    MAXENT  ROC           207.5       89.474         73.6        0.887       0.803          NA

HeleneBlt commented 2 months ago

Hello Shree !

If you compare the calibration and validation scores, there is indeed some overfitting: for GBM in PA1 RUN1, for example, TSS drops from 0.978 in calibration to 0.328 in validation. The other algorithms show the same pattern.

Hélène 🌵
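
The comparison described above can be sketched in a few lines of base R. This is only an illustration, not biomod2 code: the data frame is typed in by hand from a handful of the TSS rows posted earlier in the thread (PA1, RUN1 and RUN2, for GBM, RF and GLM), and the `gap` column and the `gap_by_algo` object are names made up for this example. With a real run, a table with the same `calibration` and `validation` columns would presumably come from `get_evaluations(myBiomodModelOut)`.

```r
# Hand-copied subset of the TSS scores posted in this thread
# (PA1, RUN1 and RUN2). Not the full table.
evals <- data.frame(
  algo        = c("GBM", "RF", "GLM", "GBM", "RF", "GLM"),
  run         = c("RUN1", "RUN1", "RUN1", "RUN2", "RUN2", "RUN2"),
  metric.eval = "TSS",
  calibration = c(0.978, 0.852, 0.711, 0.980, 0.841, 0.617),
  validation  = c(0.328, 0.377, 0.443, 0.394, 0.463, 0.378)
)

# Hypothetical helper column: how much score is lost on held-out data.
evals$gap <- evals$calibration - evals$validation

# Mean gap per algorithm, largest gap first.
gap_by_algo <- aggregate(gap ~ algo, data = evals, FUN = mean)
gap_by_algo <- gap_by_algo[order(-gap_by_algo$gap), ]
print(gap_by_algo)
```

A larger mean gap means the algorithm loses more of its calibration score on held-out data. In this small subset GBM has the largest gap, followed by RF and then GLM.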

ShreePoudel0 commented 2 months ago

Hello. Thanks for the reply. How do we identify an overfitting problem, and how can we actually solve it? Are there any specific functions for this?


HeleneBlt commented 2 months ago

Hello Shree !

As you already noticed, comparing the calibration and validation scores across algorithms is a good way to detect overfitting. To reduce it, you can try different cross-validation strategies and a different calibration/validation ratio. You can also change the options of your algorithms: the documentation of the packages that we use (see ModelsTable) describes which parameters to change, and the Options vignette explains how to set them.

Hope it helps ! Hélène
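
To make the calibration/validation ratio concrete, here is a minimal base R sketch of a repeated random split. It does not call biomod2. The data are simulated, a plain `glm` stands in for the algorithms, and `auc`, `split_scores`, `perc` and `nb_rep` are hypothetical names chosen for this example. In recent biomod2 versions the corresponding settings are, as far as I know, the `CV.strategy`, `CV.perc` and `CV.nb.rep` arguments of `BIOMOD_Modeling`, but check the documentation of your installed version.

```r
set.seed(42)

# Simulated presence/absence data (hypothetical, not the Pangolin dataset).
n <- 300
env <- data.frame(temp = rnorm(n), rain = rnorm(n), noise = rnorm(n))
presence <- rbinom(n, 1, plogis(1.5 * env$temp - 1.0 * env$rain))
dat <- cbind(presence = presence, env)

# Rank-based AUC (Mann-Whitney form); tied predictions get average ranks.
auc <- function(obs, pred) {
  r  <- rank(pred)
  n1 <- sum(obs == 1)
  n0 <- sum(obs == 0)
  (sum(r[obs == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Repeated random split: `perc` is the share of rows used for calibration,
# `nb_rep` the number of repetitions. Returns the mean score on each part.
split_scores <- function(dat, perc, nb_rep) {
  out <- t(sapply(seq_len(nb_rep), function(i) {
    idx <- sample(nrow(dat), size = floor(perc * nrow(dat)))
    fit <- glm(presence ~ temp + rain + noise,
               family = binomial, data = dat[idx, ])
    p_cal <- predict(fit, newdata = dat[idx, ],  type = "response")
    p_val <- predict(fit, newdata = dat[-idx, ], type = "response")
    c(calibration = auc(dat$presence[idx],  p_cal),
      validation  = auc(dat$presence[-idx], p_val))
  }))
  colMeans(out)
}

scores_80 <- split_scores(dat, perc = 0.8, nb_rep = 10)
scores_50 <- split_scores(dat, perc = 0.5, nb_rep = 10)
print(rbind(perc_0.8 = scores_80, perc_0.5 = scores_50))
```

Averaging over several repetitions makes the calibration/validation comparison less dependent on one lucky or unlucky split. The same loop can be rerun with different model settings to see whether the gap shrinks.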