VowpalWabbit / vowpal_wabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
https://vowpalwabbit.org

VWRegressor provides very different performance for loss_function = 'quantile' , quantile_tau = 0.5 and loss_function = 'squared' #3488

Closed Sandy4321 closed 2 years ago

Sandy4321 commented 2 years ago

Describe the bug

VWRegressor provides very different performance for loss_function = 'quantile' , quantile_tau = 0.5 and loss_function = 'squared'

loss_function = 'squared' provides a very GOOD (low) MAE
loss_function = 'quantile', quantile_tau = 0.5 provides a bad (high) MAE

The data is a mixture of categorical and continuous data: 600 rows like this

image

To Reproduce

        if 1:
            model = VWRegressor(convert_to_vw = False, normalized = True,
                                passes = passes,
                                power_t = 0.5,  # 1.0,
                                readable_model = 'my_VW.model', cache_file = 'my_VW.cache',
                                learning_rate = 2.3, l2 = l2, l1 = l1,
                                quadratic = 'CC', cubic = 'CCC',
                                loss_function = 'quantile', quantile_tau = 0.5)
            q = 0
        else:
            model = VWRegressor(convert_to_vw = False, normalized = True,
                                passes = passes,
                                power_t = 0.5,  # 1.0,
                                readable_model = 'my_VW.model', cache_file = 'my_VW.cache',
                                learning_rate = 2.1, loss_function = 'squared', l2 = l2, l1 = l1,
                                quadratic = 'CC', cubic = 'CCC')

Expected behavior

My guess is that the MAE for loss_function = 'quantile', quantile_tau = 0.5 and loss_function = 'squared' should be very similar.

In addition, loss_function = 'quantile', quantile_tau = 0.9 and loss_function = 'quantile', quantile_tau = 0.1 give very wide confidence intervals - even nonsensical confidence intervals.

Observed Behavior

How did VW behave? Please include any stack trace, log messages or crash logs.

Environment

What version of VW did you use? The latest. OS: Windows 10.

Additional context

Do you have a code example where VWRegressor is used with loss_function = 'quantile', quantile_tau = 0.9 and loss_function = 'quantile', quantile_tau = 0.1?

Sandy4321 commented 2 years ago

data

Sandy4321 commented 2 years ago

A B C D E F G target a11 b45 c21 d3 e39 0.9 89 340 a9 b47 c5 d3 e11 10 15 12 a9 b32 c101 d3 e11 443 34 30 a8 b47 c4 d8 e11 22 16 122 a9 b48 c4 d8 e11 18 17 122 a7 b32 c4 d3 e79 19 -8.7 30 a11 b46 c4 d8 e11 20 32 345 a11 b45 c100 d13 e21 0.9 -5.871428571 330 a11 b32 c6 d3 e1 843 34 30 a6 b34 c4 d8 e11 22 -1.228571429 122 a12 b46 c4 d18 e1 43.1 35 67.33333333 a16 b33 c102 d13 e9 118 3.455064935 171.3333333 a14 b33 c5 d18 e9 64.2 36 275.3333333 a13 b47 c100 d13 e19 85.3 7.874165834 379.3333333 a11 b34 c451 d23 e1 1000 37 483.3333333 a11 b33 c6 d28 e29 106.4 11.95932334 587.3333333 a14 b48 c100 d18 e39 127.5 38 691.3333333 a13 b35 c451 d33 e39 463 15.71684461 795.3333333 a14 b47 c7 d38 e49 148.6 39 899.3333333 a15 b49 c451 d53 e59 169.7 19.17643059 1003.333333 a13 b36 c102 d43 e69 791 40 1107.333333 a15 b33 c8 d48 e69 190.8 22.37279412 1211.333333 a16 b50 c5 d13 e79 211.9 41 1315.333333 a8 b37 c21 d53 e69 252 25.33916073 1419.333333 a12 b33 c101 d58 e89 233 25.39367549 1523.333333 a11 b33 c51 d38 e21 249.3573394 42 1627.333333 a6 b48 c5 d63 e39 263.2949541 28.04913728 1731.333333 a13 b46 c451 d28 e11 277.2325688 43 1835.333333 a8 b34 c100 d13 e1 291.1701835 30.54894364 1939.333333 a9 b32 c6 d18 e49 305.1077982 44 2043.333333 a11 b45 c21 d48 e9 0.9 89 340 a6 b33 c21 d38 e1 846 40.5387825 12 a8 b32 c8 d3 e11 443 34 30 a15 b35 c4 d8 e11 22 41.11596309 122 a14 b48 c4 d8 e11 741 59 122 a8 b32 c102 d3 e21 743 42.49636402 30 a11 b32 c8 d8 e11 383 32 345 a11 b45 c6 d23 e21 0.9 42.66740662 330 a11 b32 c21 d3 e69 51 34 30 a11 b49 c4 d8 e11 22 60 122 a12 b46 c100 d53 e1 43.1 35 67.33333333 a7 b33 c21 d13 e39 873 70 171.3333333 a13 b45 c5 d18 e9 64.2 36 275.3333333 a13 b47 c100 d3 e19 85.3 80 379.3333333 a11 b34 c101 d23 e11 116 37 483.3333333 a6 b48 c6 d28 e29 106.4 90 587.3333333 a14 b48 c21 d33 e39 127.5 38 691.3333333 a14 b35 c4 d33 e39 306 55.17326637 795.3333333 a9 b49 c7 d38 e49 148.6 39 899.3333333 a15 b49 c451 d38 e59 169.7 55.42823691 1003.333333 a8 b36 c451 d43 e79 975 40 1107.333333 a9 b47 c8 d48 e69 190.8 22 1211.333333 a16 b50 c102 d3 e79 211.9 41 1315.333333 a14 b37 c101 d53 e9 615 25 1419.333333 a12 b46 c7 d58 e89 233 71 1523.333333 a16 b33 c51 d8 e21 249.3573394 42 1627.333333 a7 b33 c5 d63 e11 263.2949541 79 1731.333333 a13 b46 c102 d38 e11 277.2325688 43 1835.333333 a8 b34 c100 d13 e1 291.1701835 2 1939.333333 a14 b49 c6 d18 e79 305.1077982 44 2043.333333 a11 b45 c21 d33 e69 0.9 89 340 a9 b49 c6 d28 e39 562 23 12 a13 b32 c451 d3 e79 443 34 30 a16 b33 c4 d8 e11 22 83 122 a8 b33 c4 d8 e11 520 24 122 a13 b32 c451 d3 e19 403 9 30 a11 b45 c102 d8 e11 913 32 345 a11 b45 c21 d23 e21 0.9 19 330 a13 b32 c102 d3 e21 308 34 30 a12 b35 c4 d8 e11 22 21 122 a12 b46 c101 d18 e1 43.1 35 67.33333333 a12 b33 c102 d13 e21 550 32 171.3333333 a6 b48 c5 d18 e9 64.2 36 275.3333333 a13 b47 c7 d23 e19 85.3 76 379.3333333 a13 b34 c6 d23 e19 180 37 483.3333333 a16 b47 c6 d28 e29 106.4 93 587.3333333 a14 b48 c21 d23 e39 127.5 38 691.3333333 a15 b35 c451 d33 e1 181 80 795.3333333 a9 b45 c7 d38 e49 148.6 39 899.3333333 a15 b49 c7 d8 e59 169.7 3 1003.333333 a7 b36 c5 d43 e29 431 40 1107.333333 a15 b37 c8 d48 e69 190.8 19 1211.333333 a16 b50 c8 d8 e79 211.9 41 1315.333333 a15 b37 c8 d53 e29 187 26 1419.333333 a12 b33 c8 d58 e89 233 3 1523.333333 a13 b33 c51 d28 e21 249.3573394 42 1627.333333 a7 b37 c5 d63 e69 263.2949541 71 1731.333333 a13 b46 c100 d38 e11 277.2325688 43 1835.333333 a12 b34 c100 d13 e1 291.1701835 1 1939.333333 a7 b49 c6 d18 e39 305.1077982 44 2043.333333 a11 b45 c21 d38 e49 0.9 89 340 a16 b46 c451 
d38 e21 782 34 12 a7 b32 c7 d3 e39 443 34 30 a14 b49 c4 d8 e11 22 43 122 a9 b37 c4 d8 e11 727 71 122 a6 b32 c8 d3 e9 873 53 30 a11 b46 c21 d8 e11 400 32 345 a11 b45 c7 d33 e21 0.9 41 330 a15 b32 c6 d3 e39 138 34 30 a13 b48 c4 d8 e11 22 68 122 a12 b46 c4 d18 e1 43.1 35 67.33333333 a11 b33 c101 d13 e21 100 11 171.3333333 a7 b33 c5 d18 e9 64.2 36 275.3333333 a13 b47 c8 d38 e19 85.3 21 379.3333333 a9 b34 c451 d23 e79 431 37 483.3333333 a11 b45 c6 d28 e29 106.4 80 587.3333333 a14 b48 c4 d53 e39 127.5 38 691.3333333 a6 b35 c8 d33 e29 60 5 795.3333333 a16 b46 c7 d38 e49 148.6 39 899.3333333 a15 b49 c100 d33 e59 169.7 96 1003.333333 a14 b36 c8 d43 e1 441 40 1107.333333 a14 b33 c8 d48 e69 190.8 77 1211.333333 a16 b50 c8 d8 e79 211.9 41 1315.333333 a6 b37 c451 d53 e69 914 99 1419.333333 a12 b49 c7 d58 e89 233 58 1523.333333 a12 b33 c51 d28 e21 249.3573394 42 1627.333333 a11 b32 c5 d63 e69 263.2949541 97 1731.333333 a13 b46 c8 d23 e11 277.2325688 43 1835.333333 a6 b34 c100 d13 e1 291.1701835 93 1939.333333 a16 b33 c6 d18 e49 305.1077982 44 2043.333333 a11 b45 c21 d23 e79 0.9 89 340 a12 b33 c5 d13 e29 209 90 12 a8 b32 c100 d3 e79 443 34 30 a8 b45 c4 d8 e11 22 86 122 a11 b49 c4 d8 e11 297 85 122 a11 b32 c21 d3 e39 511 48 30 a11 b33 c7 d8 e11 658 32 345 a11 b45 c7 d48 e21 0.9 60 330 a12 b32 c8 d3 e79 657 34 30 a15 b34 c4 d8 e11 22 91 122 a12 b46 c100 d8 e1 43.1 35 67.33333333 a9 b33 c21 d13 e49 825 42 171.3333333 a6 b48 c5 d18 e9 64.2 36 275.3333333 a13 b47 c451 d18 e19 85.3 54 379.3333333 a13 b34 c5 d23 e11 997 37 483.3333333 a15 b37 c6 d28 e29 106.4 41 587.3333333 a14 b48 c5 d8 e39 127.5 38 691.3333333 a13 b35 c100 d33 e69 967 44 795.3333333 a16 b46 c7 d38 e49 148.6 39 899.3333333 a15 b49 c21 d13 e59 169.7 55 1003.333333 a9 b36 c21 d43 e79 834 40 1107.333333 a7 b45 c8 d48 e69 190.8 61 1211.333333 a16 b50 c5 d38 e79 211.9 41 1315.333333 a14 b37 c21 d53 e9 883 74 1419.333333 a12 b48 c7 d58 e89 233 14 1523.333333 a6 b33 c51 d33 e21 249.3573394 42 1627.333333 a13 b49 c5 d63 e21 263.2949541 67 1731.333333 a13 b46 c100 d13 e11 277.2325688 43 1835.333333 a16 b34 c7 d13 e1 291.1701835 63 1939.333333 a8 b45 c6 d18 e39 305.1077982 44 2043.333333 a11 b45 c21 d18 e49 0.9 89 340 a8 b35 c100 d3 e69 567 88 12 a9 b32 c451 d3 e69 443 34 30 a11 b37 c4 d8 e11 22 69 122 a16 b48 c4 d8 e11 1000 1 122 a12 b32 c451 d3 e49 750 100 30 a11 b34 c5 d8 e11 792 32 345 a11 b45 c451 d48 e21 0.9 20 330 a8 b32 c7 d3 e1 299 34 30 a6 b33 c4 d8 e11 22 71 122 a12 b46 c7 d8 e1 43.1 35 67.33333333 a11 b33 c100 d13 e39 441 96 171.3333333 a16 b32 c5 d18 e9 64.2 36 275.3333333 a13 b47 c4 d38 e19 85.3 23 379.3333333 a16 b34 c21 d23 e1 310 37 483.3333333 a16 b33 c6 d28 e29 106.4 17 587.3333333 a14 b48 c451 d38 e39 127.5 38 691.3333333 a7 b35 c4 d33 e19 405 33 795.3333333 a13 b45 c7 d38 e49 148.6 39 899.3333333 a15 b49 c5 d28 e59 169.7 53 1003.333333 a12 b36 c21 d43 e19 60 40 1107.333333 a16 b49 c8 d48 e69 190.8 26 1211.333333 a16 b50 c102 d13 e79 211.9 41 1315.333333 a11 b37 c6 d53 e21 926 62 1419.333333 a12 b47 c8 d58 e89 233 5 1523.333333 a13 b33 c51 d38 e21 249.3573394 42 1627.333333 a13 b48 c5 d63 e1 263.2949541 53 1731.333333 a13 b46 c100 d28 e11 277.2325688 43 1835.333333 a7 b34 c21 d13 e1 291.1701835 46 1939.333333 a13 b35 c6 d18 e79 305.1077982 44 2043.333333 a11 b45 c21 d3 e11 0.9 89 340 a13 b34 c5 d23 e49 269 65 12 a7 b32 c5 d3 e29 443 34 30 a14 b34 c4 d8 e11 22 90 122 a13 b35 c4 d8 e11 545 77 122 a13 b32 c451 d3 e11 992 9 30 a11 b48 c21 d8 e11 601 32 345 a11 b45 c6 d53 e21 0.9 48 330 a14 b32 c451 d3 e1 415 34 30 a16 b45 c4 d8 e11 22 31 
122 a12 b46 c21 d53 e1 43.1 35 67.33333333 a11 b33 c8 d13 e9 676 54 171.3333333 a16 b49 c5 d18 e9 64.2 36 275.3333333 a13 b47 c21 d33 e19 85.3 99 379.3333333 a6 b34 c7 d23 e29 969 37 483.3333333 a15 b49 c6 d28 e29 106.4 14 587.3333333 a14 b48 c6 d18 e39 127.5 38 691.3333333 a15 b35 c100 d33 e49 676 61 795.3333333 a16 b34 c7 d38 e49 148.6 39 899.3333333 a15 b49 c8 d33 e59 169.7 27 1003.333333 a11 b36 c5 d43 e49 849 40 1107.333333 a7 b35 c8 d48 e69 190.8 59 1211.333333 a16 b50 c100 d38 e79 211.9 41 1315.333333 a9 b37 c4 d53 e39 157 90 1419.333333 a12 b47 c100 d58 e89 233 46 1523.333333 a11 b33 c51 d8 e21 249.3573394 42 1627.333333 a11 b45 c5 d63 e39 263.2949541 42 1731.333333 a13 b46 c6 d48 e11 277.2325688 43 1835.333333 a7 b34 c5 d13 e1 291.1701835 10 1939.333333 a16 b33 c6 d18 e49 305.1077982 44 2043.333333 a11 b45 c21 d38 e29 0.9 89 340 a11 b46 c102 d23 e29 775 15 12 a6 b32 c6 d3 e49 443 34 30 a9 b33 c4 d8 e11 22 85 122 a12 b47 c4 d8 e11 389 96 122 a9 b32 c5 d3 e29 537 93 30 a11 b46 c5 d8 e11 566 32 345 a11 b45 c4 d3 e21 0.9 96 330 a11 b32 c5 d3 e21 314 34 30 a15 b32 c4 d8 e11 22 59 122 a12 b46 c5 d28 e1 43.1 35 67.33333333 a7 b33 c102 d13 e11 519 55 171.3333333 a9 b48 c5 d18 e9 64.2 36 275.3333333 a13 b47 c21 d38 e19 85.3 51 379.3333333 a7 b34 c451 d23 e69 279 37 483.3333333 a16 b46 c6 d28 e29 106.4 13 587.3333333 a14 b48 c451 d3 e39 127.5 38 691.3333333 a7 b35 c6 d33 e49 329 52 795.3333333 a8 b48 c7 d38 e49 148.6 39 899.3333333 a15 b49 c5 d28 e59 169.7 49 1003.333333 a14 b36 c102 d43 e9 401 40 1107.333333 a6 b47 c8 d48 e69 190.8 1 1211.333333 a16 b50 c4 d48 e79 211.9 41 1315.333333 a16 b37 c6 d53 e49 66 36 1419.333333 a12 b37 c4 d58 e89 233 81 1523.333333 a15 b33 c51 d28 e21 249.3573394 42 1627.333333 a13 b48 c5 d63 e29 263.2949541 93 1731.333333 a13 b46 c21 d8 e11 277.2325688 43 1835.333333 a7 b34 c21 d13 e1 291.1701835 24 1939.333333 a15 b35 c6 d18 e69 305.1077982 44 2043.333333 a11 b45 c21 d53 e69 0.9 89 340 a11 b45 c451 d13 e21 769 74 12 a8 b32 c8 d3 e79 443 34 30 a9 b33 c4 d8 e11 22 87 122 a16 b35 c4 d8 e11 256 69 122 a7 b32 c100 d3 e29 968 80 30 a11 b45 c21 d8 e11 651 32 345 a11 b45 c451 d28 e21 0.9 81 330 a7 b32 c6 d3 e21 605 34 30 a9 b48 c4 d8 e11 22 28 122 a12 b46 c6 d8 e1 43.1 35 67.33333333 a12 b33 c4 d13 e39 379 67 171.3333333 a6 b35 c5 d18 e9 64.2 36 275.3333333 a13 b47 c101 d48 e19 85.3 25 379.3333333 a6 b34 c4 d23 e1 796 37 483.3333333 a7 b33 c6 d28 e29 106.4 20 587.3333333 a14 b48 c100 d33 e39 127.5 38 691.3333333 a8 b35 c102 d33 e19 924 85 795.3333333 a6 b34 c7 d38 e49 148.6 39 899.3333333 a15 b49 c4 d53 e59 169.7 48 1003.333333 a11 b36 c101 d43 e39 307 40 1107.333333 a14 b46 c8 d48 e69 190.8 75 1211.333333 a16 b50 c102 d33 e79 211.9 41 1315.333333 a11 b37 c6 d53 e11 877 98 1419.333333 a12 b47 c100 d58 e89 233 56 1523.333333 a11 b33 c51 d28 e21 249.3573394 42 1627.333333 a12 b34 c5 d63 e79 263.2949541 90 1731.333333 a13 b46 c21 d53 e11 277.2325688 43 1835.333333 a13 b34 c4 d13 e1 291.1701835 19 1939.333333 a14 b35 c6 d18 e19 305.1077982 44 2043.333333 a11 b45 c21 d53 e19 0.9 89 340 a6 b33 c8 d33 e1 156 27 12 a13 b32 c6 d3 e21 443 34 30 a8 b32 c4 d8 e11 22 12 122 a6 b47 c4 d8 e11 278 88 122 a15 b32 c21 d3 e29 609 43 30 a11 b32 c102 d8 e11 203 32 345 a11 b45 c21 d23 e21 0.9 16 330 a9 b32 c5 d3 e21 630 34 30 a15 b34 c4 d8 e11 22 90 122 a12 b46 c6 d18 e1 43.1 35 67.33333333 a6 b33 c6 d13 e29 868 81 171.3333333 a8 b32 c5 d18 e9 64.2 36 275.3333333 a13 b47 c4 d53 e19 85.3 57 379.3333333 a8 b34 c4 d23 e1 679 37 483.3333333 a7 b37 c6 d28 e29 106.4 84 587.3333333 a14 b48 c5 d38 
e39 127.5 38 691.3333333 a6 b35 c21 d33 e19 890 64 795.3333333 a8 b35 c7 d38 e49 148.6 39 899.3333333 a15 b49 c4 d8 e59 169.7 71 1003.333333 a7 b36 c8 d43 e79 134 40 1107.333333 a16 b49 c8 d48 e69 190.8 96 1211.333333 a16 b50 c102 d53 e79 211.9 41 1315.333333 a8 b37 c101 d53 e9 980 73 1419.333333 a12 b49 c7 d58 e89 233 66 1523.333333 a6 b33 c51 d33 e21 249.3573394 42 1627.333333 a15 b48 c5 d63 e29 263.2949541 35 1731.333333 a13 b46 c5 d23 e11 277.2325688 43 1835.333333 a9 b34 c4 d13 e1 291.1701835 23 1939.333333 a12 b37 c6 d18 e1 305.1077982 44 2043.333333 a11 b45 c21 d38 e11 0.9 89 340 a14 b34 c21 d13 e79 690 74 12 a15 b32 c8 d3 e49 443 34 30 a16 b35 c4 d8 e11 22 62 122 a16 b33 c4 d8 e11 976 94 122 a9 b32 c4 d3 e1 912 41 30 a11 b33 c6 d8 e11 825 32 345 a11 b45 c451 d38 e21 0.9 13 330 a15 b32 c8 d3 e39 342 34 30 a9 b48 c4 d8 e11 22 97 122 a12 b46 c4 d8 e1 43.1 35 67.33333333 a9 b33 c102 d13 e69 948 10 171.3333333 a15 b49 c5 d18 e9 64.2 36 275.3333333 a13 b47 c451 d28 e19 85.3 61 379.3333333 a16 b34 c100 d23 e19 282 37 483.3333333 a15 b46 c6 d28 e29 106.4 23 587.3333333 a14 b48 c101 d53 e39 127.5 38 691.3333333 a12 b35 c451 d33 e79 735 9 795.3333333 a15 b32 c7 d38 e49 148.6 39 899.3333333 a15 b49 c100 d3 e59 169.7 70 1003.333333 a11 b36 c6 d43 e49 156 40 1107.333333 a13 b33 c8 d48 e69 190.8 48 1211.333333 a16 b50 c8 d8 e79 211.9 41 1315.333333 a8 b37 c102 d53 e39 750 37 1419.333333 a12 b37 c21 d58 e89 233 61 1523.333333 a16 b33 c51 d23 e21 249.3573394 42 1627.333333 a12 b49 c5 d63 e49 263.2949541 24 1731.333333 a13 b46 c451 d53 e11 277.2325688 43 1835.333333 a12 b34 c7 d13 e1 291.1701835 64 1939.333333 a7 b49 c6 d18 e1 305.1077982 44 2043.333333 a11 b45 c21 d18 e11 0.9 89 340 a14 b48 c451 d18 e19 603 26 12 a15 b32 c100 d3 e79 443 34 30 a15 b47 c4 d8 e11 22 78 122 a9 b37 c4 d8 e11 418 15 122 a8 b32 c100 d3 e9 908 4 30 a11 b37 c451 d8 e11 163 32 345 a11 b45 c7 d28 e21 0.9 25 330 a6 b32 c4 d3 e9 799 34 30 a9 b48 c4 d8 e11 22 55 122 a12 b46 c102 d48 e1 43.1 35 67.33333333 a16 b33 c4 d13 e69 444 72 171.3333333 a14 b46 c5 d18 e9 64.2 36 275.3333333 a13 b47 c6 d33 e19 85.3 18 379.3333333 a6 b34 c101 d23 e49 702 37 483.3333333 a6 b45 c6 d28 e29 106.4 10 587.3333333 a14 b48 c8 d13 e39 127.5 38 691.3333333 a7 b35 c8 d33 e19 271 24 795.3333333 a14 b49 c7 d38 e49 148.6 39 899.3333333 a15 b49 c7 d3 e59 169.7 56 1003.333333 a8 b36 c21 d43 e21 993 40 1107.333333 a13 b34 c8 d48 e69 190.8 8 1211.333333 a16 b50 c100 d23 e79 211.9 41 1315.333333 a12 b37 c21 d53 e9 677 86 1419.333333 a12 b33 c100 d58 e89 233 69 1523.333333 a7 b33 c51 d38 e21 249.3573394 42 1627.333333 a15 b48 c5 d63 e11 263.2949541 18 1731.333333 a13 b46 c8 d23 e11 277.2325688 43 1835.333333 a16 b34 c21 d13 e1 291.1701835 31 1939.333333 a15 b32 c6 d18 e49 305.1077982 44 2043.333333 a11 b45 c21 d13 e39 0.9 89 340 a9 b35 c7 d13 e49 219 58 12 a12 b32 c451 d3 e79 443 34 30 a11 b47 c4 d8 e11 22 30 122 a15 b35 c4 d8 e11 273 7 122 a15 b32 c451 d3 e69 666 18 30 a11 b48 c451 d8 e11 858 32 345 a11 b45 c8 d3 e21 0.9 25 330 a13 b32 c5 d3 e49 7 34 30 a9 b45 c4 d8 e11 22 24 122 a12 b46 c102 d23 e1 43.1 35 67.33333333 a12 b33 c100 d13 e11 733 74 171.3333333 a6 b34 c5 d18 e9 64.2 36 275.3333333 a13 b47 c101 d13 e19 85.3 98 379.3333333 a15 b34 c102 d23 e19 440 37 483.3333333 a7 b46 c6 d28 e29 106.4 86 587.3333333 a14 b48 c102 d3 e39 127.5 38 691.3333333 a7 b35 c4 d33 e29 677 56 795.3333333 a9 b49 c7 d38 e49 148.6 39 899.3333333 a15 b49 c5 d3 e59 169.7 96 1003.333333 a14 b36 c7 d43 e69 636 40 1107.333333 a8 b47 c8 d48 e69 190.8 48 1211.333333 a16 b50 c451 d38 
e79 211.9 41 1315.333333 a14 b37 c101 d53 e11 289 74 1419.333333 a12 b32 c101 d58 e89 233 56 1523.333333 a9 b33 c51 d3 e21 249.3573394 42 1627.333333 a9 b34 c5 d63 e79 263.2949541 14 1731.333333 a13 b46 c100 d33 e11 277.2325688 43 1835.333333 a6 b34 c451 d13 e1 291.1701835 70 1939.333333 a16 b48 c6 d18 e21 305.1077982 44 2043.333333 a11 b45 c21 d33 e21 0.9 89 340 a13 b46 c4 d8 e21 175 21 12 a11 b32 c4 d3 e49 443 34 30 a13 b37 c4 d8 e11 22 91 122 a8 b35 c4 d8 e11 838 26 122 a9 b32 c8 d3 e79 469 48 30 a11 b33 c7 d8 e11 568 32 345 a11 b45 c6 d18 e21 0.9 34 330 a12 b32 c6 d3 e1 765 34 30 a7 b37 c4 d8 e11 22 18 122 a12 b46 c102 d23 e1 43.1 35 67.33333333 a6 b33 c101 d13 e69 720 4 171.3333333 a15 b46 c5 d18 e9 64.2 36 275.3333333 a13 b47 c21 d3 e19 85.3 56 379.3333333 a6 b34 c5 d23 e29 72 37 483.3333333 a16 b32 c6 d28 e29 106.4 95 587.3333333 a14 b48 c102 d48 e39 127.5 38 691.3333333 a6 b35 c21 d33 e39 402 37 795.3333333 a8 b47 c7 d38 e49 148.6 39 899.3333333 a15 b49 c7 d28 e59 169.7 16 1003.333333 a9 b36 c4 d43 e79 990 40 1107.333333 a16 b46 c8 d48 e69 190.8 6 1211.333333 a16 b50 c451 d23 e79 211.9 41 1315.333333 a7 b37 c451 d53 e49 128 30 1419.333333 a12 b35 c101 d58 e89 233 43 1523.333333 a14 b33 c51 d33 e21 249.3573394 42 1627.333333 a11 b32 c5 d63 e19 263.2949541 6 1731.333333 a13 b46 c7 d23 e11 277.2325688 43 1835.333333 a14 b34 c5 d13 e1 291.1701835 36 1939.333333 a8 b49 c6 d18 e69 305.1077982 44 2043.333333 a11 b45 c21 d23 e11 0.9 89 340 a11 b35 c4 d8 e11 241 66 12 a7 b32 c100 d3 e29 443 34 30 a12 b46 c4 d8 e11 22 99 122 a9 b32 c4 d8 e11 943 24 122 a6 b32 c100 d3 e29 594 15 30 a11 b35 c6 d8 e11 669 32 345 a11 b45 c8 d48 e21 0.9 53 330 a7 b32 c6 d3 e11 959 34 30 a14 b32 c4 d8 e11 22 16 122 a12 b46 c7 d18 e1 43.1 35 67.33333333 a11 b33 c102 d13 e79 928 42 171.3333333 a16 b49 c5 d18 e9 64.2 36 275.3333333 a13 b47 c102 d8 e19 85.3 97 379.3333333 a14 b34 c102 d23 e1 496 37 483.3333333 a6 b32 c6 d28 e29 106.4 23 587.3333333 a14 b48 c6 d28 e39 127.5 38 691.3333333 a11 b35 c101 d33 e19 861 43 795.3333333 a9 b49 c7 d38 e49 148.6 39 899.3333333 a15 b49 c6 d48 e59 169.7 27 1003.333333 a11 b36 c451 d43 e1 696 40 1107.333333 a14 b48 c8 d48 e69 190.8 93 1211.333333 a16 b50 c7 d48 e79 211.9 41 1315.333333 a14 b37 c101 d53 e69 977 8 1419.333333 a12 b35 c101 d58 e89 233 9 1523.333333 a14 b33 c51 d3 e21 249.3573394 42 1627.333333 a8 b45 c5 d63 e39 263.2949541 33 1731.333333 a13 b46 c100 d8 e11 277.2325688 43 1835.333333 a8 b34 c451 d13 e1 291.1701835 34 1939.333333 a16 b33 c6 d18 e39 305.1077982 44 2043.333333 a11 b45 c21 d28 e69 0.9 89 340 a16 b35 c7 d8 e39 478 88 12 a13 b32 c6 d3 e39 443 34 30 a12 b33 c4 d8 e11 22 38 122 a8 b46 c4 d8 e11 907 27 122 a16 b32 c8 d3 e29 358 80 30 a11 b47 c7 d8 e11 68 32 345 a11 b45 c21 d33 e21 0.9 24 330 a12 b32 c5 d3 e79 543 34 30 a16 b34 c4 d8 e11 22 69 122 a12 b46 c101 d33 e1 43.1 35 67.33333333 a8 b33 c6 d13 e79 647 28 171.3333333 a12 b33 c5 d18 e9 64.2 36 275.3333333 a13 b47 c102 d48 e19 85.3 30 379.3333333 a9 b34 c451 d23 e29 228 37 483.3333333 a16 b34 c6 d28 e29 106.4 18 587.3333333 a14 b48 c6 d23 e39 127.5 38 691.3333333 a13 b35 c100 d33 e19 570 29 795.3333333 a12 b47 c7 d38 e49 148.6 39 899.3333333 a15 b49 c102 d33 e59 169.7 23 1003.333333 a14 b36 c451 d43 e69 322 40 1107.333333 a16 b49 c8 d48 e69 190.8 41 1211.333333 a16 b50 c100 d3 e79 211.9 41 1315.333333 a6 b37 c102 d53 e19 417 77 1419.333333 a12 b32 c5 d58 e89 233 8 1523.333333 a8 b33 c51 d38 e21 249.3573394 42 1627.333333 a13 b34 c5 d63 e39 263.2949541 23 1731.333333 a13 b46 c7 d48 e11 277.2325688 43 
1835.333333 a16 b34 c102 d13 e1 291.1701835 90 1939.333333 a15 b37 c6 d18 e19 305.1077982 44 2043.333333 a11 b45 c21 d18 e39 0.9 89 340 a7 b46 c8 d18 e11 749 7 12 a8 b32 c101 d3 e79 443 34 30 a14 b45 c4 d8 e11 22 49 122 a9 b48 c4 d8 e11 538 48 122 a8 b32 c451 d3 e9 947 64 30 a11 b32 c5 d8 e11 579 32 345 a11 b45 c451 d18 e21 0.9 9 330 a16 b32 c101 d3 e39 391 34 30 a6 b37 c4 d8 e11 22 86 122 a12 b46 c8 d3 e1 43.1 35 67.33333333 a8 b33 c5 d13 e1 974 47 171.3333333 a15 b48 c5 d18 e9 64.2 36 275.3333333 a13 b47 c21 d28 e19 85.3 68 379.3333333 a9 b34 c101 d23 e39 23 37 483.3333333 a13 b46 c6 d28 e29 106.4 100 587.3333333 a14 b48 c102 d18 e39 127.5 38 691.3333333 a9 b35 c101 d33 e79 420 8 795.3333333 a12 b34 c7 d38 e49 148.6 39 899.3333333 a15 b49 c21 d8 e59 169.7 97 1003.333333 a13 b36 c451 d43 e21 138 40 1107.333333 a12 b33 c8 d48 e69 190.8 25 1211.333333 a16 b50 c6 d13 e79 211.9 41 1315.333333 a14 b37 c101 d53 e79 963 10 1419.333333 a12 b37 c6 d58 e89 233 69 1523.333333 a7 b33 c51 d38 e21 249.3573394 42 1627.333333 a11 b33 c5 d63 e9 263.2949541 56 1731.333333 a13 b46 c100 d8 e11 277.2325688 43 1835.333333 a9 b34 c8 d13 e1 291.1701835 62 1939.333333 a12 b35 c6 d18 e29 305.1077982 44 2043.333333 a11 b45 c21 d33 e11 0.9 89 340 a7 b33 c21 d18 e39 358 38 12 a12 b32 c6 d3 e39 443 34 30 a13 b37 c4 d8 e11 22 80 122 a13 b49 c4 d8 e11 268 43 122 a16 b32 c8 d3 e29 22 41 30 a11 b47 c100 d8 e11 325 32 345 a11 b45 c6 d23 e21 0.9 72 330 a8 b32 c100 d3 e11 388 34 30 a14 b35 c4 d8 e11 22 27 122 a12 b46 c8 d33 e1 43.1 35 67.33333333 a14 b33 c101 d13 e21 642 19 171.3333333 a14 b46 c5 d18 e9 64.2 36 275.3333333 a13 b47 c4 d28 e19 85.3 40 379.3333333 a14 b34 c21 d23 e9 284 37 483.3333333 a15 b45 c6 d28 e29 106.4 84 587.3333333 a14 b48 c21 d8 e39 127.5 38 691.3333333 a9 b35 c102 d33 e19 287 21 795.3333333 a9 b48 c7 d38 e49 148.6 39 899.3333333 a15 b49 c6 d23 e59 169.7 17 1003.333333 a11 b36 c4 d43 e29 544 40 1107.333333 a15 b34 c8 d48 e69 190.8 94 1211.333333 a16 b50 c100 d28 e79 211.9 41 1315.333333 a8 b37 c5 d53 e79 423 45 1419.333333 a12 b45 c7 d58 e89 233 47 1523.333333 a16 b33 c51 d38 e21 249.3573394 42 1627.333333 a14 b49 c5 d63 e11 263.2949541 60 1731.333333 a13 b46 c5 d48 e11 277.2325688 43 1835.333333 a11 b34 c8 d13 e1 291.1701835 24 1939.333333 a14 b33 c6 d18 e69 305.1077982 44 2043.333333 a11 b45 c21 d23 e9 0.9 89 340 a15 b49 c21 d38 e21 491 48 12 a14 b32 c100 d3 e1 443 34 30 a6 b34 c4 d8 e11 22 100 122 a6 b35 c4 d8 e11 888 29 122 a15 b32 c101 d3 e9 569 59 30 a11 b37 c4 d8 e11 227 32 345 a11 b45 c21 d3 e21 0.9 19 330 a6 b32 c5 d3 e69 289 34 30 a7 b49 c4 d8 e11 22 11 122 a12 b46 c8 d53 e1 43.1 35 67.33333333 a16 b33 c5 d13 e39 847 94 171.3333333 a16 b49 c5 d18 e9 64.2 36 275.3333333 a13 b47 c4 d18 e19 85.3 98 379.3333333 a14 b34 c8 d23 e79 263 37 483.3333333 a8 b35 c6 d28 e29 106.4 48 587.3333333 a14 b48 c451 d48 e39 127.5 38 691.3333333 a7 b35 c21 d33 e9 280 55 795.3333333 a12 b47 c7 d38 e49 148.6 39 899.3333333 a15 b49 c5 d53 e59 169.7 47 1003.333333 a6 b36 c21 d43 e9 655 40 1107.333333 a8 b48 c8 d48 e69 190.8 60 1211.333333 a16 b50 c21 d18 e79 211.9 41 1315.333333 a7 b37 c451 d53 e69 492 15 1419.333333 a12 b32 c6 d58 e89 233 61 1523.333333 a6 b33 c51 d33 e21 249.3573394 42 1627.333333 a9 b45 c5 d63 e69 263.2949541 80 1731.333333 a13 b46 c100 d53 e11 277.2325688 43 1835.333333 a8 b34 c6 d13 e1 291.1701835 43 1939.333333 a14 b48 c6 d18 e1 305.1077982 44 2043.333333 a11 b45 c21 d48 e39 0.9 89 340 a7 b32 c6 d38 e29 819 23 12 a7 b32 c451 d3 e1 443 34 30 a9 b37 c4 d8 e11 22 63 122 a15 b37 c4 d8 e11 269 26 
122 a15 b32 c451 d3 e9 411 95 30 a11 b33 c7 d8 e11 916 32 345 a11 b45 c101 d23 e21 0.9 86 330 a9 b32 c4 d3 e21 588 34 30 a16 b37 c4 d8 e11 22 29 122 a12 b46 c8 d53 e1 43.1 35 67.33333333 a15 b33 c101 d13 e19 807 80 171.3333333 a7 b48 c5 d18 e9 64.2 36 275.3333333 a13 b47 c6 d28 e19 85.3 51 379.3333333 a9 b34 c4 d23 e21 7 37 483.3333333 a9 b49 c6 d28 e29 106.4 43 587.3333333 a14 b48 c102 d18 e39 127.5 38 691.3333333 a6 b35 c7 d33 e69 191 61 795.3333333 a7 b32 c7 d38 e49 148.6 39 899.3333333 a15 b49 c8 d3 e59 169.7 57 1003.333333 a14 b36 c102 d43 e1 588 40 1107.333333 a9 b45 c8 d48 e69 190.8 21 1211.333333 a16 b50 c8 d8 e79 211.9 41 1315.333333 a15 b37 c5 d53 e49 136 58 1419.333333 a12 b49 c7 d58 e89 233 88 1523.333333 a11 b33 c51 d53 e21 249.3573394 42 1627.333333 a15 b33 c5 d63 e1 263.2949541 96 1731.333333 a13 b46 c100 d8 e11 277.2325688 43 1835.333333 a16 b34 c101 d13 e1 291.1701835 17 1939.333333 a16 b47 c6 d18 e69 305.1077982 44 2043.333333

Sandy4321 commented 2 years ago

The data has clear patterns, so it should be easy to build a regressor that finds these rules. image

jackgerrits commented 2 years ago

Not every loss function works best in every scenario. There is a wiki page which goes into the different choices: https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Loss-functions

Additionally, the two configurations you are testing have different learning rates.

Sandy4321 commented 2 years ago

Great, thanks for the quick answer. Regarding "the two configurations you are testing have different learning rates": even with the same learning rate they are very different, and quantile is significantly underperforming.

Sandy4321 commented 2 years ago

Is quantile regression the same as loss_function = 'squared' when tau = 0.5?

jackgerrits commented 2 years ago

The loss functions calculate loss differently. The formulae can be seen in the wiki page: https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Loss-functions
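For concreteness, here is a minimal sketch (my own illustration, not VW's internal code) of the two formulas being compared. At tau = 0.5 the quantile ("pinball") loss is just half the absolute error, so its gradient has constant magnitude rather than growing with the error, which is one reason the two losses can behave very differently during online updates.

def squared_loss(y, y_hat):
    return (y - y_hat) ** 2

def quantile_loss(y, y_hat, tau=0.5):
    # "Pinball" loss: under-predictions are weighted by tau,
    # over-predictions by (1 - tau).
    diff = y - y_hat
    return tau * diff if diff >= 0 else (tau - 1) * diff

print(squared_loss(10.0, 7.0))        # 9.0
print(quantile_loss(10.0, 7.0, 0.5))  # 1.5 == 0.5 * |10 - 7|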

Sandy4321 commented 2 years ago

Could you share links with more details? The equation there is not clear to me.

jackgerrits commented 2 years ago
Sandy4321 commented 2 years ago

https://stats.stackexchange.com/questions/39002/when-is-quantile-regression-worse-than-ols

Then quantile regression should not be significantly worse than least squares regression?

Sandy4321 commented 2 years ago

For example, from the link above: "If we use squared loss as a measure of success, quantile regression will be worse than OLS. On the other hand, if we use absolute value loss, quantile regression will be better."

image

Then for our case, if the calculations are done correctly, we should get nearly the same result?
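As a toy illustration (my own numbers, not the data above) of why the two losses need not agree: for a constant predictor, squared loss is minimized by the mean and absolute loss (quantile loss with tau = 0.5) by the median, and on skewed data these can be far apart, even when both fits are scored with MAE afterwards.

import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # skewed toy targets

mean_pred = y.mean()        # optimum of squared loss (22.0)
median_pred = np.median(y)  # optimum of absolute / tau=0.5 quantile loss (3.0)

print("MAE of mean prediction:  ", np.abs(y - mean_pred).mean())    # 31.2
print("MAE of median prediction:", np.abs(y - median_pred).mean())  # 20.2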

Sandy4321 commented 2 years ago

L_abs is even better for practical use than L_sq, since it is more robust.

Sandy4321 commented 2 years ago

https://medium.com/human-in-a-machine-world/mae-and-rmse-which-metric-is-better-e60ac3bde13d

Conclusion: "RMSE has the benefit of penalizing large errors more so can be more appropriate in some cases, for example, if being off by 10 is more than twice as bad as being off by 5. But if being off by 10 is just twice as bad as being off by 5, then MAE is more appropriate. From an interpretation standpoint, MAE is clearly the winner. RMSE does not describe average error alone and has other implications that are more difficult to tease out and understand."
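A tiny numeric illustration of that quoted conclusion (my own numbers): two error profiles with the same MAE but different RMSE, because RMSE weights the single large error more heavily.

import numpy as np

errors_a = np.array([5.0, 5.0, 5.0, 5.0])   # four moderate errors
errors_b = np.array([0.0, 0.0, 0.0, 20.0])  # one large error, same total

print("MAE: ", np.abs(errors_a).mean(), np.abs(errors_b).mean())              # 5.0  5.0
print("RMSE:", np.sqrt((errors_a**2).mean()), np.sqrt((errors_b**2).mean()))  # 5.0  10.0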

Sandy4321 commented 2 years ago

Squared loss is nearly the same as absolute loss: https://stats.stackexchange.com/questions/470626/why-is-using-squared-error-the-standard-when-absolute-error-is-more-relevant-to

They wrote: "I think the reason is more sociological than statistical."

jackgerrits commented 2 years ago

@Sandy4321 Can you please provide an end-to-end reproducible example so that we can take a look and see if the results make sense? Please include in the code how you're calculating loss and confidence intervals.

marco-rossi29 commented 2 years ago

Hi Sandy, can you expand on the rationale behind your statement when considering L_sq and L_abs:

"then for our case, if the calculations are done correctly, we should get nearly the same result?"

I'm curious to understand why optimizing two different cost functions should lead to the same optima.

Sandy4321 commented 2 years ago

@marco-rossi29 Regarding "I'm curious to understand why optimizing two different cost functions should lead to the same optima": please see the links above, which discuss why at length.

For example, in the link https://stats.stackexchange.com/questions/470626/why-is-using-squared-error-the-standard-when-absolute-error-is-more-relevant-to

they wrote: "I think the reason is more sociological than statistical."

Sandy4321 commented 2 years ago

image

Sandy4321 commented 2 years ago

code

!/usr/bin/env python

coding: utf-8

S_VW_usualRegressoin_and_quantile_example_Nov30_for_VW_people.py

import numpy as np import matplotlib.pyplot as plt from vowpalwabbit import pyvw import pandas as pd import pandas as pd

Flag_use_quantile = 0

df_full = pd.read_excel ('synthetic_data_for_VW_Nov30_2021.xlsx') df_full df_full.shape df_full_N = df_full.dropna() df_full_N.columns cat = ['A','B','C','D','E'] cont= ['F', 'G'] df_full_N.shape colY = df_full_N['target'] colX = df_full_N.drop(['target'], axis = 1) from sklearn.metrics import mean_absolute_error from vowpalwabbit.sklearn_vw import VWRegressor from vowpalwabbit.DFtoVW import DFtoVW import pandas as pd from vowpalwabbit.DFtoVW import SimpleLabel, Namespace, Feature

category namespace

category_features = cat continous_features = cont C = Namespace(features=[Feature(col) for col in category_features], name="C") N = Namespace(features=[Feature(col) for col in continous_features], name="F") label_VW_format = SimpleLabel('target') converter_advanced = DFtoVW(df= df_full_N, namespaces=[C, N ], label=label_VW_format) data_VW_Format_with_NameSpaces = converter_advanced.convert_df() model_VWRegressor_from_sklearnVW = VWRegressor(convert_to_vw = False ,normalized = True, passes = 7, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 0.9, loss_function = 'squared' , l2 = 0.0001) model_VWRegressor_from_sklearnVW.fit(data_VW_Format_with_NameSpaces) my_predicted = model_VWRegressor_from_sklearnVW.predict(data_VW_Format_with_NameSpaces) mae = mean_absolute_error(colY,my_predicted) print('mae', mae) plt.figure(figsize=(15, 15)) plt.plot(colY, my_predicted, '.c', label='predicted') plt.plot(colY, colY, label='actual ') plt.legend() plt.title("no optimisation vowpal wabbit model with interactions VW data"+ "\n mae " + str(mae)) plt.show(block = False)

defined_l1= [0.001, 0.0001, 0.00001] defined_l2 = [ 0.001, 0.0001, 0.00001] defined_passes = [1, 7, 30, 60, 120, 240, 480, 1000]

Hyperparamters Tunning

mae_old = 9876543 #mae for i in range (len(defined_l1)): l1 = defined_l1[i] for j in range (len(defined_l2)): l2 = defined_l2[j] for k in range(len(defined_passes)): passes = defined_passes[k] if Flag_use_quantile: model = VWRegressor(convert_to_vw = False ,normalized = True, passes = passes, power_t = 0.5, #1.0, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 2.1 , l2 = l2, l1=l1, quadratic= 'CC' , cubic = 'CCC', loss_function = 'quantile' , quantile_tau = 0.5) q=0 else: model = VWRegressor(convert_to_vw = False ,normalized = True, passes = passes, power_t = 0.5, #1.0, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 2.1, loss_function = 'squared' , l2 = l2, l1=l1, quadratic= 'CC' , cubic = 'CCC' ) q=0 model.fit(data_VW_Format_with_NameSpaces) y_predict = model.predict(data_VW_Format_with_NameSpaces) mae_new = mean_absolute_error(colY,y_predict) if mae_new < mae_old:

print (l1,l2,passes,mae_new)

            mae_old = mae_new
            l1_final = l1
            l2_final = l2
            passes_final = passes
            y_predict_final = y_predict
            mae_final = mae_old
            print('\n \n FOUND BETTER Parameters :  passes= ', passes, ' l1 = ', l1, ' l2=' , l2 , '  mae_final=', int(mae_final) )
            q=0
        else:
            print('\n passes= ', passes, ' ;  l1= ', l1, ' ; l2=' , l2, ' ; mae_new = ' , int(mae_new) , '   mae_final=' , int(mae_final) )
            q=0

        q=0

In[35]:

plt.figure(figsize=(15, 15)) plt.plot(colY, y_predict_final, '.c', label='predicted') plt.plot(colY, colY, label='actual ') plt.legend() if Flag_use_quantile: plt.title("loss_function = 'quantile' : vowpal wabbit using hyperpramters optimisation"+ "\n mae " + str(mae_final) + '; optimal l1=' + str(l1_final) + '; optimal l2 ' + str(l2_final) + ' ; optimal # passes ' + str(passes_final) ) else: plt.title("loss_function = 'squared' : vowpal wabbit using hyperpramters optimisation "+ "\n mae " + str(mae_final) + '; optimal l1=' + str(l1_final) + '; optimal l2 ' + str(l2_final) + ' ; optimal # passes ' + str(passes_final) ) plt.show(block = False) q=0

Sandy4321 commented 2 years ago

@marco-rossi29

For a deeper understanding of functional distances you can read Robust Statistics: The Approach Based on Influence Functions by Hampel, Ronchetti, Rousseeuw, et al.: https://www.amazon.ca/Robust-Statistics-Approach-Influence-Functions/dp/0471735779

Sandy4321 commented 2 years ago

image

Sandy4321 commented 2 years ago

code

!/usr/bin/env python

coding: utf-8

S_VW_usualRegressoin_and_quantile_example_Nov30_for_VW_people.py

import numpy as np import matplotlib.pyplot as plt import pandas as pd

Flag_use_quantile = 1

df_full = pd.read_excel ('synthetic_data_for_VW_Nov30_2021.xlsx') df_full df_full.shape df_full_N = df_full.dropna() df_full_N.columns cat = ['A','B','C','D','E'] cont= ['F', 'G'] df_full_N.shape colY = df_full_N['target'] colX = df_full_N.drop(['target'], axis = 1) from sklearn.metrics import mean_absolute_error from vowpalwabbit.sklearn_vw import VWRegressor from vowpalwabbit.DFtoVW import DFtoVW import pandas as pd from vowpalwabbit.DFtoVW import SimpleLabel, Namespace, Feature

category namespace

category_features = cat continous_features = cont C = Namespace(features=[Feature(col) for col in category_features], name="C") N = Namespace(features=[Feature(col) for col in continous_features], name="F") label_VW_format = SimpleLabel('target') converter_advanced = DFtoVW(df= df_full_N, namespaces=[C, N ], label=label_VW_format) data_VW_Format_with_NameSpaces = converter_advanced.convert_df() model_VWRegressor_from_sklearnVW = VWRegressor(convert_to_vw = False ,normalized = True, passes = 7, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 0.9, loss_function = 'squared' , l2 = 0.0001) model_VWRegressor_from_sklearnVW.fit(data_VW_Format_with_NameSpaces) my_predicted = model_VWRegressor_from_sklearnVW.predict(data_VW_Format_with_NameSpaces) mae = mean_absolute_error(colY,my_predicted) print('mae', mae) plt.figure(figsize=(15, 15)) plt.plot(colY, my_predicted, '.c', label='predicted') plt.plot(colY, colY, label='actual ') plt.legend() plt.title("no optimisation vowpal wabbit model with interactions VW data"+ "\n mae " + str(mae)) plt.show(block = False)

defined_l1= [0.001, 0.0001, 0.00001] defined_l2 = [ 0.001, 0.0001, 0.00001] defined_passes = [1, 7, 30, 60, 120, 240, 480, 1000]

Hyperparamters Tunning

mae_old = 9876543 #mae for i in range (len(defined_l1)): l1 = defined_l1[i] for j in range (len(defined_l2)): l2 = defined_l2[j] for k in range(len(defined_passes)): passes = defined_passes[k] if Flag_use_quantile: model = VWRegressor(convert_to_vw = False ,normalized = True, passes = passes, power_t = 0.5, #1.0, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 2.1 , l2 = l2, l1=l1, quadratic= 'CC' , cubic = 'CCC', loss_function = 'quantile' , quantile_tau = 0.5) q=0 else: model = VWRegressor(convert_to_vw = False ,normalized = True, passes = passes, power_t = 0.5, #1.0, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 2.1, loss_function = 'squared' , l2 = l2, l1=l1, quadratic= 'CC' , cubic = 'CCC' ) q=0 model.fit(data_VW_Format_with_NameSpaces) y_predict = model.predict(data_VW_Format_with_NameSpaces) mae_new = mean_absolute_error(colY,y_predict) if mae_new < mae_old:

print (l1,l2,passes,mae_new)

            mae_old = mae_new
            l1_final = l1
            l2_final = l2
            passes_final = passes
            y_predict_final = y_predict
            mae_final = mae_old
            print('\n \n FOUND BETTER Parameters :  passes= ', passes, ' l1 = ', l1, ' l2=' , l2 , '  mae_final=', int(mae_final) )
            q=0
        else:
            print('\n passes= ', passes, ' ;  l1= ', l1, ' ; l2=' , l2, ' ; mae_new = ' , int(mae_new) , '   mae_final=' , int(mae_final) )
            q=0

        q=0

plt.figure(figsize=(15, 15)) plt.plot(colY, y_predict_final, '.c', label='predicted') plt.plot(colY, colY, label='actual ') plt.legend() if Flag_use_quantile: plt.title("loss_function = 'quantile' : vowpal wabbit using hyperpramters optimisation"+ "\n mae " + str(mae_final) + '; optimal l1=' + str(l1_final) + '; optimal l2 ' + str(l2_final) + ' ; optimal # passes ' + str(passes_final) ) else: plt.title("loss_function = 'squared' : vowpal wabbit using hyperpramters optimisation "+ "\n mae " + str(mae_final) + '; optimal l1=' + str(l1_final) + '; optimal l2 ' + str(l2_final) + ' ; optimal # passes ' + str(passes_final) ) plt.show(block = False) q=0

Sandy4321 commented 2 years ago

@jackgerrits "Can you please provide an end to end reproducible example so that we can take a look and see if results make sense? Include in the code how you're calculating loss and confidence intervals?"

Here are the code and plots.

I copied the data above earlier in our correspondence; if for your convenience you need the actual data file, give me your email and I will send it to you.

Thank you very much for being willing to help. I really need quantile regression to work ...

marco-rossi29 commented 2 years ago

@marco-rossi29 Regarding "I'm curious to understand why optimizing two different cost functions should lead to the same optima": please see the links above, which discuss why at length.

For example, in the link https://stats.stackexchange.com/questions/470626/why-is-using-squared-error-the-standard-when-absolute-error-is-more-relevant-to

they wrote: "I think the reason is more sociological than statistical."

The link is simply stating that "the L2 cost function is the standard cost function", which may or may not be true (I can think of many reasons why L2 is widely used), but it's a tangent for our discussion. L1 and L2 have, in general, different optima, and expecting them to have the same optima is just a hope. The KKT conditions of the problem are in general different, and therefore the optima are in general different. Sure, there can be cases where the optima coincide, but that's the exception.

Just think about using the L2 norm or L1 for regularization; it's a very well known fact that L1 induces sparsity while L2 doesn't.
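A small sketch of that last point (synthetic data and sklearn estimators chosen purely for illustration, not part of the issue's setup): with only the first two of ten features informative, an L1-regularized fit drives the irrelevant coefficients to exactly zero, while an L2-regularized fit merely shrinks them.

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=0.1).fit(X, y)

print("Lasso coefficients exactly zero:", int(np.sum(np.abs(lasso.coef_) < 1e-8)))  # typically 8
print("Ridge coefficients exactly zero:", int(np.sum(np.abs(ridge.coef_) < 1e-8)))  # typically 0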

marco-rossi29 commented 2 years ago

@marco-rossi29

for deeper understanding about functional distances you can read https://www.amazon.ca/Robust-Statistics-Approach-Influence-Functions/dp/0471735779 Robust Statistics: The Approach Based on Influence Functions Paperback – April 6 2005 by Peter J. Rousseeuw (Author), Frank R. Hampel (Author), Elvezio M. Ronchetti (Author), & 1 more

Thank you so much for this.

Sandy4321 commented 2 years ago

@marco-rossi29

This link has another link: https://stats.stackexchange.com/questions/147001/is-minimizing-squared-error-equivalent-to-minimizing-absolute-error-why-squared

Read, for example, the text starting from "When minimizing an error, we must decide how to penalize these errors."

ataymano commented 2 years ago

Thank you for the code snippet. Can you please format it as code next time? Otherwise the formatting gets ruined.

It looks like there are two factors here: 1) you are always using non-trivial l2 regularization, which looks like it is suppressing your l1 learning completely - can you please include 0 in your regularization parameter grid? 2) without regularization it is possible to learn something reasonable with quantile loss (although worse than with the squared one), but it is significantly slower and requires many more passes and a much higher learning rate (not sure if there is a bug here - need to look deeper). Here is a simplified example to reproduce:

import matplotlib.pyplot as plt
from vowpalwabbit.sklearn_vw import VWRegressor

# Synthetic data: the target equals the single feature value.
n = 1000
y = list(range(n))
data = [f'{i} | f:{i}' for i in y]

learning_rate = 1
passes = 4
loss = 'quantile'

regressor = VWRegressor(
      convert_to_vw = False,
      normalized = True,
      passes = passes,
      learning_rate = learning_rate,
      loss_function = loss,
      quantile_tau = 0.5)

regressor.fit(data)
predictions = regressor.predict(data)

plt.figure(figsize=(6, 6))
plt.title(f'loss={loss}, learning_rate={learning_rate}, passes={passes}')
plt.plot(y, predictions, '.c', label='predicted')
plt.plot(y, y, label='actual ')
plt.legend()

Which gives these kinds of plots (five plot images attached).

ataymano commented 2 years ago

Also, it looks like --coin helps here to avoid learning_rate tuning: VWRegressor(normalized=True, learning_rate=...) -> VWRegressor(coin=True, ...)
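Applied to the toy example above, that substitution might look like the following sketch (same data as in the earlier snippet; the SGD-specific normalized and learning_rate arguments are dropped in favor of coin):

from vowpalwabbit.sklearn_vw import VWRegressor

n = 1000
y = list(range(n))
data = [f'{i} | f:{i}' for i in y]

regressor = VWRegressor(
    convert_to_vw=False,
    coin=True,            # parameter-free (coin betting) updates instead of tuned SGD
    passes=4,
    loss_function='quantile',
    quantile_tau=0.5)

regressor.fit(data)
predictions = regressor.predict(data)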

Sandy4321 commented 2 years ago

@ataymano Thanks for being willing to help, but the main issue here is that my data example is not trivial (not as simple as your data). The main difficulty is to fix the bug in VW quantile, or to show that there is no bug and the mistake is in my code.

Thanks, Sander

PS: do you mean this for code formatting? `

!/usr/bin/env python

coding: utf-8

S_VW_usualRegressoin_and_quantile_example_Nov30_for_VW_people.py

import numpy as np import matplotlib.pyplot as plt import pandas as pd

Flag_use_quantile = 1

df_full = pd.read_excel ('synthetic_data_for_VW_Nov30_2021.xlsx') df_full df_full.shape df_full_N = df_full.dropna() df_full_N.columns cat = ['A','B','C','D','E'] cont= ['F', 'G'] df_full_N.shape colY = df_full_N['target'] colX = df_full_N.drop(['target'], axis = 1) from sklearn.metrics import mean_absolute_error from vowpalwabbit.sklearn_vw import VWRegressor from vowpalwabbit.DFtoVW import DFtoVW import pandas as pd from vowpalwabbit.DFtoVW import SimpleLabel, Namespace, Feature

creat categorical data namespace

category_features = cat continous_features = cont C = Namespace(features=[Feature(col) for col in category_features], name="C") N = Namespace(features=[Feature(col) for col in continous_features], name="F") label_VW_format = SimpleLabel('target') converter_advanced = DFtoVW(df= df_full_N, namespaces=[C, N ], label=label_VW_format) data_VW_Format_with_NameSpaces = converter_advanced.convert_df() model_VWRegressor_from_sklearnVW = VWRegressor(convert_to_vw = False ,normalized = True, passes = 7, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 0.9, loss_function = 'squared' , l2 = 0.0001) model_VWRegressor_from_sklearnVW.fit(data_VW_Format_with_NameSpaces) my_predicted = model_VWRegressor_from_sklearnVW.predict(data_VW_Format_with_NameSpaces) mae = mean_absolute_error(colY,my_predicted) print('mae', mae) plt.figure(figsize=(15, 15)) plt.plot(colY, my_predicted, '.c', label='predicted') plt.plot(colY, colY, label='actual ') plt.legend() plt.title("no optimisation vowpal wabbit model with interactions VW data"+ "\n mae " + str(mae)) plt.show(block = False)

defined_l1= [0.001, 0.0001, 0.00001] defined_l2 = [ 0.001, 0.0001, 0.00001] defined_passes = [1, 7, 30, 60, 120, 240, 480, 1000]

Hyperparamters Tunning

mae_old = 9876543 #mae for i in range (len(defined_l1)): l1 = defined_l1[i] for j in range (len(defined_l2)): l2 = defined_l2[j] for k in range(len(defined_passes)): passes = defined_passes[k] if Flag_use_quantile: model = VWRegressor(convert_to_vw = False ,normalized = True, passes = passes, power_t = 0.5, #1.0, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 2.1 , l2 = l2, l1=l1, quadratic= 'CC' , cubic = 'CCC', loss_function = 'quantile' , quantile_tau = 0.5) q=0 else: model = VWRegressor(convert_to_vw = False ,normalized = True, passes = passes, power_t = 0.5, #1.0, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 2.1, loss_function = 'squared' , l2 = l2, l1=l1, quadratic= 'CC' , cubic = 'CCC' ) q=0 model.fit(data_VW_Format_with_NameSpaces) y_predict = model.predict(data_VW_Format_with_NameSpaces) mae_new = mean_absolute_error(colY,y_predict) if mae_new < mae_old:

print (l1,l2,passes,mae_new)

            mae_old = mae_new
            l1_final = l1
            l2_final = l2
            passes_final = passes
            y_predict_final = y_predict
            mae_final = mae_old
            print('\n \n FOUND BETTER Parameters :  passes= ', passes, ' l1 = ', l1, ' l2=' , l2 , '  mae_final=', int(mae_final) )
            q=0
        else:
            print('\n passes= ', passes, ' ;  l1= ', l1, ' ; l2=' , l2, ' ; mae_new = ' , int(mae_new) , '   mae_final=' , int(mae_final) )
            q=0

        q=0

plt.figure(figsize=(15, 15)) plt.plot(colY, y_predict_final, '.c', label='predicted') plt.plot(colY, colY, label='actual ') plt.legend() if Flag_use_quantile: plt.title("loss_function = 'quantile' : vowpal wabbit using hyperpramters optimisation"+ "\n mae " + str(mae_final) + '; optimal l1=' + str(l1_final) + '; optimal l2 ' + str(l2_final) + ' ; optimal # passes ' + str(passes_final) ) else: plt.title("loss_function = 'squared' : vowpal wabbit using hyperpramters optimisation "+ "\n mae " + str(mae_final) + '; optimal l1=' + str(l1_final) + '; optimal l2 ' + str(l2_final) + ' ; optimal # passes ' + str(passes_final) ) plt.show(block = False) q=0

`

ataymano commented 2 years ago

Can you please use triple backticks for code formatting: https://docs.github.com/en/github/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-code

Sandy4321 commented 2 years ago

But it does not work, as you can see above: image

Sandy4321 commented 2 years ago

'''

!/usr/bin/env python

coding: utf-8

S_VW_usualRegressoin_and_quantile_example_Nov30_for_VW_people.py

import numpy as np import matplotlib.pyplot as plt import pandas as pd

Flag_use_quantile = 1

df_full = pd.read_excel ('synthetic_data_for_VW_Nov30_2021.xlsx') df_full df_full.shape df_full_N = df_full.dropna() df_full_N.columns cat = ['A','B','C','D','E'] cont= ['F', 'G'] df_full_N.shape colY = df_full_N['target'] colX = df_full_N.drop(['target'], axis = 1) from sklearn.metrics import mean_absolute_error from vowpalwabbit.sklearn_vw import VWRegressor from vowpalwabbit.DFtoVW import DFtoVW import pandas as pd from vowpalwabbit.DFtoVW import SimpleLabel, Namespace, Feature

creat categorical data namespace

category_features = cat continous_features = cont C = Namespace(features=[Feature(col) for col in category_features], name="C") N = Namespace(features=[Feature(col) for col in continous_features], name="F") label_VW_format = SimpleLabel('target') converter_advanced = DFtoVW(df= df_full_N, namespaces=[C, N ], label=label_VW_format) data_VW_Format_with_NameSpaces = converter_advanced.convert_df() model_VWRegressor_from_sklearnVW = VWRegressor(convert_to_vw = False ,normalized = True, passes = 7, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 0.9, loss_function = 'squared' , l2 = 0.0001) model_VWRegressor_from_sklearnVW.fit(data_VW_Format_with_NameSpaces) my_predicted = model_VWRegressor_from_sklearnVW.predict(data_VW_Format_with_NameSpaces) mae = mean_absolute_error(colY,my_predicted) print('mae', mae) plt.figure(figsize=(15, 15)) plt.plot(colY, my_predicted, '.c', label='predicted') plt.plot(colY, colY, label='actual ') plt.legend() plt.title("no optimisation vowpal wabbit model with interactions VW data"+ "\n mae " + str(mae)) plt.show(block = False)

defined_l1= [0.001, 0.0001, 0.00001] defined_l2 = [ 0.001, 0.0001, 0.00001] defined_passes = [1, 7, 30, 60, 120, 240, 480, 1000]

Hyperparamters Tunning

mae_old = 9876543 #mae for i in range (len(defined_l1)): l1 = defined_l1[i] for j in range (len(defined_l2)): l2 = defined_l2[j] for k in range(len(defined_passes)): passes = defined_passes[k] if Flag_use_quantile: model = VWRegressor(convert_to_vw = False ,normalized = True, passes = passes, power_t = 0.5, #1.0, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 2.1 , l2 = l2, l1=l1, quadratic= 'CC' , cubic = 'CCC', loss_function = 'quantile' , quantile_tau = 0.5) q=0 else: model = VWRegressor(convert_to_vw = False ,normalized = True, passes = passes, power_t = 0.5, #1.0, readable_model = 'my_VW.model' , cache_file = 'my_VW.cache' , learning_rate = 2.1, loss_function = 'squared' , l2 = l2, l1=l1, quadratic= 'CC' , cubic = 'CCC' ) q=0 model.fit(data_VW_Format_with_NameSpaces) y_predict = model.predict(data_VW_Format_with_NameSpaces) mae_new = mean_absolute_error(colY,y_predict) if mae_new < mae_old:

print (l1,l2,passes,mae_new)

            mae_old = mae_new
            l1_final = l1
            l2_final = l2
            passes_final = passes
            y_predict_final = y_predict
            mae_final = mae_old
            print('\n \n FOUND BETTER Parameters :  passes= ', passes, ' l1 = ', l1, ' l2=' , l2 , '  mae_final=', int(mae_final) )
            q=0
        else:
            print('\n passes= ', passes, ' ;  l1= ', l1, ' ; l2=' , l2, ' ; mae_new = ' , int(mae_new) , '   mae_final=' , int(mae_final) )
            q=0

        q=0

plt.figure(figsize=(15, 15)) plt.plot(colY, y_predict_final, '.c', label='predicted') plt.plot(colY, colY, label='actual ') plt.legend() if Flag_use_quantile: plt.title("loss_function = 'quantile' : vowpal wabbit using hyperpramters optimisation"+ "\n mae " + str(mae_final) + '; optimal l1=' + str(l1_final) + '; optimal l2 ' + str(l2_final) + ' ; optimal # passes ' + str(passes_final) ) else: plt.title("loss_function = 'squared' : vowpal wabbit using hyperpramters optimisation "+ "\n mae " + str(mae_final) + '; optimal l1=' + str(l1_final) + '; optimal l2 ' + str(l2_final) + ' ; optimal # passes ' + str(passes_final) ) plt.show(block = False) q=0

'''

Sandy4321 commented 2 years ago
#!/usr/bin/env python
# coding: utf-8

#S_VW_usualRegressoin_and_quantile_example_Nov30_for_VW_people.py
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Flag_use_quantile = 1

df_full = pd.read_excel ('synthetic_data_for_VW_Nov30_2021.xlsx')
df_full
df_full.shape
df_full_N = df_full.dropna()
df_full_N.columns
cat = ['A','B','C','D','E']
cont= ['F', 'G']
df_full_N.shape
colY = df_full_N['target']
colX = df_full_N.drop(['target'], axis = 1)
from sklearn.metrics import mean_absolute_error
from vowpalwabbit.sklearn_vw import VWRegressor
from vowpalwabbit.DFtoVW import DFtoVW
import pandas as pd
from vowpalwabbit.DFtoVW import SimpleLabel, Namespace, Feature
#    creat categorical data  namespace
category_features = cat
continous_features = cont
C = Namespace(features=[Feature(col) for col in category_features], name="C")
N = Namespace(features=[Feature(col) for col in continous_features], name="F")
label_VW_format = SimpleLabel('target')
converter_advanced = DFtoVW(df=  df_full_N, namespaces=[C, N ], label=label_VW_format)
data_VW_Format_with_NameSpaces = converter_advanced.convert_df()
model_VWRegressor_from_sklearnVW = VWRegressor(convert_to_vw = False ,normalized = True, passes = 7, readable_model = 'my_VW.model' , cache_file =  'my_VW.cache' , learning_rate = 0.9, loss_function = 'squared' , l2 = 0.0001)
model_VWRegressor_from_sklearnVW.fit(data_VW_Format_with_NameSpaces)
my_predicted = model_VWRegressor_from_sklearnVW.predict(data_VW_Format_with_NameSpaces)
mae = mean_absolute_error(colY,my_predicted)
print('mae', mae)
plt.figure(figsize=(15, 15))
plt.plot(colY, my_predicted, '.c', label='predicted')
plt.plot(colY, colY, label='actual ') 
plt.legend()
plt.title("no optimisation vowpal wabbit model with interactions VW data"+  "\n mae " + str(mae))
plt.show(block = False)

defined_l1= [0.001, 0.0001, 0.00001]
defined_l2 = [  0.001, 0.0001, 0.00001]
defined_passes = [1, 7, 30, 60, 120, 240, 480, 1000]

#  Hyperparamters Tunning

mae_old = 9876543 #mae
for i in range (len(defined_l1)):
    l1 = defined_l1[i]
    for j in range (len(defined_l2)):
        l2 = defined_l2[j]
        for k in range(len(defined_passes)):
            passes = defined_passes[k] 
            if Flag_use_quantile:
                model = VWRegressor(convert_to_vw = False ,normalized = True, 
                                                               passes = passes, 
                                                                power_t = 0.5, #1.0,
                                                               readable_model = 'my_VW.model' , cache_file =  'my_VW.cache' ,
                                                               learning_rate = 2.1 , l2 = l2, l1=l1,
                                                               quadratic= 'CC' , cubic = 'CCC',
                                                                loss_function = 'quantile' , quantile_tau = 0.5)
                q=0
            else:
                model = VWRegressor(convert_to_vw = False ,normalized = True, 
                                                          passes = passes, 
                                                           power_t = 0.5, #1.0,
                                                          readable_model = 'my_VW.model' , cache_file =  'my_VW.cache' ,
                                                          learning_rate = 2.1, loss_function = 'squared' , l2 = l2, l1=l1,
                                                          quadratic= 'CC' , cubic = 'CCC' )
                q=0
            model.fit(data_VW_Format_with_NameSpaces)
            y_predict = model.predict(data_VW_Format_with_NameSpaces)
            mae_new = mean_absolute_error(colY,y_predict)
            if mae_new < mae_old:
                #print (l1,l2,passes,mae_new)
                mae_old = mae_new
                l1_final = l1
                l2_final = l2
                passes_final = passes
                y_predict_final = y_predict
                mae_final = mae_old
                print('\n \n FOUND BETTER Parameters :  passes= ', passes, ' l1 = ', l1, ' l2=' , l2 , '  mae_final=', int(mae_final) )
                q=0
            else:
                print('\n passes= ', passes, ' ;  l1= ', l1, ' ; l2=' , l2, ' ; mae_new = ' , int(mae_new) , '   mae_final=' , int(mae_final) )
                q=0

            q=0

plt.figure(figsize=(15, 15))
plt.plot(colY, y_predict_final, '.c', label='predicted')
plt.plot(colY, colY, label='actual ') 
plt.legend()
if Flag_use_quantile:
    plt.title("loss_function = 'quantile' : vowpal wabbit using  hyperpramters optimisation"+  "\n mae " + str(mae_final) + '; optimal l1=' + str(l1_final) +  ';  optimal l2 '  + str(l2_final) + ' ; optimal # passes   ' + str(passes_final)  )
else:
    plt.title("loss_function = 'squared' : vowpal wabbit using  hyperpramters optimisation "+  "\n mae " + str(mae_final)  + '; optimal l1=' + str(l1_final) +  ';  optimal l2 '  + str(l2_final) + ' ; optimal # passes   ' + str(passes_final)  )
plt.show(block = False)
q=0
Sandy4321 commented 2 years ago

Great, I did it, thanks.

But still, how do I make VW quantile regression work?

ataymano commented 2 years ago

Different loss functions may require different values of other hyperparameters in order to converge.

1. You are sweeping

defined_l1= [0.001, 0.0001, 0.00001]
defined_l2 = [  0.001, 0.0001, 0.00001]

but you always have them > 0, and even the minimal l2 regularization parameter is too high for your data with quantile loss. Please include 0 in both lists.

2. You have your learning rate fixed at 2.1, and it turns out that this value works well for squared loss but not for the quantile one - it looks like something around 1000 gives small errors in your case. There are two ways to address that: a. Try different values of the learning rate (similarly to what you do with regularization). b. Try a parameter-free learner - i.e. coin betting. It requires removing "normalized=True" and "learning_rate=..." from the VWRegressor initialization and replacing them with "coin=True". (A sketch of both suggestions is shown right after this list.)
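A sketch of both adjustments applied to the sweep in the code above (the added values are illustrative, not tuned):

defined_l1 = [0, 0.001, 0.0001, 0.00001]       # include 0 so l1 regularization can be switched off
defined_l2 = [0, 0.001, 0.0001, 0.00001]       # include 0 so l2 regularization can be switched off
defined_learning_rates = [2.1, 10, 100, 1000]  # sweep the learning rate as well (option a)
defined_passes = [1, 7, 30, 60, 120, 240, 480, 1000]
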
Sandy4321 commented 2 years ago

"b. Try a parameter-free learner - i.e. coin betting. It requires removing "normalized=True" and "learning_rate=..." from the VWRegressor initialization and replacing them with "coin=True"."

I cannot find a description on the web of how VW uses coin = True, only some very general information: https://cmsa.fas.harvard.edu/wp-content/uploads/2018/06/slides_Parameter-free-Machine-Learning-through-Coin-Betting.pdf

So could you clarify: should I remove all parameters like l1 and l2?

Then it would be like this:

                model = VWRegressor(convert_to_vw = False, coin = True,
                                    passes = passes,
                                    power_t = 0.5,  # 1.0,
                                    readable_model = 'my_VW.model', cache_file = 'my_VW.cache',
                                    loss_function = 'squared', l2 = l2, l1 = l1,
                                    quadratic = 'CC', cubic = 'CCC')

or
                model = VWRegressor(convert_to_vw = False,
                                    coin = True,
                                    readable_model = 'my_VW.model', cache_file = 'my_VW.cache',
                                    loss_function = 'squared',
                                    quadratic = 'CC', cubic = 'CCC')
Sandy4321 commented 2 years ago

@ataymano Then the question is:

does coin=True tune all parameters like l1, l2, power_t, and passes, or does it only replace "normalized=True" and "learning_rate=..."?

ataymano commented 2 years ago

Coin betting is an alternative to SGD. power_t, learning_rate, and normalized are SGD parameters, so they should be excluded if you are using coin. The others (l1, l2, passes) can be used with any learning algorithm, so you may want to sweep over them.

Sandy4321 commented 2 years ago

I see, thanks for the quick answer.

ataymano commented 2 years ago

I see, thanks for the quick answer.

Sure, you're welcome. Did it help to get learning with quantile loss?

Sandy4321 commented 2 years ago

I am checking now... first I am trying to understand why it starts to give meaningful results only for L1 = 0 and L2 = 0.

Sandy4321 commented 2 years ago

@ataymano Is it possible that the coin optimizer does not depend on L1 and L2 for quantile regression?

I am changing L1 and L2, but the resulting predictions barely change.

VWRegressor(convert_to_vw = False,
            coin = True,
            passes = each_number_passes,
            readable_model = 'my_VW.model', cache_file = 'my_VW.cache',
            l2 = each_l2,
            l1 = each_l1,
            quadratic = 'CC',
            cubic = 'CCC',
            loss_function = 'quantile', quantile_tau = parameter_quantile_tau_low)
ataymano commented 2 years ago

Oh right, it looks like they are implemented as part of SGD as well (so they do not affect --coin).

Sandy4321 commented 2 years ago

Great, thanks for the quick answer. Also, my guess is that there is no data shuffling for each pass in VWRegressor? Or is there, and I just do not know how to enable it? Wouldn't random data shuffling be necessary for each pass?

Sandy4321 commented 2 years ago

@ataymano

https://stackoverflow.com/questions/20941180/does-vowpal-wabbit-shuffle-data-in-multiple-online-passes

My guess is that there is no data shuffling for each pass in VWRegressor? Or is there, and I just do not know how to enable it? Random data shuffling would be necessary for each pass.

But VWRegressor is not online, it is batch processing, so data shuffling for each pass would be needed for each batch.

Sandy4321 commented 2 years ago

@ataymano Also, please see https://stats.stackexchange.com/questions/81546/confusion-with-vowpal-wabbits-multiple-pass-behavior-when-performing-ridge-regr

"I can bypass this obscurity with a work-around, by ignoring vw's --passes option and manually performing "passes" along with IO-intensive shuffling of data after each "pass"."

ataymano commented 2 years ago

The sklearn wrapper does shuffling on every pass: https://github.com/VowpalWabbit/vowpal_wabbit/blob/ceb2018df052df576bf28eb7a8cb0a7cb23ce11c/python/vowpalwabbit/sklearn_vw.py#L558
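As a minimal illustration of what pre-shuffling amounts to (a hypothetical helper, not the wrapper's internal code), one can shuffle the VW-format examples and their labels together before calling fit, so predictions can still be compared against the matching targets:

import random

def shuffle_together(examples, targets, seed=0):
    # examples: list of VW-format strings; targets: matching label list
    paired = list(zip(examples, targets))
    random.Random(seed).shuffle(paired)
    shuffled_examples, shuffled_targets = zip(*paired)
    return list(shuffled_examples), list(shuffled_targets)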

Sandy4321 commented 2 years ago

@ataymano Regarding "the sklearn wrapper does shuffling on every pass": strange, I got better results after shuffling the data first, before sending it to VWRegressor.

By the way, what is the link to the source code for the coin optimization? Is it C++ code?

jackgerrits commented 2 years ago

Coin is implemented in this file: https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/vowpalwabbit/ftrl.cc The paper for coin is here: https://arxiv.org/abs/1602.04128