JuliaStats / MixedModels.jl

A Julia package for fitting (statistical) mixed-effects models
http://juliastats.org/MixedModels.jl/stable
MIT License

LoadError: PosDefException: matrix is not positive definite; Cholesky factorization failed #611

Closed · yuexiao-marketingattribution closed this issue 2 years ago

yuexiao-marketingattribution commented 2 years ago

We had been using an older version of MixedModels (1.1.6) for a while and decided to upgrade to the latest, 4.6.1. With the upgrade, some models that used to run now fail with the error in the subject line. One example with data is attached here. I did some experimenting, and the failure seems related to the fact that our model weights are highly correlated with the dependent variable. The same model ran without issue in the older version, though, so I wonder if some tolerance changed between versions. BTW, I was using Julia 1.1.0 before and upgraded to 1.6.5 to go with the latest MixedModels.

The model setup is as follows (VERSION refers to the Julia version):

using DataFrames, MixedModels, CSV
if Int(VERSION.minor) >= 6
    using CategoricalArrays
end
model_formula = @formula(
    volume ~ 1 + base_price + ppi +
    (1 | brand_category_channel_type) + ((0 + ppi) | brand_category_channel_type) +
    (1 | channel_type_sub_brand) + ((0 + base_price) | channel_type_sub_brand)
)
# newer CSV.jl versions require an explicit sink type (e.g. DataFrame)
if Int(VERSION.minor) < 6
    model_ready_data = CSV.read("model_ready_data_small.csv")
    weights = CSV.read("weights.csv")
else
    model_ready_data = CSV.read("model_ready_data_small.csv", DataFrame)
    weights = CSV.read("weights.csv", DataFrame)
end
weight_vector = Float64.(weights[!, :weight_vector])
if Int(VERSION.minor) < 6
    # older MixedModels releases took the weights via the `weights` keyword
    model = MixedModels.LinearMixedModel(model_formula, model_ready_data, weights=weight_vector);
else
    model_ready_data[!, "brand_category_channel_type"] = categorical(model_ready_data[!, "brand_category_channel_type"])
    model_ready_data[!, "channel_type_sub_brand"] = categorical(model_ready_data[!, "channel_type_sub_brand"])
    model = MixedModels.LinearMixedModel(model_formula, model_ready_data, wts=weight_vector)
end
model_fit = MixedModels.fit!(model);

Attachments: model_ready_data_small.csv, weights.csv

dmbates commented 2 years ago

Did you intend that row of zeros at the top of model_ready_data_small.csv?

yuexiao-marketingattribution commented 2 years ago

That record is actually in the data. Removing it doesn't seem to solve the problem, though.

Thanks!


dmbates commented 2 years ago

I was successful in fitting several models but most of the variance component estimates ended up converging to zero. I'm not entirely sure that there isn't a problem with the update operation for this model and will look at it further. I was trying to upload a Jupyter notebook to accompany this reply but apparently I can't do that. I will create a gist with the notebook.
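The variance components that have collapsed can be inspected directly on the fitted model; a minimal sketch, assuming the `model_fit` object from the snippet at the top of the thread:

```julia
using MixedModels

# Inspect the estimated variance components after fitting; components that
# have converged to zero show up as 0.0 standard deviations in the table.
# Assumes `model_fit` is the fitted LinearMixedModel from the code above.
vc = VarCorr(model_fit)
show(stdout, vc)
```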

dmbates commented 2 years ago

Look at https://gist.github.com/dmbates/5b99dcbc3c3cc19b19a7c09a315e1fe3

yuexiao-marketingattribution commented 2 years ago

Thanks for the quick action on this! You are right that several minor tweaks of the model will make it work, but the one we would like to fit still fails. It has a random interaction for base_price as well:

volume ~ 1 + base_price + ppi + (1 | brand_category_channel_type) + ((0 + ppi) | brand_category_channel_type) + (1 | channel_type_sub_brand) + ((0 + base_price) | channel_type_sub_brand)


yuexiao-marketingattribution commented 2 years ago

I just tried the last model you tested; it fails for me as well. I will test your code on my model and let you know.

volume ~ 1 + base_price + ppi + (1 | brand_category_channel_type) + ((0 + ppi) | brand_category_channel_type) + (1 | channel_type_sub_brand) + ((0 + base_price) | channel_type_sub_brand)


yuexiao-marketingattribution commented 2 years ago

I tested a few other cases and frequently ran into the PosDef exception below.

LoadError: PosDefException: matrix is not positive definite; Cholesky factorization failed.

Stacktrace:
 [1] cholUnblocked!(A::SubArray{Float64, 2, Array{Float64, 3}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, Int64}, true}, #unused#::Type{Val{:L}})
   @ MixedModels ~/.julia/packages/MixedModels/y3arD/src/linalg/cholUnblocked.jl:25
 [2] cholUnblocked!(D::UniformBlockDiagonal{Float64}, #unused#::Type{Val{:L}})
   @ MixedModels ~/.julia/packages/MixedModels/y3arD/src/linalg/cholUnblocked.jl:37
 [3] updateL!(m::LinearMixedModel{Float64})
   @ MixedModels ~/.julia/packages/MixedModels/y3arD/src/linearmixedmodel.jl:1201
 [4] (::MixedModels.var"#obj#67"{Bool, Int64, LinearMixedModel{Float64}, Vector{Tuple{Vector{Float64}, Float64}}, ProgressMeter.ProgressUnknown})(x::Vector{Float64}, g::Vector{Float64})
   @ MixedModels ~/.julia/packages/MixedModels/y3arD/src/linearmixedmodel.jl:439
 [5] fit!(m::LinearMixedModel{Float64}; progress::Bool, REML::Bool, σ::Nothing, thin::Int64)
   @ MixedModels ~/.julia/packages/MixedModels/y3arD/src/linearmixedmodel.jl:447
 [6] fit!(m::LinearMixedModel{Float64})
   @ MixedModels ~/.julia/packages/MixedModels/y3arD/src/linearmixedmodel.jl:426

The issue seems to pop up when there are multiple random interaction levels. I attached the data and code with the following model spec.

model_formula = @formula(
    volume ~ 1 + trade_promo + base_price + wd +
    (0 + base_price | ppg) +
    (0 + wd | ppg) +
    (0 + trade_promo | channel_type_ppg)
)

I tested multiple models. For these three variables, the model passes when they all interact only with ppg, or only with channel_type_ppg. But when some variables interact with ppg and others with channel_type_ppg, we start to have problems. When only two of the variables have random interactions in the model, the various combinations of the interactions also work.
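The combination testing described above can be sketched as a small try/catch sweep (a hypothetical driver, assuming the attached data_small.csv and the column names from the formula):

```julia
using CSV, DataFrames, LinearAlgebra, MixedModels

# Hypothetical sweep over the random-effects combinations described above;
# assumes the attached data_small.csv with the columns used in the formula.
data = CSV.read("data_small.csv", DataFrame)
candidates = [
    # all slopes interacting with ppg only
    @formula(volume ~ 1 + trade_promo + base_price + wd +
             (0 + base_price | ppg) + (0 + wd | ppg) + (0 + trade_promo | ppg)),
    # all slopes interacting with channel_type_ppg only
    @formula(volume ~ 1 + trade_promo + base_price + wd +
             (0 + base_price | channel_type_ppg) + (0 + wd | channel_type_ppg) +
             (0 + trade_promo | channel_type_ppg)),
    # mixed grouping, as in the failing spec
    @formula(volume ~ 1 + trade_promo + base_price + wd +
             (0 + base_price | ppg) + (0 + wd | ppg) +
             (0 + trade_promo | channel_type_ppg)),
]
for f in candidates
    try
        fit(MixedModel, f, data; progress=false)
        println("ok:     ", f)
    catch e
        e isa PosDefException || rethrow()
        println("PosDef: ", f)
    end
end
```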

This makes me think it's more than a data issue here. Could you please take a look?

Thanks,

Yue

On Tue, May 3, 2022 at 2:59 PM Yue Xiao wrote:

So I tested a few models and here is my thought: if I run your m3, I see no variance for the intercept. This is consistent with the way the data are constructed, yet the model finishes gracefully, with all coefficients estimated, etc. Having the slope AND intercept interaction at the same time for ppi or base_price, one at a time, also works, with the intercept variance at 0 for all these models. So if the only issue is that the intercepts don't vary across interaction levels, shouldn't the model also exit gracefully when I have the full model spec below? This used to be the case in older versions, and it would greatly help us with our production batch process.

volume ~ 1 + base_price + ppi + (1 | brand_category_channel_type) + ((0 + ppi) | brand_category_channel_type) + (1 | channel_type_sub_brand) + ((0 + base_price) | channel_type_sub_brand)


palday commented 2 years ago

I took a look at the original example, re-written here a little bit to make it easy to run anywhere:

using CategoricalArrays, CSV, DataFrames, Downloads, MixedModels

model_ready_url = "https://github.com/JuliaStats/MixedModels.jl/files/8606358/model_ready_data_small.csv";
model_ready_data = CSV.read(Downloads.download(model_ready_url), DataFrame);
weights_url = "https://github.com/JuliaStats/MixedModels.jl/files/8606359/weights.csv";
weights = CSV.read(Downloads.download(weights_url), DataFrame);
wts = weights[!, :weight_vector];

contrasts = Dict(
    :brand_category_channel_type => Grouping(),
    :channel_type_sub_brand => Grouping(),
);

model_formula = @formula(
    volume ~ 1 + base_price + ppi + 
    zerocorr(1 + ppi | brand_category_channel_type) +
    zerocorr(1 + base_price | channel_type_sub_brand)
);

model =  fit(MixedModel, model_formula, model_ready_data; wts, contrasts)

@dmbates I inserted a @show into the Cholesky factorization to get a peek at what's going wrong and found that the Cholesky is failing on this matrix:

4×4 Matrix{Float64}:
     -2.09724e7  34383.4    29916.1    -78929.7
 -34369.8        -1334.4   -10224.6        -1.22374e5
 -29893.4        10223.1    -3655.35  -163610.0
  78803.2        -5724.27   -3884.24        2.10496e6

which makes sense because there are negative elements on the main diagonal. I have the distinct feeling that the data are ill conditioned, because when I look at the trace the values fluctuate wildly (even taking into account that there are multiple dense blocks interleaved in the trace):

Trace of A for each iteration ```julia A = [275.3886110000001 0.0; 0.0 1.0] A = [26970.515544 10.51829954195982; 10.51829954195982 1.91260431352049] A = [1.0637780153520002e6 -1983.2690986188613; -1983.2690986188613 39493.976091712946] A = [2.619361352482001e6 4282.43157226412; 4282.43157226412 14092.787388238297] A = [1.1280034164830002e6 -820.6784915586895; -820.6784915586895 4964.560800495339] A = [452452.00636199996 458.1541350402469; 458.1541350402469 182.14373844265293] A = [146040.54100599996 0.0; 0.0 1.0] A = [276.264311 -0.9595332129561879; -0.9595332129561879 1.0114882822320845] A = [26649.671372999997 -38.137774279361565; -38.137774279361565 441.7677118764671] A = [84425.33053700002 160.83688134053415; 160.83688134053415 437.0625696066532] A = [32136.417662999997 72.83844581405658; 72.83844581405658 66.05000869434183] A = [5000.113452999999 1.3301974740865834; 1.3301974740865834 5.382268134134644] A = [10535.768357999998 -21.20670241298923; -21.20670241298923 10.733343123026957] A = [1474.5101679999998 0.1926021326267835; 0.1926021326267835 1.030623304486552] A = [426872.04782599985 0.0; 0.0 1.0] A = [583763.3572930001 4394.916133642304; 4394.916133642304 1409.0045891331497] A = [265773.106901 0.0; 0.0 1.0] A = [33271.942633000006 50.37014051994413; 50.37014051994413 57.9085242058123] A = [57222.555044999994 0.0; 0.0 1.0] A = [357177.7076040001 1172.8537253491788; 1172.8537253491788 1079.6533010106489] A = [50183.988809999995 64.45332399536164; 64.45332399536164 283.0134061707429] A = [3335.055819 -8.506364170352963; -8.506364170352963 12.048781925147088] A = [81.65136700000001 0.0; 0.0 1.0] A = [72897.05660400001 0.0; 0.0 1.0] A = [32401.206353999998 -51.38809177007144; -51.38809177007144 2060.8943884799387] A = [66052.47268000002 -279.5977284783337; -279.5977284783337 799.9971679173707] A = [21296.132530000003 0.0; 0.0 1.0] A = [4965.848212999999 4.631638997353715; 4.631638997353715 5.321389023848027] A = [10797.902261000003 -2.9172777010200157e-6; 
-2.9172777010200157e-6 12.265142258844865] A = [318098.473248 -119.54416827951218; -119.54416827951218 2260.6135073521036] A = [413402.9954079999 816.743595367312; 816.743595367312 1037.5977890090533] A = [135954.015712 0.0; 0.0 1.0] A = [23028.755915 -43.51435364839068; -43.51435364839068 52.271541525276774] A = [48371.125023 177.43938929084382; 177.43938929084382 35.74462004685384] A = [161097.86235999997 -158.4067215237261; -158.4067215237261 322.78039440234954] A = [30205.396813000003 44.66981137117053; 44.66981137117053 88.78724475730498] A = [2428.903709 0.8068184266978822; 0.8068184266978822 9.987568346277232] A = [328.7040559999999 -9.678899004494923e-8; -9.678899004494923e-8 1.1501483580555325] A = [20833.750557 1.8484707432409433; 1.8484707432409433 39.95554300955688] A = [910269.2826580001 635.4270031993026; 635.4270031993026 24693.620906233955] A = [1.9318545696419997e6 21762.14126398811; 21762.14126398811 23688.10170032355] A = [878789.201814 1663.5854210923158; 1663.5854210923158 2095.5081350902833] A = [327982.731103 0.0; 0.0 1.0] A = [118083.028825 0.0; 0.0 1.0] A = [181.32147 -0.14700409062397501; -0.14700409062397501 1.0022172930381041] A = [38605.205266000004 -24.593637932898645; -24.593637932898645 383.92849243894864] A = [106212.32890999998 266.0865604865825; 266.0865604865825 819.6559327799044] A = [30016.06241099999 20.80462026606226; 20.80462026606226 68.46164343856141] A = [13378.679689 32.898730919653055; 32.898730919653055 39.233988452213126] A = [915.644949 -1.5071896208862512e-7; -1.5071896208862512e-7 1.015078637083418] A = [252060.020108 0.0; 0.0 1.0] A = [621346.821625 0.0; 0.0 1.0] A = [110634.566327 -399.0505372783746; -399.0505372783746 917.4135872131402] A = [33458.289665 -83.44707236540337; -83.44707236540337 99.67739752424703] A = [31303.74954499999 322.1356811265711; 322.1356811265711 134.56785309036164] A = [44449.437801 78.17890526753243; 78.17890526753243 176.95352046140005] A = [352.891658 -12.388067575368199; 
-12.388067575368199 5.833122809320677] A = [41082.49293100001 0.0; 0.0 1.0] A = [684146.0138419999 -1734.5918498583499; -1734.5918498583499 11857.832365204269] A = [2.212374529104e6 10704.799646349427; 10704.799646349427 4961.569904861191] A = [897912.696569 0.0; 0.0 1.0] A = [58531.954456999985 0.0; 0.0 1.0] A = [14276.461853000003 157.25932094889245; 157.25932094889245 1682.205993163988] A = [39136.389089000004 -186.14667314280345; -186.14667314280345 165.61717224568736] A = [18335.152909999997 -16.083555503698925; -16.083555503698925 48.264775052827126] A = [1492.5685719999994 -4.813104634873771; -4.813104634873771 3.429814405477891] A = [4722.642945 0.0; 0.0 1.0] A = [160237.23470200002 -1362.0332278205956; -1362.0332278205956 17718.075143373797] A = [256687.56895899994 733.5998921280423; 733.5998921280423 1297.8975853451163] A = [78935.48453200002 -224.33517204232265; -224.33517204232265 401.18883812787834] A = [14587.610453999996 5.6183049352525565; 5.6183049352525565 57.78156237884673] A = [48405.008471 0.0; 0.0 1.0] A = [237755.12856599994 -44.967348451417664; -44.967348451417664 13.667593218320535] A = [15306.60698 -36.092268668450366; -36.092268668450366 42.48291261025591] A = [36.99278399999999 0.0; 0.0 1.0] A = [40041.533756 0.0; 0.0 1.0] A = [485862.1159290001 -2296.271618044505; -2296.271618044505 19494.752314392936] A = [962305.6347670001 -3959.4784579383763; -3959.4784579383763 27479.54673488647] A = [289543.656614 239.73620923502745; 239.73620923502745 356.38575990025436] A = [168290.71553000004 0.0; 0.0 1.0] A = [59696.79739899999 -44.24683646411123; -44.24683646411123 11.032246735569943] A = [5.436875217993902e6 9133.901880728996; 9133.901928957246 52347.60150044322] A = [159016.79209236588 1588.1851447894228; 1588.1402983643304 547.151210558372] A = [1.3683723917538924e6 -5045.438562810177; -5045.4404802184035 74298.59319623487] A = [483671.1259662385 -189.3308893553409; -189.332343649997 487.88938289093664] A = [135509.26069072008 
894.2430411409549; 894.2393952041767 2469.882954564401] A = [938851.2978645571 0.0; 0.0 1.0] A = [214890.0920907084 0.0; 0.0 1.0] A = [4.1669748043278325e6 6879.517653854432; 6879.517913177157 23010.654891861246] A = [188389.3024729495 -395.04885835128465; -395.0456722108642 1210.6128692923232] A = [1.0184112520368345e6 -2659.7095508832504; -2659.7091933260112 6195.288778919975] A = [117184.938391316 -42.400570378120506; -42.399755180757694 223.2629833582394] A = [3.8529621874329704e6 -3047.346437564768; -3047.3462568742534 14887.51137805212] A = [77958.41099777636 -875.2781422778461; -875.2731850342324 3683.82293447747] A = [558848.7982492348 8392.646972669452; 8392.640106726652 17769.99466421015] A = [293137.0729294579 -50.57990652048062; -50.57945934756645 107.17627481023055] A = [1.9656948855745604e6 15332.743855229997; 15332.7442346963 14663.520052449101] A = [-2.0972427010079682e7 34383.411278323634 29916.105630271988 -78929.67867736067; -34369.8197156927 -1334.404182316827 -10224.62028309605 -122374.31793134139; -29893.376019959225 10223.085489326098 -3655.349680672196 -163610.03849458476; 78803.18837686801 -5724.271553533462 -3884.2415038617037 2.1049649628630173e6] ```
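The failure mode in that final 4×4 block is easy to reproduce in isolation: any symmetric matrix with a negative diagonal entry cannot be positive definite, so `cholesky` throws the same exception. A minimal standalone sketch, unrelated to the model data:

```julia
using LinearAlgebra

# A symmetric matrix with a negative entry on the main diagonal cannot be
# positive definite, so the Cholesky factorization fails just as in the trace.
A = [4.0 2.0; 2.0 -1.0]
try
    cholesky(A)
catch e
    println(typeof(e))   # PosDefException
end
```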

@yuexiao-marketingattribution There's only so much we can do to avoid erroring on ill-conditioned problems. It looks like you're pushing your models a bit beyond what your data support. We discuss this a bit in our documentation: https://juliastats.org/MixedModels.jl/stable/rankdeficiency/#Undetected-Rank-Deficiency

Part of the ill-conditioning here is that many of the channel_type_sub_brands have only a single base_price:

julia> sort!(combine(groupby(model_ready_data, :channel_type_sub_brand), :base_price => length ∘ unique => :n_unique_base_price), :n_unique_base_price)
81×2 DataFrame
 Row │ channel_type_sub_brand  n_unique_base_price 
     │ Int64                   Int64               
─────┼─────────────────────────────────────────────
   1 │                      0                    1
   2 │                      6                    1
   3 │                     14                    1
   4 │                     16                    1
   5 │                     18                    1
   6 │                     22                    1
   7 │                     23                    1
   8 │                     26                    1
   9 │                     31                    1
  10 │                     42                    1
  11 │                     43                    1
  12 │                     50                    1
  13 │                     51                    1
  14 │                     57                    1
  15 │                     60                    1
  16 │                     61                    1
  17 │                     66                    1
  18 │                     71                    1
  19 │                     74                    1
  20 │                     75                    1
  21 │                     79                    1
  22 │                     37                    2
  23 │                     44                    3
  24 │                     49                    5
  25 │                      7                    6
  26 │                     33                    6
  27 │                      1                    7
  28 │                     28                    7
  29 │                     13                    8
  ⋮  │           ⋮                      ⋮

For these channel_type_sub_brand levels, the random slope will be the same as the population slope. This leads to the shrinkage to zero of the variance component that @dmbates noted in his Jupyter notebook.

@dmbates I'm wondering if there is nonetheless a way to catch this and either give a nicer error or include a tolerance that forces one of the earlier iterations to zero.

yuexiao-marketingattribution commented 2 years ago

I may have sent the wrong test code. So here it is, plus the consolidated data. Thanks!

On Wed, May 11, 2022 at 11:42 AM Yue Xiao wrote:

Thanks a lot for this check on the data. I checked the second data case, and one of the variables also has no variation in value in many groups. However, it is quite common for us to have variables with limited variation in some groups. We don't want to exclude these groups, as other variables in these groups could help improve the estimation. But we are fine if the model comes back and says, "Sorry, random effects for this variable are all 0 due to the data. You only have a fixed effect for this variable."

In both cases, we have two random interaction terms, and one of the interaction levels is nested within the other. So I tested a model where all the independent variables have random interactions with the most granular level; the model passes in both cases (some variables have 0 variance). This makes me think something other than a lack of within-group data variation is going on here.
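The "everything at the most granular level" test described above can be sketched as follows (hypothetical, assuming channel_type_ppg is the finer of the two groupings and the columns from the earlier examples):

```julia
using CSV, DataFrames, MixedModels

# Hypothetical sketch of the test described above: every slope interacts
# with the most granular grouping factor, assumed here to be channel_type_ppg.
data = CSV.read("data_small.csv", DataFrame)
f = @formula(volume ~ 1 + trade_promo + base_price + wd +
             (0 + trade_promo | channel_type_ppg) +
             (0 + base_price | channel_type_ppg) +
             (0 + wd | channel_type_ppg))
m = fit(MixedModel, f, data; progress=false)
VarCorr(m)   # some components may be estimated at exactly 0
```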

I attached my test code here. The two test datasets are the same as before.

Again, thanks a lot for your help here. We really appreciate it.

On Wed, May 11, 2022 at 4:33 AM Phillip Alday @.***> wrote:

I took a look at the original example, re-written here a little bit to make it easy to run anywhere:

using CategoricalArrays, CSV, DataFrames, Downloads, MixedModels

model_ready_url = "https://github.com/JuliaStats/MixedModels.jl/files/8606358/model_ready_data_small.csv";

model_ready_data = CSV.read(Downloads.download(model_ready_url), DataFrame);

weights_url = "https://github.com/JuliaStats/MixedModels.jl/files/8606359/weights.csv";

weights = CSV.read(Downloads.download(weights_url), DataFrame);

wts = weights[!, :weight_vector];

contrasts = Dict(

:brand_category_channel_type => Grouping(),

:channel_type_sub_brand => Grouping(),

);

model_formula = @formula(

volume ~ 1 + base_price + ppi +

zerocorr(1 + ppi | brand_category_channel_type) +

zerocorr(1 + base_price | channel_type_sub_brand)

);

model = fit(MixedModel, model_formula, model_ready_data; wts, contrasts)

@dmbates https://github.com/dmbates I inserted a @show into the Cholesky factorization to get a peek at what's going wrong and found that the Cholesky is failing on this matrix:

4×4 Matrix{Float64}:

 -2.09724e7  34383.4    29916.1    -78929.7

-34369.8 -1334.4 -10224.6 -1.22374e5

-29893.4 10223.1 -3655.35 -163610.0

78803.2 -5724.27 -3884.24 2.10496e6

which makes sense because the there are negative elements on the main diagonal. I have the distinct feeling that the data is ill conditioned because when I look at the trace the values fluctuate wildly (even taking into account that there are multiple dense blocks interleaved in the trace): Trace of A for each iteration

A = [275.3886110000001 0.0; 0.0 1.0]

A = [26970.515544 10.51829954195982; 10.51829954195982 1.91260431352049]

A = [1.0637780153520002e6 -1983.2690986188613; -1983.2690986188613 39493.976091712946]

A = [2.619361352482001e6 4282.43157226412; 4282.43157226412 14092.787388238297]

A = [1.1280034164830002e6 -820.6784915586895; -820.6784915586895 4964.560800495339]

A = [452452.00636199996 458.1541350402469; 458.1541350402469 182.14373844265293]

A = [146040.54100599996 0.0; 0.0 1.0]

A = [276.264311 -0.9595332129561879; -0.9595332129561879 1.0114882822320845]

A = [26649.671372999997 -38.137774279361565; -38.137774279361565 441.7677118764671]

A = [84425.33053700002 160.83688134053415; 160.83688134053415 437.0625696066532]

A = [32136.417662999997 72.83844581405658; 72.83844581405658 66.05000869434183]

A = [5000.113452999999 1.3301974740865834; 1.3301974740865834 5.382268134134644]

A = [10535.768357999998 -21.20670241298923; -21.20670241298923 10.733343123026957]

A = [1474.5101679999998 0.1926021326267835; 0.1926021326267835 1.030623304486552]

A = [426872.04782599985 0.0; 0.0 1.0]

A = [583763.3572930001 4394.916133642304; 4394.916133642304 1409.0045891331497]

A = [265773.106901 0.0; 0.0 1.0]

A = [33271.942633000006 50.37014051994413; 50.37014051994413 57.9085242058123]

A = [57222.555044999994 0.0; 0.0 1.0]

A = [357177.7076040001 1172.8537253491788; 1172.8537253491788 1079.6533010106489]

A = [50183.988809999995 64.45332399536164; 64.45332399536164 283.0134061707429]

A = [3335.055819 -8.506364170352963; -8.506364170352963 12.048781925147088]

A = [81.65136700000001 0.0; 0.0 1.0]

A = [72897.05660400001 0.0; 0.0 1.0]

A = [32401.206353999998 -51.38809177007144; -51.38809177007144 2060.8943884799387]

A = [66052.47268000002 -279.5977284783337; -279.5977284783337 799.9971679173707]

A = [21296.132530000003 0.0; 0.0 1.0]

A = [4965.848212999999 4.631638997353715; 4.631638997353715 5.321389023848027]

A = [10797.902261000003 -2.9172777010200157e-6; -2.9172777010200157e-6 12.265142258844865]

A = [318098.473248 -119.54416827951218; -119.54416827951218 2260.6135073521036]

A = [413402.9954079999 816.743595367312; 816.743595367312 1037.5977890090533]

A = [135954.015712 0.0; 0.0 1.0]

A = [23028.755915 -43.51435364839068; -43.51435364839068 52.271541525276774]

A = [48371.125023 177.43938929084382; 177.43938929084382 35.74462004685384]

A = [161097.86235999997 -158.4067215237261; -158.4067215237261 322.78039440234954]

A = [30205.396813000003 44.66981137117053; 44.66981137117053 88.78724475730498]

A = [2428.903709 0.8068184266978822; 0.8068184266978822 9.987568346277232]

A = [328.7040559999999 -9.678899004494923e-8; -9.678899004494923e-8 1.1501483580555325]

A = [20833.750557 1.8484707432409433; 1.8484707432409433 39.95554300955688]

A = [910269.2826580001 635.4270031993026; 635.4270031993026 24693.620906233955]

A = [1.9318545696419997e6 21762.14126398811; 21762.14126398811 23688.10170032355]

A = [878789.201814 1663.5854210923158; 1663.5854210923158 2095.5081350902833]

A = [327982.731103 0.0; 0.0 1.0]

A = [118083.028825 0.0; 0.0 1.0]

A = [181.32147 -0.14700409062397501; -0.14700409062397501 1.0022172930381041]

A = [38605.205266000004 -24.593637932898645; -24.593637932898645 383.92849243894864]

A = [106212.32890999998 266.0865604865825; 266.0865604865825 819.6559327799044]

A = [30016.06241099999 20.80462026606226; 20.80462026606226 68.46164343856141]

A = [13378.679689 32.898730919653055; 32.898730919653055 39.233988452213126]

A = [915.644949 -1.5071896208862512e-7; -1.5071896208862512e-7 1.015078637083418]

A = [252060.020108 0.0; 0.0 1.0]

A = [621346.821625 0.0; 0.0 1.0]

A = [110634.566327 -399.0505372783746; -399.0505372783746 917.4135872131402]

A = [33458.289665 -83.44707236540337; -83.44707236540337 99.67739752424703]

A = [31303.74954499999 322.1356811265711; 322.1356811265711 134.56785309036164]

A = [44449.437801 78.17890526753243; 78.17890526753243 176.95352046140005]

A = [352.891658 -12.388067575368199; -12.388067575368199 5.833122809320677]

A = [41082.49293100001 0.0; 0.0 1.0]

A = [684146.0138419999 -1734.5918498583499; -1734.5918498583499 11857.832365204269]

A = [2.212374529104e6 10704.799646349427; 10704.799646349427 4961.569904861191]

A = [897912.696569 0.0; 0.0 1.0]

A = [58531.954456999985 0.0; 0.0 1.0]

A = [14276.461853000003 157.25932094889245; 157.25932094889245 1682.205993163988]

A = [39136.389089000004 -186.14667314280345; -186.14667314280345 165.61717224568736]

A = [18335.152909999997 -16.083555503698925; -16.083555503698925 48.264775052827126]

A = [1492.5685719999994 -4.813104634873771; -4.813104634873771 3.429814405477891]

A = [4722.642945 0.0; 0.0 1.0]

A = [160237.23470200002 -1362.0332278205956; -1362.0332278205956 17718.075143373797]

A = [256687.56895899994 733.5998921280423; 733.5998921280423 1297.8975853451163]

A = [78935.48453200002 -224.33517204232265; -224.33517204232265 401.18883812787834]

A = [14587.610453999996 5.6183049352525565; 5.6183049352525565 57.78156237884673]

A = [48405.008471 0.0; 0.0 1.0]

A = [237755.12856599994 -44.967348451417664; -44.967348451417664 13.667593218320535]

A = [15306.60698 -36.092268668450366; -36.092268668450366 42.48291261025591]

A = [36.99278399999999 0.0; 0.0 1.0]

A = [40041.533756 0.0; 0.0 1.0]

A = [485862.1159290001 -2296.271618044505; -2296.271618044505 19494.752314392936]

A = [962305.6347670001 -3959.4784579383763; -3959.4784579383763 27479.54673488647]

A = [289543.656614 239.73620923502745; 239.73620923502745 356.38575990025436]

A = [168290.71553000004 0.0; 0.0 1.0]

A = [59696.79739899999 -44.24683646411123; -44.24683646411123 11.032246735569943]

A = [5.436875217993902e6 9133.901880728996; 9133.901928957246 52347.60150044322]

A = [159016.79209236588 1588.1851447894228; 1588.1402983643304 547.151210558372]

A = [1.3683723917538924e6 -5045.438562810177; -5045.4404802184035 74298.59319623487]

A = [483671.1259662385 -189.3308893553409; -189.332343649997 487.88938289093664]

A = [135509.26069072008 894.2430411409549; 894.2393952041767 2469.882954564401]

A = [938851.2978645571 0.0; 0.0 1.0]

A = [214890.0920907084 0.0; 0.0 1.0]

A = [4.1669748043278325e6 6879.517653854432; 6879.517913177157 23010.654891861246]

A = [188389.3024729495 -395.04885835128465; -395.0456722108642 1210.6128692923232]

A = [1.0184112520368345e6 -2659.7095508832504; -2659.7091933260112 6195.288778919975]

A = [117184.938391316 -42.400570378120506; -42.399755180757694 223.2629833582394]

A = [3.8529621874329704e6 -3047.346437564768; -3047.3462568742534 14887.51137805212]

A = [77958.41099777636 -875.2781422778461; -875.2731850342324 3683.82293447747]

A = [558848.7982492348 8392.646972669452; 8392.640106726652 17769.99466421015]

A = [293137.0729294579 -50.57990652048062; -50.57945934756645 107.17627481023055]

A = [1.9656948855745604e6 15332.743855229997; 15332.7442346963 14663.520052449101]

A = [-2.0972427010079682e7 34383.411278323634 29916.105630271988 -78929.67867736067; -34369.8197156927 -1334.404182316827 -10224.62028309605 -122374.31793134139; -29893.376019959225 10223.085489326098 -3655.349680672196 -163610.03849458476; 78803.18837686801 -5724.271553533462 -3884.2415038617037 2.1049649628630173e6]

@yuexiao-marketingattribution https://github.com/yuexiao-marketingattribution There's only so much we can do to not error on ill-conditioned problems. It looks like you're pushing your models a bit beyond what your data support. We discuss this a bit in our documentation https://juliastats.org/MixedModels.jl/stable/rankdeficiency/#Undetected-Rank-Deficiency .

Part of the ill-conditioning here is that the many channel_type_sub_brands only have a single base_price:

julia> sort!(combine(groupby(model_ready_data, :channel_type_sub_brand),
                     :base_price => length ∘ unique => :n_unique_base_price),
             :n_unique_base_price)
81×2 DataFrame
 Row │ channel_type_sub_brand  n_unique_base_price
     │ Int64                   Int64
─────┼─────────────────────────────────────────────
   1 │                      0                    1
   2 │                      6                    1
   3 │                     14                    1
   4 │                     16                    1
   5 │                     18                    1
   6 │                     22                    1
   7 │                     23                    1
   8 │                     26                    1
   9 │                     31                    1
  10 │                     42                    1
  11 │                     43                    1
  12 │                     50                    1
  13 │                     51                    1
  14 │                     57                    1
  15 │                     60                    1
  16 │                     61                    1
  17 │                     66                    1
  18 │                     71                    1
  19 │                     74                    1
  20 │                     75                    1
  21 │                     79                    1
  22 │                     37                    2
  23 │                     44                    3
  24 │                     49                    5
  25 │                      7                    6
  26 │                     33                    6
  27 │                      1                    7
  28 │                     28                    7
  29 │                     13                    8
   ⋮ │           ⋮                     ⋮

For these channel_type_sub_brand levels, the random slope will be the same as the population slope. This leads to the shrinkage to zero for the variance component that @dmbates noted in his Jupyter notebook.

@dmbates I'm wondering if there is nonetheless a way to catch this and either give a nicer error or include a tolerance that forces one of the earlier iterations to zero.
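When a fit does converge, a boundary fit of this kind can be spotted after the fact. A small sketch, assuming `model` is a LinearMixedModel fitted as earlier in this thread (`issingular` and `VarCorr` are part of the MixedModels.jl API):

```julia
using MixedModels

# `model` is assumed to be an already-fitted LinearMixedModel.
issingular(model)   # true when any variance component sits on the boundary (zero)
VarCorr(model)      # displays the variance/correlation estimates for inspection
```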


dmbates commented 2 years ago

@yuexiao-marketingattribution Did you intend to make updated data sets available?

yuexiao-marketingattribution commented 2 years ago

data_small2.csv data_small.csv

Above are the data for the test code

yuexiao-marketingattribution commented 2 years ago

Just uploaded the two datasets for the test.jl code. Thanks!


yuexiao-marketingattribution commented 2 years ago

Hi, Dr. Bates and Mr. Alday - Just wanted to check in and see whether you have everything you need to look into this. Also, what is your assessment of the issue? Do you agree with me that this is related to some change in the algorithm rather than the data? And if you do, do you think you will have time to look into and fix it in the near future?

We would really appreciate your help on this. Please let me know if there is anything I can do to help.

Thanks, Yue


palday commented 2 years ago

@yuexiao-marketingattribution There is an open PR that addresses some of the problem (#615). However, my previous comments about this being a difficult dataset still hold. The model you want to fit just isn't well suited to the data in question. The PR helps the software to recover and to still fit a model, but I can't guarantee that it will fix future problems of this type.

The problem with multiple grouping variables is a symptom of the model being more complex than the data support (both my and @dmbates' previous comments touch upon this).

You can try out the fix yourself:

julia> ]
pkg> add MixedModels#pa/rescale
yuexiao-marketingattribution commented 2 years ago

Hi, Dr. Bates/Phillip - Thanks a lot for looking into this. I have been testing the new code on the various models we run. The fix definitely allows a lot more models to pass; it now only fails when I push my models too far. One question -

I noticed that while the model runs to find the best solution, it keeps updating the variance components. To help us identify data issues and speed up our modeling iterations, I wonder if I could leverage those updated variance components and remove the random interaction term with the smallest variance, one at a time, until I find an estimable model. I would like to make sure such a process makes sense to you, since you know how the variance components are updated during model estimation.

Again, really appreciate all the help from you over the years using this package!

Yue
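The pruning loop described above could be sketched roughly as follows. This is a hypothetical helper, not part of MixedModels.jl; `formulas` is assumed to be a user-supplied list of candidate model formulas ordered from most to least complex:

```julia
using LinearAlgebra, MixedModels

# Try each candidate formula in turn and return the first model that fits
# without the Cholesky factorization failing.
function fit_first_estimable(formulas, data; kwargs...)
    for f in formulas
        try
            return fit(MixedModel, f, data; kwargs...)
        catch e
            e isa PosDefException || rethrow()
            @warn "Fit failed for $f; trying a simpler specification"
        end
    end
    error("no candidate specification could be fitted")
end
```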

On Fri, May 20, 2022 at 11:05 AM Phillip Alday @.***> wrote:

Closed #611 https://github.com/JuliaStats/MixedModels.jl/issues/611 as completed via #615 https://github.com/JuliaStats/MixedModels.jl/pull/615 .


palday commented 2 years ago

I noticed that while the model runs to find the best solution, it keeps updating the variance components. To help us identify data issues and speed up our modeling iterations, I wonder if I could leverage those updated variance components and remove the random interaction term with the smallest variance, one at a time, until I find an estimable model. I would like to make sure such a process makes sense to you, since you know how the variance components are updated during model estimation.

yuexiao-marketingattribution commented 1 year ago

Thanks a lot for this check on the data. I checked the second data case, and one of the variables also has no variation within many groups. However, it is quite common for us to have variables with limited variation in some groups. We don't want to exclude these groups, as other variables in them could help improve the estimation. But we are fine if the model comes back and says, "Sorry, the random effects for this variable are all 0 given the data. You only have a fixed effect for this variable."

In both cases, we have two random interaction terms, and one grouping level is nested within the other. So I tested a model where all the independent variables have random interactions with the most granular level. That model passes in both cases (some variables have 0 variance). This makes me think something other than the lack of within-group variation is going on here.

I attached my test code here. The two test datasets are the same as before.

Again, thanks a lot for your help here. We really appreciate it.


yuexiao-marketingattribution commented 1 year ago

So I tested a few models and here is my thinking - if I run your m3, I see no variance for the intercept. This is consistent with how the data are constructed, yet the model finishes gracefully, with all coefficients estimated, etc. Having both the slope AND intercept interactions at the same time for ppi or base_price, one at a time, also works, with the intercept variance at 0 for all these models. So if the only issue is that intercepts don't vary across interaction levels, shouldn't the model exit gracefully when I have the full model spec below? This used to be the case in older versions, and it would greatly help us with our production batch process.

volume ~ 1 + base_price + ppi +
    (1 | brand_category_channel_type) + ((0 + ppi) | brand_category_channel_type) +
    (1 | channel_type_sub_brand) + ((0 + base_price) | channel_type_sub_brand)
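For reference, the separate intercept and slope terms in this specification should be equivalent to the `zerocorr` form used earlier in the thread, which also fits uncorrelated intercepts and slopes:

```julia
using MixedModels

model_formula = @formula(
    volume ~ 1 + base_price + ppi +
        zerocorr(1 + ppi | brand_category_channel_type) +
        zerocorr(1 + base_price | channel_type_sub_brand)
)
```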
