vinecopulib / rvinecopulib

R interface to the vinecopulib C++ library
GNU General Public License v3.0

the selection of best-fit model #272

Closed chienyutseng closed 6 months ago

chienyutseng commented 6 months ago

Hello,

I am trying to find the best-fitting model using different selection criteria and family sets. Below are two models fitted with different settings, named Model1 and Model2.

For these two models, the AIC, BIC, and log-likelihood values suggest that Model2 provides the better fit. Specifically, it has much lower AIC and BIC values and a higher log-likelihood, which would normally lead one to prefer it over Model1.

However, upon examining the contour plots, Model1 appears to offer a more reasonable capture of the dependence structure between the variables, with more balanced and symmetric contours. Model2, while numerically superior, produces contour plots that show extreme tail dependence, which may not align with the theoretical expectations or the nature of the data.

The discrepancy between the numerical model selection criteria and the qualitative assessment of the contour plots is puzzling. The selection criteria typically guide the choice of model, but the contour plot, as a visual representation of the dependence structure, is equally important for confirming that the model is appropriate for the data.

I am unsure if this is an issue with the package's implementation, the model fitting procedure, or my understanding of the expected output. Could there be potential aspects of the model selection process that are not captured by AIC, BIC, and log-likelihood values? Or should I consider other diagnostics to reconcile the numerical criteria with the visual observations?

Any insights, suggestions, or guidance on how to proceed with model selection in this case would be greatly appreciated.


Model1

[Screenshot, 2024-03-16 12:03:39 PM]

Model2

[Screenshot, 2024-03-16 12:11:10 PM]

tnagler commented 6 months ago

Hi, this seems to be a numerical issue with the BB7 family. The PDF values in the far tail can't be computed accurately, which leads to the weird plot (and the unusually high likelihood) in your example.
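
One way to check whether a few inaccurate density values are inflating the likelihood is to look at the per-observation log-density terms. The following is only a minimal sketch: the data are simulated stand-ins (the original data are not shown in this thread), and the BB7 parameters are arbitrary illustrative values.

```r
library(rvinecopulib)

set.seed(1)
# Stand-in data (assumption): simulated from a BB7 copula with
# illustrative parameters; replace `u` with your own pseudo-observations.
u <- rbicop(500, family = "bb7", rotation = 0, parameters = c(2, 3))

# Fit with automatic selection over the parametric families.
fit <- bicop(u, family_set = "parametric", selcrit = "aic")

# Per-observation contributions to the log-likelihood.
ll_i <- log(dbicop(u, fit))

# If a handful of terms are huge or non-finite, they dominate the sum
# and can make AIC/BIC look better than the fit really is.
summary(ll_i)
head(sort(ll_i, decreasing = TRUE), 10)
sum(!is.finite(ll_i))

# Sensitivity check: refit without BB7 and compare the criteria.
fit_no_bb7 <- bicop(
  u,
  family_set = c("gaussian", "t", "clayton", "gumbel", "frank",
                 "joe", "bb1", "bb6", "bb8"),
  selcrit = "aic"
)
c(with_bb7 = AIC(fit), without_bb7 = AIC(fit_no_bb7))
```

If the largest terms all belong to observations very close to a corner of the unit square, that would be consistent with a tail-accuracy problem rather than a genuinely better fit.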

You can install a stabilized version of the library with:

remotes::install_github("vinecopulib/rvinecopulib@fix-bb7")

The fix will be included in the next official release.

The strong tail dependence will probably prevail though, since you also see this in the TLL plot.
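
The comparison with the nonparametric TLL estimator can be reproduced roughly as follows. This is a sketch under assumptions: the data are simulated from a Clayton copula purely for illustration, and the object names are hypothetical.

```r
library(rvinecopulib)

set.seed(1)
# Stand-in data (assumption): a Clayton copula with strong lower tail
# dependence; replace `u` with your own pseudo-observations.
u <- rbicop(500, family = "clayton", rotation = 0, parameters = 3)

# Best parametric family by AIC, and the nonparametric TLL estimate.
fit_par <- bicop(u, family_set = "parametric", selcrit = "aic")
fit_tll <- bicop(u, family_set = "tll")

# Contour plots of both fits. If the TLL contours show the same
# pointed corner as the parametric ones, the tail dependence is
# likely a feature of the data rather than of the chosen family.
contour(fit_par)
contour(fit_tll)

# Numerical criteria for reference.
c(parametric = AIC(fit_par), tll = AIC(fit_tll))
```

Because TLL does not impose a parametric shape, agreement between the two plots is a useful sanity check on the selected family.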

chienyutseng commented 6 months ago

Thanks a lot!