jeremylhour / CIC-asymptotics

Codes for Change-in-change Asymptotics project

Simulations results: good, bad, ugly #5

Open jeremylhour opened 3 years ago

jeremylhour commented 3 years ago

Configuration:

nb_simu: 1000 # Nb. of simulations
sample_size: 10000 # Sample size
lambda_x: .9 # Parameter of exponential distribution for X
lambda_z: 1 # Parameter of exponential distribution for Z
alpha_y: 5 # Parameter of Pareto distribution for Y

b_2+d_2 = .3

Results: Theta_0: 2.22

bias:

MAE:

RMSE:

Coverage rate:
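For reference, here is a minimal sketch of how the four metrics listed above could be computed from the simulation draws. It assumes that MAE stands for mean absolute error and that the confidence intervals are symmetric normal intervals at the 95% level. The function name `summarize_simulations` and the toy draws are hypothetical and are not taken from the project code.

```python
import numpy as np

def summarize_simulations(estimates, std_errors, theta_0, z=1.96):
    """Summarize Monte Carlo draws of an estimator against the true value theta_0."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    errors = estimates - theta_0
    # Symmetric normal confidence interval around each estimate (assumption)
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    return {
        "bias": errors.mean(),
        "MAE": np.abs(errors).mean(),
        "RMSE": np.sqrt((errors ** 2).mean()),
        "coverage_rate": np.mean((lower <= theta_0) & (theta_0 <= upper)),
    }

# Toy usage with made-up draws, only to show the call
rng = np.random.default_rng(0)
theta_0 = 2.22
fake_estimates = theta_0 + rng.normal(scale=0.1, size=1000)
fake_std_errors = np.full(1000, 0.1)
summary = summarize_simulations(fake_estimates, fake_std_errors, theta_0)
print(summary)
```

With a well-calibrated standard error, the coverage rate should sit close to .95, which is the benchmark used in the rest of this thread.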

jeremylhour commented 3 years ago

On the other hand, with a sample size of 1000 there is no coverage-rate problem at all...

Configuration:

nb_simu: 1000 # Nb. of simulations
sample_size: 1000 # Sample size
lambda_x: .9 # Parameter of exponential distribution for X
lambda_z: 1 # Parameter of exponential distribution for Z
alpha_y: 5 # Parameter of Pareto distribution for Y

b_2+d_2 = .3

Results: Theta_0: 2.22

bias:

MAE:

RMSE:

Coverage rate:

jeremylhour commented 3 years ago

Configuration:

nb_simu: 1000 # Nb. of simulations
sample_size: 100 # Sample size
lambda_x: .9 # Parameter of exponential distribution for X
lambda_z: 1 # Parameter of exponential distribution for Z
alpha_y: 5 # Parameter of Pareto distribution for Y

b_2+d_2 = .3

Results: Theta_0: 2.22

bias:

MAE:

RMSE:

Coverage rate:

jeremylhour commented 3 years ago

Impression

jeremylhour commented 3 years ago

EDIT: THESE RESULTS ARE NOT CORRECT (PROBLEM WITH THE PARAMETRIZATION OF THE EXPONENTIAL DISTRIBUTION).
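The thread does not say exactly what the parametrization problem was. One common pitfall, shown here purely as an assumption, is the rate versus scale convention: NumPy's exponential sampler takes the scale (the mean, 1/lambda) rather than the rate lambda. The sketch below shows how far apart the two readings of `lambda_x` are.

```python
import numpy as np

rng = np.random.default_rng(0)
lambda_x = 0.9  # value used in the earlier configs of this thread

# numpy's exponential sampler is parametrized by the SCALE (mean), not the rate.
draws_rate = rng.exponential(scale=1.0 / lambda_x, size=200_000)  # lambda_x read as a rate
draws_scale = rng.exponential(scale=lambda_x, size=200_000)       # lambda_x read as a scale

mean_rate = draws_rate.mean()    # close to 1 / lambda_x
mean_scale = draws_scale.mean()  # close to lambda_x
print(mean_rate, mean_scale)
```

Whichever convention the project intends, checking the sample mean against the theoretical one is a cheap sanity test before launching a long run.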

Config used:

nb_simu: 10000 # Nb. of simulations
sample_size: [100, 200, 500, 1000] # Sample size, can be an array of multiple values
lambda_x: [.4, .5, .6, .7, .8] # Parameter of exponential distribution for X
lambda_z: [1] # Parameter of exponential distribution for Z
alpha_y: [2, 3, 4, 5, 8] # Parameter of Pareto distribution for Y

Runtime: about 2h on SSPdatacould

CICAsymptotics_HugeMCTable.pdf

Xavier's comments:

jeremylhour commented 3 years ago

New results:

Config used:

nb_simu: 10000 # Nb. of simulations
sample_size: [100, 200, 500, 1000] # Sample size, can be an array of multiple values
lambda_x: [.2, .3, .5, .8, .9] # Parameter of exponential distribution for X
lambda_z: [1] # Parameter of exponential distribution for Z
alpha_y: [1.5, 2, 3, 4, 6, 7, 10] # Parameter of Pareto distribution for Y
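To give an idea of the size of this experiment, here is a rough sketch of how the array-valued config could be expanded into individual designs. The values are copied from the config above, but the dict layout and the helper `expand_grid` are assumptions and may differ from what the project code actually does.

```python
from itertools import product

# Values copied from the config above; the dict layout itself is an assumption.
config = {
    "sample_size": [100, 200, 500, 1000],
    "lambda_x": [.2, .3, .5, .8, .9],
    "lambda_z": [1],
    "alpha_y": [1.5, 2, 3, 4, 6, 7, 10],
}

def expand_grid(config):
    """Return one dict per combination of the list-valued parameters."""
    keys = list(config)
    return [dict(zip(keys, values)) for values in product(*(config[k] for k in keys))]

grid = expand_grid(config)
print(len(grid))  # 4 * 5 * 1 * 7 = 140 designs, each repeated nb_simu times
```

With nb_simu set to 10000, that amounts to 1.4 million simulated samples, which helps explain the long runtimes reported in this thread.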

CICAsymptotics_HugeMCTable.pdf

Xavier's comments: this is all good news. Your results may be less "exciting", as you put it, but we do understand them much better!

jeremylhour commented 3 years ago

nb_simu: 10000 # Nb. of simulations
sample_size: [100, 500, 1000, 2000] # Sample size, can be an array of multiple values
lambda_x: [.2, .3, .5, .8, .9] # Parameter of exponential distribution for X
lambda_z: [1] # Parameter of exponential distribution for Z
alpha_y: [1.5, 2, 3, 4, 6, 7, 10] # Parameter of Pareto distribution for Y

CICAsymptotics_HugeMCTable.pdf

jeremylhour commented 3 years ago

Same as before, but with the variance estimator computed à la Lewbel and Schennach.

CIC_asymptotics.pdf

jeremylhour commented 3 years ago

nb_simu: 10000 # Nb. of simulations
sample_size: [100, 500, 1000, 2000] # Sample size, can be an array of multiple values
lambda_x: [.2, .3, .5, .8, .9] # Parameter of exponential distribution for X
lambda_z: [1] # Parameter of exponential distribution for Z
alpha_y: [1.5, 2, 3, 4, 6, 7, 10] # Parameter of Pareto distribution for Y

Estimators:

Reminder: the CDF of Z is used to compute \hat U = \hat F_Z(X). The estimator of f_Y(.) only enters the computation of the variance.

Note: the code took about 1.5 days to run.
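As an illustration of the step \hat U = \hat F_Z(X), here is a minimal sketch that uses the plain empirical CDF of Z. The project compares several estimators, so this is only one possible choice. The helper `empirical_cdf` is hypothetical, and reading lambda_x and lambda_z as rates is an assumption.

```python
import numpy as np

def empirical_cdf(sample):
    """Return a function evaluating the empirical CDF of `sample`."""
    sorted_sample = np.sort(np.asarray(sample, dtype=float))
    n = sorted_sample.size

    def cdf(points):
        # side="right" counts the observations that are <= each point
        return np.searchsorted(sorted_sample, points, side="right") / n

    return cdf

rng = np.random.default_rng(0)
z = rng.exponential(scale=1.0, size=1000)        # lambda_z = 1
x = rng.exponential(scale=1.0 / 0.9, size=1000)  # lambda_x = .9, read as a rate (assumption)

u_hat = empirical_cdf(z)(x)  # \hat U = \hat F_Z(X), values in [0, 1]
```

Any X above the largest Z gets \hat U = 1 exactly, so the right tail of Y is where the density estimate used in the variance matters most.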

CICAsymptotics_HugeMCTable.pdf

jeremylhour commented 3 years ago

New results with two DGPs, now also reporting the length of the confidence intervals. I reduced the number of draws to 5000 and skipped sample sizes of 2000 to limit computation time.

1) The "usual" DGP. These are the same results as before, except that the length of the CIs is now shown. "smooth_kernel" clearly has a big problem, on all DGPs.

2) The new DGP with normal distributions. We have:

CICAsymptotics_HugeMCTable.pdf

jeremylhour commented 3 years ago

There was an overflow with the 'smooth_kernel' estimator, especially under the Exponential DGP. This is not a bug. The standard error is very large because of the 'inverse of the density' terms that enter it. With that DGP, some points are very extreme, so the density estimate gets very close to zero in the right tail, which becomes a problem when a \hat U is large and falls, e.g., between two extreme values.

Code pointers for the problem:
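The mechanism can be reproduced on a toy example. The sketch below uses a plain Gaussian kernel density estimate with a hypothetical bandwidth on a hand-made sample. It is not the 'smooth_kernel' code itself. It only shows how the inverse density explodes at a point sitting between two isolated extreme values.

```python
import numpy as np

def gaussian_kde_at(points, sample, bandwidth):
    """Plain Gaussian kernel density estimate evaluated at `points`."""
    points = np.atleast_1d(np.asarray(points, dtype=float))
    sample = np.asarray(sample, dtype=float)
    scaled = (points[:, None] - sample[None, :]) / bandwidth
    kernel = np.exp(-0.5 * scaled ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth

# Hand-made sample: a bulk of points plus two isolated extreme values
sample = np.concatenate([np.linspace(0.0, 2.0, 98), [20.0, 40.0]])
bandwidth = 0.5  # hypothetical bandwidth, chosen for illustration only

density_bulk = gaussian_kde_at(1.0, sample, bandwidth)[0]
density_gap = gaussian_kde_at(30.0, sample, bandwidth)[0]  # between the two extremes

inverse_bulk = 1.0 / density_bulk  # moderate
inverse_gap = 1.0 / density_gap    # astronomically large
print(inverse_bulk, inverse_gap)
```

With even more isolated points the estimated density underflows to exactly zero and the inverse becomes infinite, which is one way such a standard error can overflow.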

jeremylhour commented 3 years ago

We tried the following config for the Gaussian DGP:

nb_simu: 100 # Nb. of simulations
sample_size: [1000, 10000, 20000] # Sample size, can be an array of multiple values
mu_x: [0, 3, 5] # Mean of X (Gaussian distribution)
variance_x: [.5] # Variance of X (Gaussian distribution)

When mu_X = 5 and sample_size is 20000, the code crashes, apparently because of numerous overflows. Results for the other configurations are reported in the document. For mu_X = 3, the coverage rate seems to move in the right direction as the sample size increases, although getting close to .95 seems to require a lot of data. We also tried a sample size of 100,000, but even one iteration would not finish in a day.

DGP_Gaussian_Large_Sample.pdf

jeremylhour commented 3 years ago

heatmap

jeremylhour commented 3 years ago

heatmap