HighDimensionalEconLab / HMCExamples.jl

Verify the baseline figures output and code #166

Open jlperla opened 1 year ago

jlperla commented 1 year ago

Code review to look for errors, mismatched filenames, labels in wrong order, using the wrong experiments, etc.

donskerclass commented 1 year ago

Just from looking at the new figures (no code review yet, that will come next), things look mostly fine, but I have a few aesthetic comments.

I will get to other questions after also looking at the code.

donskerclass commented 1 year ago

Running convert_dynare_output.jl fails for me: "UndefVarError: include_vars not defined". Double-check line 65: should the argument just be vars? Ignore this if include_vars somehow gets defined globally elsewhere, though using a global for that seems like bad style for exactly this reason.
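As an illustration of what I think is going on (hypothetical names and structure, not a quote of the actual convert_dynare_output.jl code):

```julia
# Hypothetical sketch of the suspected bug, not the actual script.
using DataFrames

function convert_output(df, vars)
    # BUG pattern: referencing a global `include_vars` instead of the `vars` argument
    # throws `UndefVarError: include_vars not defined` unless some other script
    # happened to define `include_vars` globally first.
    # return select(df, include_vars)

    # Suspected fix: use the argument that was actually passed in.
    return select(df, vars)
end

df = DataFrame(α = [0.3], β_draw = [0.2], ρ = [0.9], extra = [1.0])
convert_output(df, [:α, :β_draw, :ρ])  # works without relying on global state
```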

jlperla commented 1 year ago

Don't run the code; it won't work without updated runs. I would just do a code review for now.

donskerclass commented 1 year ago

Well, in this case I checked the fix and ran it, and it does indeed appear to be an error.

It looks like you reorganized things and renamed them when splitting from convert_frequentist_output.jl. I think include_vars gets defined as a global in that script, so convert_dynare_output.jl may still run when that global is in memory, which it will be if the scripts are run in the order used in generate_paper_results.sh; it will even be correct for the RBC, but I think it will be incorrect for SGU.

Anyway, I committed the fix. For the other issues I will try to just check the code, since I guess you may have MCMC output that differs from what I downloaded from the AWS bucket, so I can't rely on the results being identical.

jlperla commented 1 year ago

Sounds good. But my suggestion is just not to run the code until we have the final version of the chains. There are things that changed that you wouldn't have on your computer.

When you have modified all the constants and done the code review, I will rerun absolutely everything (with Julia 1.9), then do the full execution of the scripts from the command-line shell script and fix any bugs.

donskerclass commented 1 year ago

Strictly speaking, for the RBC models, while we list the pseudotrue as 0.2, we actually simulate based on beta = 0.998, which gives beta_draw = 0.2004008. In Dynare, in contrast, we use the beta_draw parameterization directly, so it is exactly 0.2. Since for these models, unlike SGU, we use exactly the same generated data sets, this difference never affects any calculations. However, it does mean that the value reported in the tables is rounded.
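To spell out where 0.2004008 comes from, here is a quick sanity check, assuming the parameterization beta_draw = 100(1/beta - 1) (my reading of the numbers; adjust if the code uses a different mapping):

```julia
# Sketch of the beta <-> beta_draw arithmetic described above, assuming
# the mapping beta_draw = 100 * (1/beta - 1); not a quote of the actual code.
β = 0.998
β_draw = 100 * (1 / β - 1)          # ≈ 0.2004008, the proposed pseudotrue value
β_from_dynare = 1 / (1 + 0.2 / 100) # ≈ 0.998004, what beta_draw = 0.2 implies exactly
```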

I think this is something we can mostly ignore: the rounding error is small relative to sampling/estimation error and goes in the wrong direction to explain the slight bias in our mean estimates visible in the robustness plots (which is why I was investigating it). However, it does explain as much as a third of the (small) positive bias in the T=200 frequentist stats. In that case it is potentially meaningful, so I propose replacing https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/e7cb4d8e44c3cbd4a25865e13482e59ca51f2177/scripts/generate_paper_results/convert_frequentist_output.jl#L58 with

pseudotrues = Dict(:α => 0.3, :β_draw => 0.2004008, :ρ => 0.9)

and I guess we can do the same in the robustness figures by making the exact same replacement in the file below, though it will make them look a bit worse instead of better:

https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/e7cb4d8e44c3cbd4a25865e13482e59ca51f2177/scripts/generate_paper_results/rbc_robustness_figures.jl#L53

Update: made the changes in commit 3446ba3.

jlperla commented 1 year ago

Thanks @donskerclass, all looks good. I think it is totally fine for you to change all of those files as you see fit. Better if the pseudotrue is correct to more digits.

If you want to make file changes that even regenerate the data as well, that is fine with me; you just need to tell me what to rerun. Otherwise I suggest you just make whatever changes you wish as you go, and then I will run things at the end.

donskerclass commented 1 year ago

No need to regenerate the data so far: the change was to make evaluation consistent with the existing data rather than the data consistent with the evaluation. But I will continue to make changes and if anything requires regeneration I'll let you know.

donskerclass commented 1 year ago

The code for rbc_robustness_figures.jl looks fine to me, but for some reason when I run it the Dynare columns of the plots show up blank. My first guess would be that you renamed or moved some of the output files to make it work, but I ran convert_dynare_output.jl and it produced the files in the appropriate directories, which should be the ones read by that script, so whatever is causing the issue is not just a renamed file. If it runs fine on a fresh run for you, absolutely don't worry about it, but I'm flagging it here in case there is something to look at.

Everything else in the figures looks fine: I deleted or moved around a few excess legends to improve visibility, but that's an aesthetic choice we can modify later if needed.

jlperla commented 1 year ago

@donskerclass let's hold off on changing the figures too much if possible until you can run them yourself. You don't have the modified runs, so I don't think it is possible yet. After #168 you should be able to change whatever you want.

donskerclass commented 1 year ago

No problem, I wasn't planning on further figure changes.

Also, the Python files that generate the tables look fine to me, but they don't run on my machine because of whatever Python/pandas installation I have, which doesn't particularly bother me.

In terms of results, I checked the timings, and the 10-40x speedup (in ESS/sec) we claim for NUTS over particle MH on SGU2 has decreased to more like a 3-5x speedup. Weirdly, this is a mix of our latest particle run being faster and our latest HMC run being less efficient (faster in total time, but a smaller number of effective samples). Of course, the first-order reason to use HMC is quality, since those particle runs aren't producing trustworthy results, but this is a pretty big speed change. My guess is a mix of different machines, plus what was, in my experience, the major source of random variability across runs: the choice of step size in the adaptation period. This could be just the luck of the seed, or it could be due to regenerating the data and happening to get a data realization that calls for a slightly smaller step size. Either way, I don't think this is too bad, but we may want to shift toward emphasizing quality a bit more than speed.
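For reference, a minimal sketch of how I am computing the ESS/sec speedup ratio (illustrative placeholder numbers, not our actual timings):

```julia
# Illustrative only: the ESS-per-second speedup ratio discussed above.
# The ess_* and time_* values are hypothetical placeholders, not measured results.
ess_nuts, time_nuts = 1000.0, 500.0  # effective sample size and wall-clock seconds for NUTS
ess_pmh,  time_pmh  = 400.0, 800.0   # same for particle MH

ess_per_sec_nuts = ess_nuts / time_nuts
ess_per_sec_pmh  = ess_pmh / time_pmh
speedup = ess_per_sec_nuts / ess_per_sec_pmh  # = 4.0 in this toy example
```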

For the RBC second order, we get similar speed ratios of NUTS vs. particle MH as in the past version, but this comes from both being faster, maybe due to faster machines (or the luck of latency with cloud compute, etc.). I think ESS% got slightly worse for NUTS, so it isn't from better efficiency in this run.

Quality for everything else looks comparable, and the above performance issues seem within normal computational variation and not due to implementation or code issues to be addressed.

I think you can go ahead and run the experiments now if you need to. Anything we want to do with writeups and aesthetics can be modified after the draws. I am still not entirely sure about the RBC_SV settings, but that is the only one that could possibly need to change; if it can be separated out, we can keep updating it after the other runs are fully finalized. Its quality is good enough that a run at the current settings isn't a problem, we just may want to show different things about the model.