HighDimensionalEconLab / HMCExamples.jl

Verify the baseline figures output and code #166

Open jlperla opened 1 year ago

jlperla commented 1 year ago

Code review to look for errors, mismatched filenames, labels in wrong order, using the wrong experiments, etc.

donskerclass commented 1 year ago

Just from looking at the new figures (no code review yet, that will come next), things look mostly fine, but I have a few aesthetic comments.

I will get to other questions after also looking at the code.

donskerclass commented 1 year ago

Running convert_dynare_output.jl fails for me: "UndefVarError: include_vars not defined". Double-check line 65: should the argument just be vars? Ignore this if include_vars somehow gets defined globally elsewhere, though using a global for that seems like bad style for exactly this reason.
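As an illustration of what I think is going on (hypothetical names and structure, not a quote of the actual convert_dynare_output.jl code):

```julia
# Hypothetical sketch of the suspected bug, not the actual script.
using DataFrames

function convert_output(df, vars)
    # BUG pattern: referencing a global `include_vars` instead of the `vars` argument
    # throws `UndefVarError: include_vars not defined` unless some other script
    # happened to define `include_vars` globally first.
    # return select(df, include_vars)

    # Suspected fix: use the argument that was actually passed in.
    return select(df, vars)
end

df = DataFrame(α = [0.3], β_draw = [0.2], ρ = [0.9], extra = [1.0])
convert_output(df, [:α, :β_draw, :ρ])  # works without relying on global state
```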

jlperla commented 1 year ago

Don't run the code; it won't work without updated runs. I would just do a code review for now.

donskerclass commented 1 year ago

Well, in this case I checked the fix and ran it, and it does indeed appear to be an error.

It looks like you reorganized things and renamed them when splitting from convert_frequentist_output.jl. I think include_vars gets defined as a global in that script, so convert_dynare_output.jl may still run when that global is in memory, which it will be if the scripts are run in the order used in generate_paper_results.sh; it will even be correct for the RBC, but I think it will be incorrect for SGU.

Anyway, I committed the fix. For the other issues I will try to just check the code, since I guess you may have MCMC output that differs from what I downloaded from the AWS bucket, so I can't rely on the results being identical.

jlperla commented 1 year ago

Sounds good. But my suggestion is just not to run the code until we have the final version of the chains. There are things that changed that you wouldn't have on your computer.

When you have modified all the constants and done the code review, I will rerun absolutely everything (with Julia 1.9), then do the full execution of the scripts from the command-line shell script and fix any bugs.

donskerclass commented 1 year ago

Strictly speaking, for the RBC models, while we list the pseudotrue as 0.2, we actually simulate based on beta = 0.998, which gives beta_draw = 0.2004008. In Dynare, in contrast, we use the beta_draw parameterization directly, so it is exactly 0.2. Since for these models, unlike SGU, we use exactly the same generated data sets, this difference never affects any calculations. However, it does mean that the value reported in the tables is rounded.
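To spell out where 0.2004008 comes from, here is a quick sanity check, assuming the parameterization beta_draw = 100(1/beta - 1) (my reading of the numbers; adjust if the code uses a different mapping):

```julia
# Sketch of the beta <-> beta_draw arithmetic described above, assuming
# the mapping beta_draw = 100 * (1/beta - 1); not a quote of the actual code.
β = 0.998
β_draw = 100 * (1 / β - 1)          # ≈ 0.2004008, the proposed pseudotrue value
β_from_dynare = 1 / (1 + 0.2 / 100) # ≈ 0.998004, what beta_draw = 0.2 implies exactly
```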

I think this is something we can mostly ignore: the rounding error is small relative to sampling/estimation error and goes in the wrong direction to explain the slight bias in our mean estimates visible in the robustness plots (which is why I was investigating it). However, it does explain as much as a third of the (small) positive bias in the T=200 frequentist stats. In that case it is potentially meaningful, so I propose replacing https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/e7cb4d8e44c3cbd4a25865e13482e59ca51f2177/scripts/generate_paper_results/convert_frequentist_output.jl#L58 with

pseudotrues = Dict(:α => 0.3, :β_draw => 0.2004008, :ρ => 0.9)

and I guess we can do the same in the robustness figures by making the exact same replacement in the file below, though it will make them look a bit worse instead of better:

https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/e7cb4d8e44c3cbd4a25865e13482e59ca51f2177/scripts/generate_paper_results/rbc_robustness_figures.jl#L53

Update: made the changes in commit 3446ba3.

jlperla commented 1 year ago

Thanks @donskerclass, all looks good. I think it is totally fine for you to change all of those files as you see fit. Better if the pseudotrue is correct to more digits.

If you want to make file changes that even regenerate the data as well, that is fine with me; you just need to tell me what to rerun. Otherwise I suggest you just make whatever changes you wish as you go, and then I will run things at the end.

donskerclass commented 1 year ago

No need to regenerate the data so far: the change was to make evaluation consistent with the existing data rather than the data consistent with the evaluation. But I will continue to make changes and if anything requires regeneration I'll let you know.

donskerclass commented 1 year ago

The code for rbc_robustness_figures.jl looks fine to me, but for some reason when I run it the Dynare columns of the plots show up blank. My first guess would be that you renamed or moved some of the output files to make it work, but I ran convert_dynare_output.jl and it produced the files in the appropriate directories, which should be the ones read by that script, so whatever is causing the issue is not just a renamed file. If it runs fine on a fresh run for you, absolutely don't worry about it, but I'm flagging it here in case there is something to look at.

Everything else in the figures looks fine: I deleted or moved around a few excess legends to improve visibility, but that's an aesthetic choice we can modify later if needed.

jlperla commented 1 year ago

@donskerclass let's hold off on changing the figures too much if possible until you can run them yourself. You don't have the modified runs, so I don't think it is possible yet. After #168 you should be able to change whatever you want.

donskerclass commented 1 year ago

No problem, I wasn't planning on further figure changes.

Also, the Python files that generate the tables look fine to me, but they don't run on my machine because of whatever Python/pandas installation I have, which doesn't particularly bother me.

In terms of results, I checked the timings, and the 10-40x speedup (in ESS/sec) we claim for NUTS over particle MH on SGU2 has decreased to more like a 3-5x speedup. Weirdly, this is a mix of our latest particle run being faster and our latest HMC run being less efficient (faster in total time, but a smaller number of effective samples). Of course, the first-order reason to use HMC is quality, since those particle runs aren't producing trustworthy results, but this is a pretty big speed change. My guess is a mix of different machines, plus what was, in my experience, the major source of random variability across runs: the choice of step size in the adaptation period. This could be just the luck of the seed, or it could be due to regenerating the data and happening to get a data realization that calls for a slightly smaller step size. Either way, I don't think this is too bad, but we may want to shift toward emphasizing quality a bit more than speed.
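For reference, a minimal sketch of how I am computing the ESS/sec speedup ratio (illustrative placeholder numbers, not our actual timings):

```julia
# Illustrative only: the ESS-per-second speedup ratio discussed above.
# The ess_* and time_* values are hypothetical placeholders, not measured results.
ess_nuts, time_nuts = 1000.0, 500.0  # effective sample size and wall-clock seconds for NUTS
ess_pmh,  time_pmh  = 400.0, 800.0   # same for particle MH

ess_per_sec_nuts = ess_nuts / time_nuts
ess_per_sec_pmh  = ess_pmh / time_pmh
speedup = ess_per_sec_nuts / ess_per_sec_pmh  # = 4.0 in this toy example
```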

For the RBC second order, we get similar speed ratios of NUTS vs. particle MH as in the past version, but this comes from both being faster, maybe due to faster machines (or the luck of latency with cloud compute, etc.). I think ESS% got slightly worse for NUTS, so it isn't from better efficiency in this run.

Quality for everything else looks comparable, and the above performance issues seem within normal computational variation and not due to implementation or code issues to be addressed.

I think you can go ahead and run the experiments now if you need to. Anything we want to do with writeups and aesthetics can be modified after the draws. I am still not entirely sure about the RBC_SV settings, but that is the only one that could possibly need to change; if it can be separated out, we can keep updating it after the other runs are fully finalized. Its quality is good enough that a run at the current settings isn't a problem, we just may want to show different things about the model.