JMSLab / eventstudyr

Other
24 stars 1 forks source link

`Numerical optimization failed` for smoothest path in R4.3+ #41

Closed MosesStewart closed 9 months ago

MosesStewart commented 1 year ago

In the attached example code eventstudyr_issue.zip, the smooth path optimization is failing with the example data. Specifically, I see a message that says:

Error in FindCoeffs(res_order, coefficients, inv_covar, Wcritic, pN, order,  : 
  Numerical optimization failed when searching for the smoothest path. Please set 'Smpath' to FALSE.

I am using R 4.3.0 on Windows 11.

code is written below

main <- function() {
    estimates_ols <- EventStudy(
        estimator = "OLS",
        data = example_data,
        outcomevar = "y_jump_m", policyvar = "z",
        idvar      = "id",       timevar   = "t",
        controls = "x_r",
        post = 5, overidpost = 2,
        pre  = 0,  overidpre = 6
    )

    y_breaks = seq(-1, 2, 1)
    plts <- list()

    plts[["smooth_path"]] <- EventStudyPlot(estimates         = estimates_ols,
                                            pre_event_coeffs  = FALSE,
                                            post_event_coeffs = FALSE,
                                            ybreaks           = y_breaks,
                                            smpath            = TRUE) +
                                theme_plots()
}

main()
jmshapir commented 1 year ago

@MosesStewart thanks!

Would you have bandwidth to check whether this issue arises with earlier releases of eventstudyr, or if it's specific to the current release?

Knowing that may help isolate the cause.

jmshapir commented 1 year ago

@MosesStewart actually I had a different thought.

See below for my output when I run the snippet in R 4.2.2.

I don't seem (?) to hit an error in computing the smoothest path. (There is an error with theme_plots but I don't think that's related to eventstudyr.)

Do you want to see what happens if you rewind to R 4.2.2 and try this again? (It's possible that Anaconda might help you maintain multiple versions of R without having to uninstall.)

If at that point you don't get an error then we've likely isolated the issue to a change in base R. If you do get the error then something else must differ between our setups.

(If I have a chance I'll also try updating my R installation to see if I can reproduce the error that way.)

Thank you!

> source("eventstudyr_issue/evenstudyr_issue.r")

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

data.table 1.14.8 using 6 threads (see ?getDTthreads).  Latest news: r-datatable.com

Attaching package: ‘data.table’

The following objects are masked from ‘package:dplyr’:

    between, first, last

Smoothest path note: The lowest order such that a polynomial is in confidence region is 8.
Error in theme_plots() : could not find function "theme_plots"
In addition: Warning messages:
1: package ‘ggplot2’ was built under R version 4.2.3 
2: package ‘ggthemes’ was built under R version 4.2.3 
3: package ‘eventstudyr’ was built under R version 4.2.3 
MosesStewart commented 1 year ago

Do you want to see what happens if you rewind to R 4.2.2 and try this again? (It's possible that Anaconda might help you maintain multiple versions of R without having to uninstall.)

@jmshapir I was able to uninstall and reinstall R 4.2.3 and it worked! I think the function is only failing for R 4.3+. Thank you for the troubleshooting help!

jmshapir commented 1 year ago

Update: When time permits @MosesStewart is going to take a look at the source of the bug.

@MosesStewart when you're ready to start work here please let me know and I will elevate your permissions.

Thanks!

MosesStewart commented 1 year ago

@jmshapir I'm ready to start work here!

jmshapir commented 1 year ago

@MosesStewart you now have write permission to the repo. Use it wisely!

@rcalvo12 @nateschor @ew487 fyi

jmshapir commented 1 year ago

Per call: @MosesStewart will take the lead on debugging and will reach out to @nateschor for assistance if needed.

jmshapir commented 11 months ago

@cindyjulianelu once you accept the invite to join the repository please let me know and I will assign you here thanks!

cindyjulianelu commented 11 months ago

ok I'm in @jmshapir

SimonFreyaldenhoven commented 11 months ago

@nateschor would you still be able to help @cindyjulianelu if she has any questions and potentially review the fix she comes up in this issue?

jmshapir commented 11 months ago

ok I'm in @jmshapir

Welcome @cindyjulianelu!

Can you please review the documents here and pay special attention to to the storage principles?

Once you've reviewed these, please let me know and I can elevate your permissions to allow you to push to the repository.

Thanks!

@SimonFreyaldenhoven fyi

nateschor commented 11 months ago

@SimonFreyaldenhoven hi! Can @santiagohermo review instead (if you're able to) since Santiago wrote the smoothest path code and is more familiar with it?

jmshapir commented 11 months ago

I'm guessing @santiagohermo is tied up for at least the next several weeks. Happy to watch the thread here and do my best to answer questions if I can.

santiagohermo commented 11 months ago

Thanks @nateschor @jmshapir! Indeed, I have some heavy weeks ahead. If this is still open later on and I can be of help, happy to take a look then.

cindyjulianelu commented 11 months ago

Works for me! I'll be on the slow grind since I also suddenly got busy. Nice to e-meet you @santiagohermo

cindyjulianelu commented 11 months ago

Can't create a branch. It'll have a very similar name but with my initials at the end - Just wanted to start playing around on a new branch for cleanliness but my GitHub desktop is freezing up because I can't publish this branch

jmshapir commented 11 months ago

@cindyjulianelu thanks!

Can you please take a look at the principles here and confirm that these docs, and in particular the storage principles, are clear to you?

At that point I can elevate your permissions to allow you to push to the repo.

Thank you!

cindyjulianelu commented 11 months ago

Yea but I'm finding myself quite confused over expectations because I did delve through all the resources, even the long document about social science best practices. It took a while because I was also trying to match that file organization structure to this repo's but looks different. Also Simon told me at out meeting to just start on it as soon as possible with creating a branch.

jmshapir commented 11 months ago

@cindyjulianelu thanks for clarifying! Probably the most important things for you to understand are the storage principles here (including the warning at the bottom) and the workflow here.

If those are clear to you please let me know.

@SimonFreyaldenhoven fyi.

cindyjulianelu commented 11 months ago

Yes I've read these all.

@cindyjulianelu thanks for clarifying! Probably the most important things for you to understand are the storage principles here (including the warning at the bottom) and the workflow here.

If those are clear to you please let me know.

@SimonFreyaldenhoven fyi.

jmshapir commented 11 months ago

@cindyjulianelu your permissions have been elevated! Please use it wisely.

jmshapir commented 10 months ago

Per call, let's:

cindyjulianelu commented 9 months ago

A bit boggled but trying a few things right now as we speak - Pretty much explored increasing the tolerance for the Nelder Mead numeric optimization method in R. But the objective function value (R parameter) that's spit out from the optimization doesn't match the square of the last coefficient of the "par" vector (just a name R gives in its output, see below)

image

jmshapir commented 9 months ago

@cindyjulianelu thanks!

My suggestion would be to look a little more at what the solver is returning, ideally by making some of the plots we described in https://github.com/JMSLab/eventstudyr/issues/41#issuecomment-1775974617.

@SimonFreyaldenhoven fyi.

SimonFreyaldenhoven commented 9 months ago

Just to add to @cindyjulianelu's comment above.

1) We looked into the step-by-step optimization and, as expected, some of the convergence issues seems to arise because during the optimization we have large "offsetting" polynomial terms (e.g. large positive coefficient on ^6 and negative on ^7) that get incrementally updated during each step until a step either fails or the maximum iterations is reached.

2) Most of the convergence errors can be avoided by either a) increasing the tolerance or b) switching to a different solver, but the solutions don't always look very close across solvers

@cindyjulianelu is going to produce a few plot along the lines of this comment next

jmshapir commented 9 months ago

@SimonFreyaldenhoven thanks! That makes sense.

Would it help in a case like this just to set the default maximum polynomial order to something lower (e.g. 5)?

https://github.com/JMSLab/eventstudyr/blob/04691ee6beb5815114f0dabb285794e7291a3bf7/R/AddSmPath.R#L19

jmshapir commented 9 months ago

@SimonFreyaldenhoven another thought I have in this direction is that we could also set an adaptive default that, e.g., sets the maximum polynomial order to some given fraction of the length of the plotting window. That would account for the fact that the polynomial is likely to become naturally more complex with longer windows.

Let me know what you think thanks.

SimonFreyaldenhoven commented 9 months ago

@jmshapir Thanks!

Making the polynomial order a function of the horizon might be a bit awkward: Conceptually, I'd say we want (need) a higher order for the "same smoothness" as we extend the horizon. But computationally my guess is that the problem becomes harder for a fixed order polynomial as we increase the horizon.

Lower polynomials will indeed be easier. So limiting the max_order is one option. Regardless, updating the solver and increasing the tolerance should be helpful too, @cindyjulianelu should have a few numbers ready for us to look at before our meeting later.

cindyjulianelu commented 9 months ago

image [Uploading result_tbl_12-4-23.csv…]()

jmshapir commented 9 months ago

Per call, let's:

SimonFreyaldenhoven commented 9 months ago

@jmshapir @chansen776 @jorpppp

below are the 6 different runs from the table above, with the original on the LHS, and a reduced tolerance on the RHS.

(In the case of the first one - which returned an error before - @cindyjulianelu jsut grabbed the current value at the last iteration before the solver failed out with an error)

They look pretty similar, and a lower tolerance tends to be "better" in a numerical sense than the alternative algorihtms we tried.

So my vote is to just reduce the relative tolerance. @jmshapir @chansen776 @jorpppp let me know if that sounds good to you!

This tolerance is imposed on the square of the highest order polynomial coefficient. In the figure below, we went from 1e-8 to 1e-3. @cindyjulianelu also tried 1e-4, which also converges everywhere and looks pretty much identical to the RHS shown below. Either is fine with me. Perhaps 1e-4 is "nice" because it effectively states that we go until the improvement in the highest order coefficient is less than 1 percent.

image image

@cindyjulianelu Feel free to chime in if I misrepresented any of your results above!

cindyjulianelu commented 9 months ago

All good @SimonFreyaldenhoven and hope one of these solutions is chosen

jmshapir commented 9 months ago

@SimonFreyaldenhoven thanks!

I think you mean increase rather than reduce the (magnitude of) the default tolerance?

In any case I am fine with changing it and agree that 1e-4 sounds like a good benchmark.

@SimonFreyaldenhoven can you make the change and then open a pull request here with @jmshapir as the reviewer?

SimonFreyaldenhoven commented 9 months ago

@jmshapir will do!and yes, I of course meant increase!

@cindyjulianelu On a quick glance, it looks like this would only involve changing lines 117 and 124 in SmPathHelpers.R, also see below: image

Does that sound right to you?

cindyjulianelu commented 9 months ago

Yes that's right @Simon @.***> Get Outlook for iOShttps://aka.ms/o0ukef


From: Simon Freyaldenhoven @.> Sent: Friday, December 15, 2023 4:34:55 PM To: JMSLab/eventstudyr @.> Cc: Cindy Lu @.>; Mention @.> Subject: Re: [JMSLab/eventstudyr] Numerical optimization failed for smoothest path in R4.3+ (Issue #41)

@jmshapirhttps://urldefense.com/v3/__https://github.com/jmshapir__;!!Dq0X2DkFhyF93HkjWTBQKhk!Xm7ZeMlkejGOcDBKEqOps_pCJGZkbWTU1PMEyRzs1RwdqCvsG8f2sTUZv4kE9XNudKcP8mj0Kv75UgW7cNHmvLsF_6S8yS6lzmE4Fw$ will do!and yes, I of course meant increase!

@cindyjulianeluhttps://urldefense.com/v3/__https://github.com/cindyjulianelu__;!!Dq0X2DkFhyF93HkjWTBQKhk!Xm7ZeMlkejGOcDBKEqOps_pCJGZkbWTU1PMEyRzs1RwdqCvsG8f2sTUZv4kE9XNudKcP8mj0Kv75UgW7cNHmvLsF_6S8yS7ZbC66Tg$ On a quick glance, it looks like this would only involve changing lines 117 and 124 in SmPathHelpers.Rhttps://urldefense.com/v3/__https://github.com/JMSLab/eventstudyr/blob/41-smoothest-path-r-4-3-1/R/SmPathHelpers.R__;!!Dq0X2DkFhyF93HkjWTBQKhk!Xm7ZeMlkejGOcDBKEqOps_pCJGZkbWTU1PMEyRzs1RwdqCvsG8f2sTUZv4kE9XNudKcP8mj0Kv75UgW7cNHmvLsF_6S8yS5-vLBjig$, also see below: image.png (view on web)https://urldefense.com/v3/__https://github.com/JMSLab/eventstudyr/assets/20975521/a67f9ba3-9c73-44e8-ab67-51798fe6a22b__;!!Dq0X2DkFhyF93HkjWTBQKhk!Xm7ZeMlkejGOcDBKEqOps_pCJGZkbWTU1PMEyRzs1RwdqCvsG8f2sTUZv4kE9XNudKcP8mj0Kv75UgW7cNHmvLsF_6S8yS6btRviCg$

Does that sound right to you?

— Reply to this email directly, view it on GitHubhttps://urldefense.com/v3/__https://github.com/JMSLab/eventstudyr/issues/41*issuecomment-1858516212__;Iw!!Dq0X2DkFhyF93HkjWTBQKhk!Xm7ZeMlkejGOcDBKEqOps_pCJGZkbWTU1PMEyRzs1RwdqCvsG8f2sTUZv4kE9XNudKcP8mj0Kv75UgW7cNHmvLsF_6S8yS5uEvieaQ$, or unsubscribehttps://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/AE5W6IDQ2DMMQBFCV4HP2DTYJS673AVCNFSM6AAAAAAZPM7CQOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNJYGUYTMMRRGI__;!!Dq0X2DkFhyF93HkjWTBQKhk!Xm7ZeMlkejGOcDBKEqOps_pCJGZkbWTU1PMEyRzs1RwdqCvsG8f2sTUZv4kE9XNudKcP8mj0Kv75UgW7cNHmvLsF_6S8yS7rSF_O-g$. You are receiving this because you were mentioned.Message ID: @.***>

jmshapir commented 9 months ago

Summary: In this issue we explored and resolved a failure of numerical optimization in one of our example cases. The resolution was to loosen the default tolerance on the solver minimand per https://github.com/JMSLab/eventstudyr/issues/41#issuecomment-1856188396,

For details see issue subfolder for merged issue branch and issue subfolder for exploratory issue branch.

Changes brought to main in e0df193.