gphk-metrics / stata-multe

Multiple Treatment Effects
17 stars 4 forks source link

Package Speedup #8

Closed mcaceresb closed 2 years ago

mcaceresb commented 2 years ago

@peterdhull @kolesarm @paulgp Hi all! We have merged #4 to main so I am tagging everyone in this issue to test out the package. We can collect feedback here before publishing.

Once we have dealt with the above two as well as any feedback that comes up through package testing, we can open a follow-up issue to discuss the package description for the SSC submission as well as the text of the statalist announcement.

@jerrayc fyi

mcaceresb commented 2 years ago

@peterdhull For the star data, at the moment we have both the file as part of the package and a link to the repo. Per this comment the plan is to delete the link and just leave the file as part of the package unless you say otherwise in this issue. Note that the file download is currently optional (Stata packages don't install .dta files by default); let me know if you'd like to force the data installation along with the package.

peterdhull commented 2 years ago

thanks @mcaceresb. That sounds good. Let's not force the data installation but let's make sure to note in the help file that folks need to ", all" to get the data in the example

mcaceresb commented 2 years ago

@peterdhull Per our convo, I only took a quick pass at speeding up the package. I was not very successful.

However, I did get one idea: The bulk of the runtime is taken by the decomposition. One way to speed this up would be to move the decomposition to post-estimation, so the main function runs faster, and then if the decomposition is requested the user can wait around for it. LMK what you think of this idea. Barring that I don't think there are easy ways to speed up the runtime.

kolesarm commented 2 years ago

I think moving it to post-estimation is a good solution---one may not care about the decomposition if the initial results don't look good

paulgp commented 2 years ago

Agreed. Great idea.

On Mon, May 16 2022 at 5:34 PM, Michal Kolesar @.***> wrote:

I think moving it to post-estimation is a good solution---one may not care about the decomposition if the initial results don't look good

— Reply to this email directly, view it on GitHub https://github.com/gphk-metrics/stata-multe/issues/8#issuecomment-1128160316, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABDTDFWVVTHPLZPX5HKF2IDVKK5OPANCNFSM5UTMRCOQ . You are receiving this because you were mentioned.Message ID: @.***>

peterdhull commented 2 years ago

Sounds good!

On Mon, May 16, 2022, 20:32 Paul Goldsmith-Pinkham @.***> wrote:

Agreed. Great idea.

On Mon, May 16 2022 at 5:34 PM, Michal Kolesar @.***> wrote:

I think moving it to post-estimation is a good solution---one may not care about the decomposition if the initial results don't look good

— Reply to this email directly, view it on GitHub < https://github.com/gphk-metrics/stata-multe/issues/8#issuecomment-1128160316 , or unsubscribe < https://github.com/notifications/unsubscribe-auth/ABDTDFWVVTHPLZPX5HKF2IDVKK5OPANCNFSM5UTMRCOQ

. You are receiving this because you were mentioned.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/gphk-metrics/stata-multe/issues/8#issuecomment-1128273175, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFFQXLWO5KKRQIEBDFNGKPDVKLSLFANCNFSM5UTMRCOQ . You are receiving this because you were mentioned.Message ID: @.***>

mcaceresb commented 2 years ago

Continued in #9.

mcaceresb commented 2 years ago

In this issue, we explored ways to speed up the package. Other than light improvements we did not find a way to substantially accelerate the computations.

However, we moved the contamination bias decomposition to be computed separately from the main estimation. The decomposition took up the bulk of the runtime, so the main function runs faster when this is omitted. The decomposition can be computed on the fly after estimation or it can be requested before hand. This is documented in the helpfile. (lambda, tau can also be computed on the fly after the main estimation runs.)