r-lib / later

Schedule an R function or formula to run after a specified period of time.
https://r-lib.github.io/later
Other
137 stars 27 forks source link

Peformance: dependency on rlang #186

Closed shikokuchuo closed 4 months ago

shikokuchuo commented 4 months ago

Would removing the dependency on rlang be a consideration?

I am adding later as a dependency for nanonext (and hence mirai), and the loading of rlang seems to add up to 100 milliseconds to the startup time of new background processes, with later itself contributing a much lower overhead of about 50 milliseconds. This adds undesirable latency in situations where new single-use processes are generated on the fly.

rlang is currently only used to handle weak references and for it’s rlang::as_function() to translate formula-style functions. It is trivial to re-implement the weakref functions using R’s C interface, and I can either port across or look to re-implement as_function().

I am happy to provide PRs if this approach is agreeable.

wch commented 4 months ago

This sounds interesting. I agree that reimplementing the weakref stuff in C will be simple. However, I suspect that reimplementing rlang::as_function() in a way that is equivalent to rlang's version will be very complicated.

Also, many R packages use rlang, so in practice most R users will still spend time loading rlang.

Here is another possibility: Have you considered working on improving the loading time of later and rlang? Many, many users could benefit from improvements there.

shikokuchuo commented 4 months ago

Thanks. I appreciate that rlang is popular, but mirai is also used in a lot of contexts where this isn't the case and as we are talking about new background processes, any packages loaded represent pure overhead.

I had assumed that the loading time was simply due to loading of the shared object, but I could be wrong. I can certainly look into this.

In any case, I'm glad you're open to accepting performance improvements. I shall be sure to post PRs when I find them. Thank you.

wch commented 4 months ago

I understand that you want to speed things up for mirai and that makes sense to me. I was just pointing out that many more people use later (and rlang) in other contexts. The reason that the later package was created to provide an event loop for shiny and httpuv, and those packages are our main priority. (You can use https://hadley.shinyapps.io/cran-downloads/ to see numbers.)

So, with all of that said, I think there is an easy way to avoid loading rlang. The rlang package is listed in Imports, but in the NAMESPACE file, there is no import(rlang) or importFrom(rlang, ...). That means that the rlang package will not actually be loaded until it is used via rlang::.

There are two places where rlang:: is used:

With these changes, rlang would be loaded only if someone passed in a formula or quosure to later().

shikokuchuo commented 4 months ago

Yes, understood, and I'm using later for completely event-driven promises in mirai to support Shiny (implementing an idea from Joe Cheng).

Your idea is fantastic though and I think would definitely work! Thanks so much for the suggestion. I'll prepare a PR for your review.

shikokuchuo commented 4 months ago

Hi @wch the PR implementing your suggestions is ready for review.

I've tested with nanonext / mirai and confirm that the rlang namespace is no longer automatically loaded, nor the shared object. The startup times for my new processes are consequently reduced.

Thank you again for finding such an easy solution.