HenrikBengtsson / parallelly

R package: parallelly - Enhancing the 'parallel' Package
https://parallelly.futureverse.org

makeClusterPSOCK(): Use faster useXDR=FALSE by default #27

Closed: HenrikBengtsson closed this issue 3 years ago

HenrikBengtsson commented 3 years ago

Serialization using xdr = FALSE can be significantly faster.

  1. Sketch an outline of how to benchmark it for PSOCK workers.

  2. Sketch how to test for it, e.g. pass a non-XDR serialized object to the newly opened worker and have it confirm whether or not it can parse the content.

    • For localhost R nodes with the same R installation, we can test for this without launching the worker.

  3. Automate XDR if useXDR=NA. For example, use the above algorithm to test. If the node supports it, shut down and recreate it with useXDR=FALSE.
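A minimal sketch of steps 1 and 2, using only base R's serialize() and unserialize() on the local R installation. The helper names can_parse_native(), time_serialize() and bench_psock() are hypothetical and not part of parallelly. bench_psock() assumes that a local PSOCK worker can be launched via parallel::makePSOCKcluster(), so it is only defined here, not called.

```r
# Step 2 (localhost case): check that this R installation can parse a
# non-XDR (native byte order) serialized object, without launching a worker.
can_parse_native <- function(obj = list(a = 1:3, b = "text", c = pi)) {
  bytes <- serialize(obj, connection = NULL, xdr = FALSE)
  res <- tryCatch(unserialize(bytes), error = function(e) NULL)
  identical(res, obj)
}

# Step 1 (in-process part): time repeated serialization with or without XDR.
time_serialize <- function(obj, xdr, times = 20L) {
  t <- system.time(
    for (i in seq_len(times)) serialize(obj, connection = NULL, xdr = xdr)
  )
  t[["elapsed"]]
}

x <- rnorm(1e6)
timings <- c(
  xdr    = time_serialize(x, xdr = TRUE),
  native = time_serialize(x, xdr = FALSE)
)
print(timings)
print(can_parse_native())

# Step 1 (worker part): hypothetical helper that times sending 'obj' to a
# single local PSOCK worker and back. Assumes a worker can be launched.
bench_psock <- function(useXDR, obj, times = 5L) {
  cl <- parallel::makePSOCKcluster(1L, useXDR = useXDR)
  on.exit(parallel::stopCluster(cl))
  t <- system.time(
    for (i in seq_len(times)) parallel::clusterCall(cl, identity, obj)
  )
  t[["elapsed"]]
}
```

To compare the two settings end to end, one could call bench_psock(TRUE, x) and bench_psock(FALSE, x). The timings depend on the machine and the object size, so none are quoted here.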

HenrikBengtsson commented 3 years ago

Keep the default, but try to add support for useXDR=NA for localhost cluster nodes. When in doubt, use the current default, i.e. useXDR=TRUE.

HenrikBengtsson commented 3 years ago

I'm probably overthinking it. useXDR=FALSE should be supported by all R architectures. The key here is that most systems these days are little endian, whereas XDR is big endian, so useXDR=FALSE avoids byte shuffling. Since big-endian systems are rare, FALSE is the more sensible default.
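To make the byte-order argument concrete, here is a small sketch using base R only. It shows the host's endianness, the two possible byte layouts of the integer 1, and the two-byte header that serialize() writes for each format. The variable names are illustrative only.

```r
# Byte order of the machine running R: "little" or "big".
host_endian <- .Platform$endian
print(host_endian)

# The 4-byte integer 1 in each byte order.
int_big    <- writeBin(1L, raw(), endian = "big")     # 00 00 00 01
int_little <- writeBin(1L, raw(), endian = "little")  # 01 00 00 00
print(int_big)
print(int_little)

# serialize() marks the format in its first two bytes:
# "X\n" for XDR (big endian), "B\n" for native binary.
header_xdr    <- serialize(1L, connection = NULL, xdr = TRUE)[1:2]
header_native <- serialize(1L, connection = NULL, xdr = FALSE)[1:2]
print(rawToChar(header_xdr))
print(rawToChar(header_native))

# Both formats deserialize to the same value on the same machine.
same_value <- identical(
  unserialize(serialize(1L, connection = NULL, xdr = TRUE)),
  unserialize(serialize(1L, connection = NULL, xdr = FALSE))
)
print(same_value)
```

On a little-endian host, the XDR format therefore requires swapping bytes on both the sending and the receiving side, which the native format skips.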

HenrikBengtsson commented 3 years ago

Will hold back on this until the next-next release, which will become something like 1.22.0. At that point, we'll have future 1.20.0 on CRAN, which will depend on 'parallelly'. In turn, that'll mean I'll be able to run 1st and 2nd generation revdepcheck on 'parallelly' and get a lot of packages (~140 on CRAN and Bioc) to validate against.

HenrikBengtsson commented 3 years ago

As a first step, I've updated the default to be useXDR = getOptionOrEnvVar("future.makeNodePSOCK.useXDR", TRUE) so that it can be set via options {future,parallelly}.makeNodePSOCK.useXDR or environment variables R_{FUTURE,PARALLELLY}_MAKENODEPSOCK_USEXDR.
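As an illustration of the lookup order described above, here is a hypothetical helper that checks an R option first, then the matching environment variable, then a fallback. It is a sketch under that assumption, not parallelly's actual getOptionOrEnvVar() implementation. The option and environment variable names are the ones given in the comment above.

```r
# Hypothetical stand-in for an option-or-environment-variable lookup.
# Maps option "a.b.c" to environment variable "R_A_B_C".
get_option_or_envvar <- function(name, default = NULL) {
  value <- getOption(name)
  if (!is.null(value)) return(value)

  envvar <- paste0("R_", toupper(gsub(".", "_", name, fixed = TRUE)))
  value <- Sys.getenv(envvar, unset = NA_character_)
  if (!is.na(value)) return(as.logical(value))

  default
}

opt_name <- "parallelly.makeNodePSOCK.useXDR"

# Neither the option nor the environment variable is set: the fallback wins.
options(parallelly.makeNodePSOCK.useXDR = NULL)
Sys.unsetenv("R_PARALLELLY_MAKENODEPSOCK_USEXDR")
from_default <- get_option_or_envvar(opt_name, TRUE)

# Only the environment variable is set.
Sys.setenv(R_PARALLELLY_MAKENODEPSOCK_USEXDR = "FALSE")
from_envvar <- get_option_or_envvar(opt_name, TRUE)

# The option is set too, and takes precedence in this sketch.
options(parallelly.makeNodePSOCK.useXDR = TRUE)
from_option <- get_option_or_envvar(opt_name, FALSE)

# Clean up so the session is left as it was found.
options(parallelly.makeNodePSOCK.useXDR = NULL)
Sys.unsetenv("R_PARALLELLY_MAKENODEPSOCK_USEXDR")

print(c(default = from_default, envvar = from_envvar, option = from_option))
```

Letting the option take precedence over the environment variable is a design choice of this sketch. The environment variable is mainly useful for changing the setting in batch jobs or revdep checks without editing any R code.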

I'll run full parallelly+future revdep checks before I decide whether or not to switch the fallback default from TRUE to FALSE for the next release.

HenrikBengtsson commented 3 years ago

I've switched the default to useXDR = FALSE.