r-devel / r-project-sprint-2023

Material for the R project sprint
https://contributor.r-project.org/r-project-sprint-2023/

Optimize scalar RNG #67

Closed: gmbecker closed this issue 11 months ago

gmbecker commented 1 year ago

Endorsed by Luke

Generating scalar random numbers is very slow because the seed is reallocated after each R-level call. It should be possible to avoid this.

aitap commented 1 year ago

Do I understand it correctly that the idea is to modify PutRNGState as follows?

  1. Look up .Random.seed
  2. If it's already an INTSXP of the right length and not shared, use the existing data structure and prepare to overwrite its contents
  3. Otherwise allocate the R-level variable anew, as was done previously

Or would it be better to make it an ALTREP that references the contents of RNG_Table[RNG_kind] without copying?
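
To make the three steps above concrete, here is a minimal, self-contained sketch in plain C. It does not use R's headers or API: `IntVec`, `intvec_alloc`, `intvec_release` and `put_rng_state` are all hypothetical names invented for illustration, with `IntVec` standing in for an INTSXP and its `refs` field standing in for whatever "not shared" check R would actually use. Treat it as a model of the idea, not as the real PutRNGState.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an R integer vector (INTSXP); NOT R's real SEXP layout. */
typedef struct {
    int    *data;
    size_t  len;
    int     refs;   /* reference count; more than 1 means "shared" */
} IntVec;

/* Allocate a zero-filled vector holding one reference (hypothetical helper). */
IntVec *intvec_alloc(size_t len) {
    IntVec *v = malloc(sizeof *v);
    if (!v) return NULL;
    v->data = calloc(len ? len : 1, sizeof *v->data);
    if (!v->data) { free(v); return NULL; }
    v->len = len;
    v->refs = 1;
    return v;
}

/* Drop one reference and free the vector when none remain. */
void intvec_release(IntVec *v) {
    if (v && --v->refs == 0) { free(v->data); free(v); }
}

/* Hypothetical model of the proposed logic.
 * *slot plays the role of the .Random.seed binding (NULL means unbound).
 * Returns 1 if the existing vector was overwritten in place,
 *         0 if a fresh vector had to be allocated,
 *        -1 if allocation failed. */
int put_rng_state(IntVec **slot, const int *seed, size_t n) {
    assert(slot != NULL);
    IntVec *cur = *slot;                      /* step 1: look up the binding */

    /* step 2: right length and not shared, so reuse the storage */
    if (cur && cur->len == n && cur->refs == 1) {
        memcpy(cur->data, seed, n * sizeof *seed);
        return 1;
    }

    /* step 3: fall back to allocating a new vector, as before */
    IntVec *fresh = intvec_alloc(n);
    if (!fresh) return -1;
    memcpy(fresh->data, seed, n * sizeof *seed);
    intvec_release(cur);                      /* binding lets go of the old value */
    *slot = fresh;
    return 0;
}
```

In this model the first call on an unbound slot allocates, later calls overwrite in place, and a call made while someone else holds a reference to the old vector allocates again, so that holder's copy is left untouched.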

ltierney commented 1 year ago

The options I can think of are

Some things to look for:

gmbecker commented 1 year ago

Currently a C-side SEXP copy of .Random.seed is generated and then assigned to the R side within PutRNGState. This isn't really needed, provided that the C-side array of ints becomes the source of truth and is copied to .Random.seed only in the rare cases when it's necessary.

Remember, .Random.seed can be unbound, so we need the "old way" to run once, then we update RNG_Table.i_seed.

This means that if things are aborted in the middle, the old .Random.seed value will no longer be preserved.
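
The "C-side array as source of truth" idea can be modelled with a small standalone C sketch. Everything here is an assumption made for illustration: `LazyRng`, `lazy_init`, `lazy_draw`, `lazy_sync` and `SEED_LEN` are hypothetical names, the generator is a toy linear congruential step rather than any of R's generators, and `r_copy` merely stands in for the R-level .Random.seed vector.

```c
#include <stdint.h>
#include <string.h>

#define SEED_LEN 2   /* toy length; R's real seed vectors are longer */

typedef struct {
    uint32_t      c_seed[SEED_LEN]; /* C-side state: the source of truth */
    uint32_t      r_copy[SEED_LEN]; /* stand-in for R-level .Random.seed */
    int           dirty;            /* 1 if c_seed is ahead of r_copy */
    unsigned long copies;           /* how many full copies were made */
} LazyRng;

/* The "old way" runs once: seed the C side and copy it out eagerly. */
void lazy_init(LazyRng *g, uint32_t seed) {
    g->c_seed[0] = seed;
    g->c_seed[1] = 0;               /* draw counter, for illustration only */
    memcpy(g->r_copy, g->c_seed, sizeof g->c_seed);
    g->dirty = 0;
    g->copies = 1;
}

/* A scalar draw touches only the C-side state; nothing is copied. */
uint32_t lazy_draw(LazyRng *g) {
    g->c_seed[0] = g->c_seed[0] * 1664525u + 1013904223u; /* toy LCG step */
    g->c_seed[1] += 1;
    g->dirty = 1;
    return g->c_seed[0];
}

/* Copy to the R-level stand-in only when it is actually needed. */
void lazy_sync(LazyRng *g) {
    if (!g->dirty) return;
    memcpy(g->r_copy, g->c_seed, sizeof g->c_seed);
    g->dirty = 0;
    g->copies += 1;
}
```

In this model a thousand calls to `lazy_draw` cost no copies at all, and a single `lazy_sync` brings `r_copy` up to date. Between syncs `r_copy` is stale, which is the same trade-off as the abort concern raised above: the C-side state has already advanced and there is no untouched earlier value to fall back on.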

ltierney commented 11 months ago

I've opened this on bugzilla at https://bugs.r-project.org/show_bug.cgi?id=18600