Closed wlandau closed 6 years ago
Hi @wlandau-lilly
Sorry about this, and thanks for the great reprex. The scratch
changes were working around another, more insidious race condition (if tempdir()
and the storr are on different filesystems - which is the case on many systems then write-copy pattern is nonatomic so two processes can copy to the same file with unpleasant results). I had a shot at fixing is on the rds_separate_scratch
branch but that doesn't fix your case. I have another idea I'll try and implement tonight
Can you please test: https://github.com/richfitz/storr/tree/rds_separate_scratch2
It was really hard to reproduce this race condition, so I am glad you like the example. Thanks for looking into it further.
I am inundated with several other things at the moment, and I will get to rds_separate_scratch2
as soon as I can.
I hope you do not have to resort to file locking, but from recent experience, I do not think it is so bad. I use filelock
now for txtq
, which in turn manages the interprocess communication for drake
.
No file locking necessary, thankfully
The race condition appears free from the reprex when I run it with rds_separate_scratch2
, and I am applying it to drake
's long HPC tests right now. I will let you know what happens.
Great, thanks!
Until rds_separate_scratch2
, drake
's long-running non-CRAN HPC tests had occasional intermittent "cannot open connection" failures on Windows. Now, those same tests are clean. Thanks so much for the patch! Are you ready to merge it into master
? I'm not holding my breath, but I'm eager for the next release.
Thanks for testing Will - I will merge this into master but I don't think I can do a CRAN release for another ~month at least lest I raise the ire of the cran-keepers. If this is a major problem please let me know and I'll take the risk
Sounds great. I do not think the problem is urgent enough to violate the official CRAN submission frequency guidelines, I will just refer people to development storr
if they raise the issue.
Great, thanks! It's merged into master now and the version number for development has been bumped to 1.2.1 (only 1.2.0 should be affected by this behaviour I believe).
Do you have plans to release version 1.2.1 to CRAN as is, or are there other features you plan to work on first? I realize that the patch for this issue is small.
Thanks for the reminder Will
Sent to CRAN (sorry for the delay). Fingers crossed
And on CRAN :tada:
Thanks, Rich!
One of the things I love about
storr
, especially fordrake
, is the apparent thread safety of the RDS driver. As I understand it, different processes should be able to write to different keys or namespaces in parallel, and I heavily rely on this capability. Unfortunately, there is a race condition when the data objects are the same across processes/keys.This race condition seems to be responsible for most of the strange behavior in https://github.com/ropensci/drake/issues/404. I am sorry I did not discover it before
storr
1.2.0 was released to CRAN.