Open lhupe opened 4 years ago
Oh, this isn't actually new.
This was fixed for precompile caches (#36416). Can we do the same here?
Solutions that work for small, atomically generated files like precompile caches are not ideal here, for a few reasons. History files tend to get quite large, and they are generated incrementally. Rewriting the entire history file to a temporary file and then replacing the original on every REPL entry does not seem like a great idea. It would also mean that instead of corruption we would see history loss: if two Julia processes try to write a history entry at the same time, they both make a copy, add their new entry, and then replace the original with their copy. Whichever move happens last "wins", and the history entry from the other process is lost.
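To make the lost-entry scenario concrete, here is a minimal sketch in Python (not Julia, and not how the REPL actually writes its history). The file name, the entry strings, and the helper names are all made up for illustration. The two "sessions" are interleaved by hand inside one process so the outcome is deterministic.

```python
import os
import tempfile


def read_history(path):
    with open(path, encoding="utf-8") as f:
        return f.read()


def replace_history(path, content):
    # Write the complete new content to a temp file in the same directory,
    # then atomically move it over the original file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(content)
    os.replace(tmp, path)


def simulate_lost_update():
    # Hypothetical history file in a scratch directory.
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "history.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("entry 0\n")

    # Both sessions take their copy before either one has written back.
    copy_a = read_history(path)
    copy_b = read_history(path)

    replace_history(path, copy_a + "entry from A\n")
    replace_history(path, copy_b + "entry from B\n")  # the last move wins

    return read_history(path)


if __name__ == "__main__":
    print(simulate_lost_update())
```

Each individual replace is atomic, so the file is never corrupted, but the entry from session A is gone at the end. That is the trade-off described above: corruption is exchanged for silent history loss.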
File locking would be much better: if one Julia process is currently updating the history file, the others wait until it is done. No copying of the file is required and no entries are lost. The only problem is that file locking is hard and not portable, and APIs like `flock` don't work on distributed file systems, which is exactly where this is a problem. The best approach is probably to make https://github.com/vtjnash/Pidfile.jl a stdlib and then use that.
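For illustration, the general lock-file idea can be sketched as follows. This is Python, not Julia, and it is not Pidfile.jl's actual implementation or API. The names `lock_file` and `append_entry`, the `.lock` suffix, and the timeout values are assumptions made up for this sketch. It relies on exclusive file creation (`O_CREAT | O_EXCL`) instead of `flock`; whether exclusive creation is reliable on a given network file system depends on the protocol version and the client, so treat that as an assumption too.

```python
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def lock_file(lock_path, timeout=5.0, poll=0.05):
    """Hold an exclusive lock represented by the existence of lock_path."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Fails with FileExistsError if another process holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        # Record the owner's PID so a human can see who holds the lock.
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        yield
    finally:
        os.remove(lock_path)


def append_entry(history_path, entry):
    # Append in place while holding the lock: no copy of the file is made.
    with lock_file(history_path + ".lock"):
        with open(history_path, "a", encoding="utf-8") as f:
            f.write(entry + "\n")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "history.txt")
    append_entry(path, "first entry")
    append_entry(path, "second entry")
    with open(path, encoding="utf-8") as f:
        print(f.read())
```

This sketch does not handle stale locks: if a process dies while holding the lock, the lock file stays behind and every later writer times out. A real implementation has to detect and clean up stale lock files, which is a large part of why a dedicated, well-tested package is preferable to an ad hoc version.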
I'm currently using Julia on a network of Linux computers with shared home directories (via NFS). I've noticed that when I use REPLs on two machines simultaneously (which I do more or less every day), the `repl_history.jl` file gets corrupted by null bytes. This is apparently independent of what I do during the simultaneous REPL sessions; I can reproduce the problem by closing the REPLs immediately after opening them. As soon as there are two sessions running at the same time on different hosts, the history file gets corrupted.
Example
This is an example sequence of commands to illustrate the problem, but I have found the problem to be independent of the order in which the REPLs are opened or closed: as soon as two are running simultaneously, the problem occurs.
On the first machine, open a REPL and run
Then, open a REPL on a second machine and run
Go back to the first machine and execute
thus closing the first REPL.
Now close the second REPL with
The resulting REPL history file (as displayed by `vim` running on the second machine) will be (where the `^@` represent the null bytes)
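For anyone who wants to check their own history file without opening it in an editor, a small script along these lines can count the null bytes and report which lines contain them. This is a rough sketch in Python; the function name and the demo bytes are made up for illustration and do not reproduce a real history file.

```python
import os
import tempfile


def find_null_bytes(path):
    """Return (total NUL count, 1-based numbers of lines containing NULs)."""
    with open(path, "rb") as f:
        data = f.read()
    bad_lines = [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if b"\x00" in line
    ]
    return data.count(b"\x00"), bad_lines


def demo():
    # Made-up file contents with three NUL bytes at the start of line 2.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"first entry\n\x00\x00\x00second entry\n")
    try:
        return find_null_bytes(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

To use it on a real file, pass the path of your own `repl_history.jl` to `find_null_bytes`.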