teodorlu / hotload

25 stars 3 forks source link

Reload issues #4

Open tingstad opened 2 years ago

tingstad commented 2 years ago

I have observed two potential issues:

1. reload duplication

Sometimes when changing a file I see several lines of Successfully reloaded. For example when I run touch file. I suspect this is because touch (and other operations?) are not atomic. Maybe only small modules are reloaded multiple times, and performance heavy modules only have time for one reload? May thus not be a real problem. (A possible solution could be some kind of debounce, but that would also limit hotload speed?)

2. reload crash

Some editors make the hotloader crash. For example running echo lib.py | python hotload.py lib.py works fine for some editors, but when editing lib.py in Vim om my mac I get:

  File "/Users/richard/dev/hotload/hotload.py", line 43, in _file_changed
    return os.stat(f).st_mtime
FileNotFoundError: [Errno 2] No such file or directory: '/Users/richard/dev/hotload/lib.py'

Again, I suspect that Vim does something non-atomic on save.

teodorlu commented 2 years ago

Again, I suspect that Vim does something non-atomic on save.

I think I've experienced somthing similar. I suspect there are swap files in use, and we have "bad luck" on save.

Not sure what the best option is here -- but I'd like it to be better.

teodorlu commented 2 years ago

reload duplication

Yup - I've seen this myself too.

Right now, be basically just pretend everything is atomic / sync.

Not sure exactly what's best for this. Perhaps we need to think about threads. Not sure.

A possible first step could be

teodorlu commented 2 years ago

Suggestions much welcome :)

teodorlu commented 2 years ago

image

Perhaps we can solve this with just a bit of defensive coding?

  1. Check whether file exists before reading last modified time
tingstad commented 2 years ago

Perhaps we can solve this with just a bit of defensive coding?

1. Check whether file exists before reading last modified time

I'm afraid that the problem may still occur. How about catching FileNotFound and doing a couple of quick retry-with-delay?

teodorlu commented 2 years ago

That might work.

Are we able to time how long Vim needs us to wait? I don't want us to wait many orders of magnitude longer.

teodorlu commented 2 years ago

Tried testing myself.

Setup:

  1. watchexec echo action in a folder
  2. Edit a file with NVIM v0.7.0 in another terminal.

I just got a single "action" printed when i saved a file with Neovim.

teodorlu commented 2 years ago

I'm unable to reproduce hotload crashes with touch right now:

image

But I've seen similar stuff previously

teodorlu commented 2 years ago

Note - I introduced a feature flag for disabling screen clearing. Helpful when we actually want to see every change.

https://github.com/teodorlu/hotload/blob/e4ff11cf89a7a6cd5c1b5a318641e004b4e98fe1/hotload.py#L76-L77

tingstad commented 2 years ago

Are we able to time how long Vim needs us to wait?

Interesting link: https://gist.github.com/kputnam/5972851#begin-saving

The revelant parts which actually deal with the filesystem are:

  1. Move the original file out of the way, to source.html~.
  2. Write the buffer contents to a new file, source.html
  3. Remove the source.html~ backup file after flushing output

I did an experiment in Ubuntu Docker (strace -tt -p PID):

20:15:44.007689 select(1, [0], [], [0], {tv_sec=0, tv_usec=0}) = 0 (Timeout)
20:15:44.007752 write(1, "\r", 1)       = 1                                                
20:15:44.007986 stat("/foo", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0                   
20:15:44.008073 access("/foo", W_OK)    = 0                                                                                                                                            
20:15:44.008122 getcwd("/", 4096)       = 2                                                
20:15:44.008191 write(1, "\33[?25l\"foo\"", 11) = 11                                       
20:15:44.008246 stat("foo", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0                                                                                                                
20:15:44.008358 access("foo", W_OK)     = 0                                                
20:15:44.008520 getxattr("foo", "system.posix_acl_access", 0x7fff11ba14c0, 132) = -1 ENODATA (No data available)
20:15:44.008702 stat("foo", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0                                                                                                                
20:15:44.008812 lstat("foo", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0                   
20:15:44.009040 lstat("4913", 0x7fff11ba1860) = -1 ENOENT (No such file or directory)
20:15:44.009210 openat(AT_FDCWD, "4913", O_WRONLY|O_CREAT|O_EXCL|O_NOFOLLOW, 0100644) = 3  
20:15:44.009536 fchown(3, 0, 0)         = 0                                                
20:15:44.009747 stat("4913", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0        
20:15:44.009999 close(3)                = 0                                                                                                                                            
20:15:44.010145 unlink("4913")          = 0                                                
20:15:44.010309 stat("foo~", 0x7fff11ba1570) = -1 ENOENT (No such file or directory)       
20:15:44.010722 stat("foo", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0   
20:15:44.010775 stat("foo~", 0x7fff11ba0540) = -1 ENOENT (No such file or directory)
20:15:44.010813 unlink("foo~")          = -1 ENOENT (No such file or directory)
20:15:44.010851 rename("foo", "foo~")   = 0
20:15:44.011062 fsync(4)                = 0
20:15:44.011511 openat(AT_FDCWD, "foo", O_WRONLY|O_CREAT, 0644) = 3
20:15:44.011886 ftruncate(3, 0)         = 0
20:15:44.012028 write(3, "asdaa\n", 6)  = 6
20:15:44.012108 fsync(3)                = 0
20:15:44.020217 stat("foo", {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
20:15:44.020276 stat("foo", {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
20:15:44.020464 fchmod(3, 0100644)      = 0
20:15:44.020574 close(3)                = 0
20:15:44.020612 setxattr("foo", "system.posix_acl_access", "\2\0\0\0\1\0\6\0\377\377\377\377\4\0\4\0\377\377\377\377 \0\4\0\377\377\377\377", 28, 0) = 0
20:15:44.020947 write(1, " 1L, 6C written", 15) = 15
20:15:44.021084 lseek(4, 0, SEEK_SET)   = 0
20:15:44.021115 write(4, "b0VIM 8.1\0\0\0\0\20\0\0T\370\301b\323\232;\0\213\2\0\0root"..., 4096) = 4096
20:15:44.021335 select(1, [0], [], [0], {tv_sec=0, tv_usec=0}) = 0 (Timeout)
20:15:44.021635 stat("/foo", {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
20:15:44.021716 unlink("foo~")          = 0

So in this case about 20:15:44.021635 - 20:15:44.007986 = 0.013649 seconds = 13.649 millisecs

I think it's worth waiting a bit when this happens, so we don't miss an update.

teodorlu commented 2 years ago

So, does it make sense to:

  1. Try to read st_mtime
  2. If we catch FileNotFound - wait 20 ms and try again (max 3 times)

?

tingstad commented 2 years ago

Added retry: 2a73100b34eed0f1cc9492f8b6172209685d11ee