codekitchen / dinghy

faster, friendlier Docker on OS X. Deprecated.
MIT License

Files not Syncing - NFS.output shows "unable to send RPC reply" #271

Open josvazg opened 6 years ago

josvazg commented 6 years ago
$ cat /usr/local/Cellar/dinghy/4.5.0/local/var/dinghy-NFS.output
unable to send RPC reply
unable to send RPC reply
unable to send RPC reply
unable to send RPC reply
unable to send RPC reply
unable to send RPC reply
unable to send RPC reply
unable to send RPC reply
unable to send RPC reply
$ dinghy version
Dinghy 4.5.0

There is nothing relevant in the VM logs:

docker@dinghy:~$ cat /var/log/* |grep nfs
docker@dinghy:~$

Why? How can I debug?

josvazg commented 6 years ago

Not sure, but I think I saw two NFS processes running at the same time on my Mac host when this was happening. Once I restarted and there was a single process, it seemed to work properly again.

codekitchen commented 6 years ago

Have you run into this again? Were both NFS processes the unfs3 daemon or was one system NFS?
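One way to answer that from a `ps aux` listing, as a minimal sketch: count the lines whose command path ends in a given daemon name. `count_daemon` is a hypothetical helper, not part of dinghy, and it assumes the command is the 11th column, as in macOS `ps aux` output. The name `nfsd` for the system NFS daemon is also an assumption.

```shell
# Hypothetical helper (not part of dinghy): read `ps aux`-style lines on stdin
# and count those whose command (column 11) is a path ending in /<name>.
# Matching on column 11 only means a `grep nfs` line is not counted.
count_daemon() {
  awk -v name="$1" '$11 ~ ("/" name "$") { n++ } END { print n + 0 }'
}

# Usage (assumption: the system NFS daemon binary is named nfsd):
#   ps aux | count_daemon unfsd   # more than 1 suggests a duplicate unfs3 daemon
#   ps aux | count_daemon nfsd    # non-zero suggests system NFS is also running
```

Anchoring the match on the path separator keeps `unfsd` from being counted as `nfsd`.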

josvazg commented 6 years ago

I must admit it has been quite stable since last time... Maybe it is some usage pattern. I hit it more often when I run a webserver deployment with several containers (including DBs etc.), but it is more stable when I just run a simple container with a CLI tool.

Anyway, if it happens again I will check that. I seem to recall both were unfs processes, but I can't be sure now.

josvazg commented 6 years ago

Happened again, here is the ps dump:

$ ps aux |grep nfs
josvaz           23592   0,0  0,0  2465752    736   ??  S    18abr18   0:34.84 /usr/local/sbin/unfsd -e /Users/josvaz/.dinghy/machine-nfs-exports-dinghy -n 19930 -m 19930 -l 192.168.99.1 -p -d
josvaz            3738   0,0  0,0  2465752   4580   ??  S     7abr18  11:10.09 /usr/local/sbin/unfsd -e /Users/josvaz/.dinghy/machine-nfs-exports-dinghy -n 19299 -m 19299 -l 192.168.99.1 -p -d
josvaz           94844   0,0  0,0  2442376   2472 s001  S+    6:54PM   0:00.01 grep nfs

ddonahue99 commented 5 years ago

I may be encountering the same issue. I see the same log output (unable to send RPC reply). I noticed this issue because webpack-dev-server wasn't picking up file changes.

My dinghy status shows everything is running, and I see nothing out of the ordinary when starting dinghy.

Interestingly, if I invoke fsevents_to_vm alongside dinghy while it's running, webpack-dev-server works great. I know of at least one other person encountering the same problem currently. I am on dinghy 4.6.5.

codekitchen commented 5 years ago

Perhaps you've hit a state where the fsevents_to_vm daemon is still running but not functioning... do you see any relevant log output at /usr/local/Cellar/dinghy/4.6.5/local/var/dinghy-FsEvents.output?

ddonahue99 commented 5 years ago

From a fresh dinghy start (with fsevents_to_vm not running separately), there's nothing interesting in the FsEvents log. Just this:

=== Starting FsEvents at 2019-04-24T11:27:59-07:00 ===

Triggering some events doesn't produce any log output.

If I do run fsevents_to_vm separately, I do get output in that process showing all the files that have been touched.

codekitchen commented 5 years ago

It's expected that you won't see touched files in the log: fsevents_to_vm is silent by default and only prints touch events if you pass --debug. It's too bad there's no relevant crash information, though. I don't have a great theory for why the daemon process wouldn't work but a manually-run process would.

Maybe I should add an option to dinghy to pass the --debug option through to the daemon process. In the meantime you could modify /usr/local/Cellar/dinghy/4.6.5/cli/dinghy/fsevents_to_vm.rb and add --debug to the args array near the bottom, then run dinghy restart. That should print touch events to FsEvents.output, and we can at least verify whether events are being generated.
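After the restart, the log can be followed while touching a file on the host. This is only a sketch: `fsevents_log` is a hypothetical helper that rebuilds the log path quoted earlier in this thread for a given dinghy version, and it assumes the default Homebrew Cellar location.

```shell
# Hypothetical helper (not part of dinghy): print the FsEvents log path for a
# given dinghy version, assuming the Homebrew Cellar layout quoted in this thread.
fsevents_log() {
  echo "/usr/local/Cellar/dinghy/$1/local/var/dinghy-FsEvents.output"
}

# Usage, after editing fsevents_to_vm.rb as described:
#   dinghy restart
#   tail -f "$(fsevents_log 4.6.5)"
```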

ddonahue99 commented 5 years ago

Good thinking. Can confirm that with --debug output in the dinghy fsevents process, I'm seeing output for touched files. (Also verified I'm still seeing the same behavior)

codekitchen commented 5 years ago

Hmm, if it's logging the file touches but not logging any errors, I really don't know how it could be failing in a way that wouldn't also affect an independently-run process. Is the problem consistent, or does it sometimes work on the affected Macs?

Also, how long did you watch the log output for? I'm just thinking that I'm not entirely sure what the default timeout is for Net::SSH if it is having trouble connecting. Could be 30-60 seconds or something.
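A rough way to see whether SSH to the VM is slow to connect, as a sketch only: time a no-op command over SSH. `elapsed_seconds` is a hypothetical helper, and the commented usage assumes the VM is named `dinghy` and that `docker-machine` is on the PATH. It measures the `ssh` client rather than Net::SSH itself, so it is only a hint.

```shell
# Hypothetical helper: run a command, discard its output, and print how many
# whole seconds it took. The command's exit status is ignored.
elapsed_seconds() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1 || true
  end=$(date +%s)
  echo $((end - start))
}

# Usage (assumption: the VM is named "dinghy"):
#   elapsed_seconds docker-machine ssh dinghy true
```

A result of tens of seconds would suggest watching the FsEvents log for at least that long before concluding that no error is logged.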