rustyio / sync

On-the-fly recompiling and reloading in Erlang. Code without friction.
MIT License

sync crashing on nofile #57

Open mattiasw2 opened 9 years ago

mattiasw2 commented 9 years ago

Really like sync. I am also using EDTS for Emacs, and I think there are some strange interactions.

The core issue is

sync_scanner:start_link() at <0.163.0> exit with reason no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in context child_terminated

i.e. the beam file is missing. Most likely, it is recompiled by "someone else".

I tried making a simple patch to wait 1000ms and retry, but it didn't solve the problem.
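The patch itself is not shown in this thread. As a rough illustration only, a wait-and-retry approach of this kind might look like the sketch below. The module name `retry_sketch` and the function `load_with_retry/2` are hypothetical and are not part of sync; only `code:load_file/1` and `timer:sleep/1` are standard OTP calls.

```erlang
-module(retry_sketch).
-export([load_with_retry/2]).

%% Hypothetical helper, not part of sync: try to load Module and, if the
%% beam file is missing, wait 1000 ms and try again, up to Retries more times.
load_with_retry(Module, Retries) ->
    case code:load_file(Module) of
        {module, Module} ->
            ok;
        {error, nofile} when Retries > 0 ->
            %% The beam file may be mid-rewrite by another tool; pause and retry.
            timer:sleep(1000),
            load_with_retry(Module, Retries - 1);
        {error, Reason} ->
            %% Out of retries, or an error that waiting will not fix.
            {error, Reason}
    end.
```

A retry like this only helps if the beam file reappears within the wait window, which may be why it did not resolve the crash here.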

Most often the failure is recovered, but sometimes it takes the Erlang VM down, as seen below.

(share3@127.0.0.1)1> Scanning source files...
(share3@127.0.0.1)1> 10:45:29.797 [error] gen_server sync_scanner terminated with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500
(share3@127.0.0.1)1> 10:45:29.797 [error] CRASH REPORT Process sync_scanner with 0 neighbours exited with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in gen_server:terminate/7 line 804
(share3@127.0.0.1)1> 10:45:29.798 [error] Supervisor sync had child sync_scanner started with sync_scanner:start_link() at <0.163.0> exit with reason no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in context child_terminated
(share3@127.0.0.1)1> Scanning source files...
(share3@127.0.0.1)1> 10:45:30.163 [error] gen_server sync_scanner terminated with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500
(share3@127.0.0.1)1> 10:45:30.164 [error] CRASH REPORT Process sync_scanner with 0 neighbours exited with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in gen_server:terminate/7 line 804
(share3@127.0.0.1)1> 10:45:30.164 [error] Supervisor sync had child sync_scanner started with sync_scanner:start_link() at <0.1657.0> exit with reason no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in context child_terminated
(share3@127.0.0.1)1> Scanning source files...
(share3@127.0.0.1)1> 10:45:30.501 [error] gen_server sync_scanner terminated with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500
(share3@127.0.0.1)1> 10:45:30.501 [error] CRASH REPORT Process sync_scanner with 0 neighbours exited with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in gen_server:terminate/7 line 804
(share3@127.0.0.1)1> 10:45:30.502 [error] Supervisor sync had child sync_scanner started with sync_scanner:start_link() at <0.1668.0> exit with reason no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in context child_terminated
(share3@127.0.0.1)1> Scanning source files...
(share3@127.0.0.1)1> 10:45:30.835 [error] gen_server sync_scanner terminated with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500
(share3@127.0.0.1)1> 10:45:30.835 [error] CRASH REPORT Process sync_scanner with 0 neighbours exited with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in gen_server:terminate/7 line 804
(share3@127.0.0.1)1> 10:45:30.835 [error] Supervisor sync had child sync_scanner started with sync_scanner:start_link() at <0.1679.0> exit with reason no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in context child_terminated
(share3@127.0.0.1)1> Scanning source files...
(share3@127.0.0.1)1> 10:45:31.330 [error] gen_server sync_scanner terminated with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500
(share3@127.0.0.1)1> 10:45:31.338 [error] CRASH REPORT Process sync_scanner with 0 neighbours exited with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in gen_server:terminate/7 line 804
(share3@127.0.0.1)1> 10:45:31.338 [error] Supervisor sync had child sync_scanner started with sync_scanner:start_link() at <0.1692.0> exit with reason no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in context child_terminated
(share3@127.0.0.1)1> 10:45:31.965 [error] gen_server sync_scanner terminated with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500
(share3@127.0.0.1)1> 10:45:31.967 [error] CRASH REPORT Process sync_scanner with 0 neighbours exited with reason: no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in gen_server:terminate/7 line 804
(share3@127.0.0.1)1> 10:45:31.968 [error] Supervisor sync had child sync_scanner started with sync_scanner:start_link() at <0.1703.0> exit with reason no case clause matching {error,nofile} in sync_scanner:reload_if_necessary/7 line 500 in context child_terminated
(share3@127.0.0.1)1> 10:45:31.969 [error] Supervisor sync had child sync_scanner started with sync_scanner:start_link() at <0.1703.0> exit with reason reached_max_restart_intensity in context shutdown
(share3@127.0.0.1)1> 10:45:31.969 [info] Application sync exited with reason: shutdown
(share3@127.0.0.1)1> 10:45:31.969 [error] Supervisor share3_sup had child share3 started with share3:start() at <0.156.0> exit with reason noproc in context shutdown_error
(share3@127.0.0.1)1> 
=INFO REPORT==== 8-May-2015::10:45:32 ===
stop_ready([], <0.173.0>)
(share3@127.0.0.1)1> 
=INFO REPORT==== 8-May-2015::10:45:32 ===
stop_ready([{{1431,74730,66399},<0.175.0>}], <0.440.0>)
(share3@127.0.0.1)1> {"Kernel pid terminated",application_controller,"{application_terminated,sync,shutdown}"}

Crash dump was written to: erl_crash.dump
Kernel pid terminated (application_controller) ({application_terminated,sync,shutdown})
mattias@ubuntu:~$

s1eepwalker commented 9 years ago

Same error on Ubuntu 14.04 x64. erlang:system_info(version). returns "6.2" and erlang:system_info(otp_release). returns "17".

choptastic commented 9 years ago

Interesting, thanks for posting this. It's definitely not something I've experienced (though I don't use Emacs).

I think I see what's going on here, but since I'm not experiencing this myself, it might be tough to reproduce. At the very least, I've noticed one problem with the code, so I can start there. If that doesn't fix it, I can install Emacs and EDTS to see what I'm able to uncover.

Thanks again for the report and the confirmation.

mattiasw2 commented 9 years ago

Just an idea: it might also be the symlinks created by "relx -d" that are the problem.

boozelclark commented 9 years ago

I am having the same issue. I'm using sync with rebar3, so I'm also using relx. Ubuntu 15.04 with erts 6.4.

choptastic commented 9 years ago

Thanks for all the confirmation, guys. I haven't yet had a chance to do much open-source work these past few months (I've been in a bit of a crunch mode), but as soon as my schedule clears up a bit, I'll be able to put out a fix for this.

The immediate hacky fix would be to add another error clause to https://github.com/rustyio/sync/blob/master/src/sync_scanner.erl#L500 and see if that solves the problem. I'm not entirely sure what the correct way to handle a return value of {error, nofile} is, but my inclination would be to just generate a warning and see how prevalent the problem is.
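A minimal sketch of that idea, written as a standalone module rather than as a patch to sync_scanner.erl. It assumes the failing `case` matches on the result of a code-loading call such as `code:ensure_loaded/1`; the thread does not show the actual code at line 500. The module name `nofile_sketch` and the function `try_load/1` are hypothetical.

```erlang
-module(nofile_sketch).
-export([try_load/1]).

%% Hypothetical helper, not part of sync: attempt to load Module and turn
%% any load error into a logged warning instead of a case_clause crash.
try_load(Module) ->
    case code:ensure_loaded(Module) of
        {module, Module} ->
            ok;
        {error, Reason} ->
            %% Catches nofile as well as other load errors (e.g. badfile).
            error_logger:warning_msg("could not load ~p: ~p~n",
                                     [Module, Reason]),
            {error, Reason}
    end.
```

Returning `{error, Reason}` instead of crashing keeps the scanner process alive, so the supervisor never reaches its restart limit, and the warnings show how often the missing-beam case actually occurs.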

That said, I've noticed that https://github.com/rustyio/sync/blob/master/src/sync_scanner.erl#L498 and https://github.com/rustyio/sync/blob/master/src/sync_scanner.erl#L524 are being called redundantly, which seems to be a problem, but probably unrelated to this particular error.

mattiasw2 commented 9 years ago

In my case, it happens when Yaws compiles files (I use embedded Yaws). I have not fixed it; my change just tells me when it fails.

https://github.com/mattiasw2/sync/commit/75ef60964b6f042cbee2321c2d87dd80e02b3356