Changed Windows socket timeout from 1s -> 30s

Bilge commented 6 years ago

One second is an insanely short timeout for a socket to respond. Many web services delivering heavy web applications easily trip this timeout during normal use. See #21.

trowski commented 6 years ago

That timeout is for the process wrapper connecting to the parent process socket server. My understanding is that what the process then does should have little affect on this.

Ping @DaveRandom

DaveRandom commented 6 years ago

TL;DR -1, but would change to +1 if you make it a runtime-configurable option rather than changing the hard-coded value, keeping the 1 second default.

The timeout here is only for initiating local IPC, over TCP loopback (no network latency, relatively low I/O overheads). It's reasonable to assume in those cases that if a local process has not connected after 1 second, something has gone wrong. If the underlying loop is taking more than a second to process a complete tick, probably the application needs to be restructured.

I am seeing issues where stuff does go wrong and these sockets timeout, but I don't think extending the timeout will help here as the processes that go missing are not queued, rather they just disappear into the ether.

With something like this, generally you want to fail as early as possible. OTOH, there may be use-cases that I cannot think of, and as such there may be some application-specific scenarios where you may want to increase the timeout - but in this case, it should be some form of publicly configurable option, rather than changing the default.

kelunik commented 6 years ago

There's probably a bug somewhere else that causes this as already said in #21.

Bilge commented 6 years ago

If the underlying loop is taking more than a second to process a complete tick, probably the application needs to be restructured.

If you're implying that I can't code, please take a look at the associated issue (#21). Although the code looks contrived, it's not (with the notable exception that I'm hitting http://store.steampowered.com/app/$id/?cc=us instead of example.com); this is exactly the code I'm using to reproduce this issue. If my code is bad please point out the mistake.

kelunik commented 6 years ago

@Bilge There's no such implication. It could also be us doing accidental blocking somewhere. In fact, I know a location where such a blocking could happen.

DaveRandom commented 6 years ago

Note also this comment:

I am seeing issues where stuff does go wrong and these sockets timeout, but I don't think extending the timeout will help here as the processes that go missing are not queued, rather they just disappear into the ether.

There is a known problem in #21 (and I am looking into it) but it's not caused by connections timing out, that's just a symptom. For an as-yet-undetermined reason, there is a tipping point where a large number of active child processes causes spawning new children to silently - and sometimes only partially - fail. So far, however, I have been unable to find a reproducing case that gives consistent behaviour, and it seems to be somewhat dependent on overall system load, which doesn't seem to make sense according to the documented behaviours of the underlying win32 API calls.

I have a suspicion that the crux of the issue is that Windows does not really want you to use multi-process models and would rather you use threads, but at worst I expect there will be a trappable error that is currently being ignored/swallowed by something.

kelunik commented 6 years ago

Closing, as #24 landed.

amphp / process

Changed Windows socket timeout from 1s -> 30s #22