twisted / twisted

Event-driven networking engine written in Python.
https://twisted.org
Other
5.57k stars 1.17k forks source link

Adding a new filedescriptor to epoll reactor should not result in ENOENT #6373

Open twisted-trac opened 11 years ago

twisted-trac commented 11 years ago
itamarst's avatar @itamarst reported
Trac ID trac#6373
Type defect
Created 2013-03-17 16:17:44Z
Branch https://github.com/twisted/twisted/tree/resurrected-fd-6373

In certain mysterious situations (see #6346) info from an old filedescriptor remains in the reactor state; this can break adding a new filedescriptor with the same fd number, e.g. resulting in a ENOENT.

This ticket covers dealing gracefully with this problem, effectively a workaround that mitigates the original issue (still being covered by [#6346](https://github.com/twisted/twisted/issues/6346)). A workaround is useful even if we solve #6346 insofar as it makes the reactor more robust against future bugs.

Searchable metadata ``` trac-id__6373 6373 type__defect defect reporter__itamar itamar priority__normal normal milestone__ branch__branches_resurrected_fd_6373 branches/resurrected-fd-6373 branch_author__itamarst itamarst status__new new resolution__None None component__core core keywords__ time__1363537064000000 1363537064000000 changetime__1551001424573919 1551001424573919 version__None None owner__itamar itamar ```
twisted-trac commented 11 years ago
therve's avatar @therve set owner to @itamarst

Comments:

twisted-trac commented 11 years ago
itamarst's avatar @itamarst commented

(In [37532]) Branching to 'resurrected-fd-6373'

twisted-trac commented 5 years ago
msdemlei's avatar @msdemlei commented

Looks like I'm running into a variant of this. After ~two weeks or so, a reasonably lively twisted.web instance starts throwing IOErrors in self._poller.modify -- and that's it, connections are refused and just yield IOErrors in the server log; that is on Debian stretch twisted, python-twisted 16.6.0, on python2.7 (yeah, I know).

The patch proposed here has never been merged, apparently for concerns it will hide bugs (or so I read the comment in epollreactor). Still, it also sucks if something is failing hard after running for two weeks without a clear way to debug things -- if this is really due to a stale file descriptor left lying around, the triggering code can have run at any time in the previous two weeks, right?

So:

(a) any chance something like this patch might go in after all? At least there'll be a log entry about the failure, so it's not that a bug becomes totally invisible.

(b) is there a good strategy to debug a situation in which such a bug is triggered?

PS: my tracebacks look like this:

Traceback (most recent call last):
      File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 86, in callWithContext
        return context.call({ILogContext: newCtx}, func, *args, **kw)
      File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
        return self.currentContext().callWithContext(ctx, func, *args, **kw)
      File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
        return func(*args,**kw)
      File "/usr/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 597, in _doReadOrWrite
        why = selectable.doRead()
    --- <exception caught here> ---
      File "/usr/lib/python2.7/dist-packages/twisted/internet/tcp.py", line 1072, in doRead
        transport = self.transport(skt, protocol, addr, self, s, self.reactor)
      File "/usr/lib/python2.7/dist-packages/twisted/internet/tcp.py", line 789, in __init__
        self.startReading()
      File "/usr/lib/python2.7/dist-packages/twisted/internet/abstract.py", line 434, in startReading
        self.reactor.addReader(self)
      File "/usr/lib/python2.7/dist-packages/twisted/internet/epollreactor.py", line 110, in addReader
        EPOLLIN, EPOLLOUT)
      File "/usr/lib/python2.7/dist-packages/twisted/internet/epollreactor.py", line 94, in _add
        self._poller.modify(fd, flags)
    exceptions.IOError: [Errno 2] No such file or directory
twisted-trac commented 5 years ago
exarkun's avatar @exarkun commented

It looks like there is some unaddressed review feedback. The next steps would be to rescue the branch in question into git/github, address the review feedback, and re-submit the ticket for review.

twisted-trac commented 5 years ago
msdemlei's avatar @msdemlei commented

Replying to Jean-Paul Calderone:

It looks like there is some unaddressed review feedback. The next steps would be to rescue the branch in question into git/github, address the review feedback, and re-submit the ticket for review.

I'm afraid you already essentially lost me at the "rescue branch into github" part -- as in: is there a convenient way to do that or would I need to produce a PR from my end?

On the other hand, my impression was that the code patched has changed so much in the last six years that not much would be left of the patch after adapting it to the current code.

Anyway, Itamar, if you're reading this, do you remember why you abandoned the ticket?

twisted-trac commented 11 years ago
itamarst's avatar @itamarst commented

http://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/resurrected-fd-6373 is running.

I punted the poll issue, discovered by the new test, to #6374.

twisted-trac commented 11 years ago
itamarst's avatar @itamarst commented

kqueue and cfreactor can also be addressed by separate tickets, and until then just skip the test on those reactors.