Closed twisted-trac closed 8 years ago
@adiroiban removed owner |
---|
Thanks for the updated patch.
I don't have a good understand of how endpoints should be designed and also I have never worked with HAProxy or any other proxy, so I am doing just a high level review here.
Hope it helps.
Feel free to ignore it if it does not make sense.
This patch adds a nice functionality for endpoints, but from my point of view, the endpoint functionality is poorly advertised.
I think that is is very important to include a few notes about the new functionality in the narrative documentation dedicated to endpoints:
An example would also be great :)
I would say that the release note fragment should also advertise that HAProxy is available as an endpoint.
For my comments from the previous review, I would prefer if those questions were answered with: see documentation / see apidoc :)
I tried to make those comments from the point of view of a normal/simple Twisted user and I think that they are important questions which should be answered in the documentation / apidocs.
We should not expect each Twisted user to read each comment from each review request :)
I have sent your patch to our buildbot.
New patches should pass on all current supported builders.
You can check the results here https://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/haproxy-endpoint-8203
You can check the generated API docs here https://buildbot.twistedmatrix.com/builds/apidocs/apidocs-7117ae33aa236f2217e9d6acf9ed0cca5fa8c39c.tar.bz2
You can also check the coverage - https://buildbot.twistedmatrix.com/builds/twisted-coverage.py/twisted-fedora23py2.7-coverage.py-r7117ae33aa236f2217e9d6acf9ed0cca5fa8c39c/
I am not sure if new patches should be accepted if they break Py3. I would say that all new patches should also consider working on py3... unless they depend on something which is not yet ported.
I am not sure what is the coverage requirement for new patches, but I would say that new patches should have 100% coverage. I am not saying that 100% coverage will solve all problems, but I think that 100% is a good start.
New code should have 100% apidocs. Even for private methods.
See some missing methods: https://buildbot.twistedmatrix.com/builders/twistedchecker/builds/3853/steps/run-twistedchecker/logs/new%20twistedchecker%20errors
I would say that the new code can have better documentation :)
For example in HAProxyProtocol what is the purpose of self.info ? The name is very generic. It is very important to describe the purpose of that variable so that a future refactoring would know how to handle it and what was the initial design.
Based on current usage it reads like a flag, so a better name would be something like 'wasProxyHeaderParsed` ... if the proxy header is imporant, I think that holding it in a varialbe named 'haProxyHeader' is better than this generic 'info'.
I can only see it referenced as self.info. Why do we need it as a public attribute?
Do we need to export the parser as public objects? Why do we need the new public interfaces? How are they supposed to be used by Twisted users?
I am leaving this for review as I only did a superficial review.
I hope that having a branch for the patch and having the test executed for this branch helps :)
Thanks again for your contribution.
@glyph set owner to kconwayatlassian |
---|
I wanted to put this change up early to get some arch feedback from Glyph (or others who are opinionated about Endpoint). He strongly suggested this needed to be an Endpoint instead of the mixin that existed in a previous iteration of the patchset.
Thanks for doing this. If you need specific review feedback from me, please feel free to send me an email to me, or to the list, to point me at a ticket, or I'll stochastically get around to it at some point (as illustrated here). Generally though, any reviewer should feel empowered to make changes, and not everything needs my personal sign-off.
Mostly I'm looking to get approval/disapproval for
OK, let me go through these one at a time:
the _WrapperServerEndpoint addition that is modelled after _WrapperEndpoint
This looks great. We may want to refactor these a little internally first, but they should probably both be exposed publicly after some re-naming.
the syntax of adding via compounding ports as haproxy:tcp:80
and the package organisation as a series of private modules exposed via the package init which is modelled after the t.logger.
💯
Given the way that Python accidentally re-exports every single !@#% symbol that is imported into your code, this is absolutely the way that I would like to see all Twisted code written going forward. Thank you for doing this.
However! I think we should be doing even more hiding of implementation. Every exposed bit of API surface is potential maintenance cost in the future, and we don't need a lot of it for the desired feature here. There's no particular reason I can think of to expose internal implementation details like V1Parser
or V2Parser
, or even the IProxyParser
interfaces; the information that applications need to retrieve should be coming back from existing transport interfaces like getPeer
. Honestly the only thing that it seems to me should be exported from the entire package is the endpoint, and possibly some exceptions if there is any case (which we'll only really know from the narrative docs) where an application needs to catch it.
If those things are agreed on I can quickly follow up with docs and additional tests before going for a merge.
I'm going to hold you to "quickly" :).
In the future, though, I'd recommend not blocking on a review like this, especially on the narrative docs. I've found that many of the design issues spotted by review are things that I can find myself if I just start working on the docs, because the docs sound stilted; not to mention that the final reviewer will probably start by looking at the docs, which may uncover new issues, so if there's an issue there you'll have to re-submit and do a bunch of work anyway :-).
I have a few other minor administrative concerns, which I apologize if I'm duplicating from Adi's review, but I didn't see them there as I skimmed:
AF_UNSPEC
, LOCAL
, et. al. should be constants (possibly just imported from socket
?)zope.interface.implements
; it is deprecated. Instead, use the zope.interface.implementer
class decorator.from zope.interface import implementer; @implementer ...
and not @zope.interface.implementer
.Generally this looks like great quality code and I'm enthusiastically anticipating its integration. Thank you for submitting it.
Please address adi's comments and mine to your satisfaction and re-submit for review.
@glyph set owner to @glyph |
---|
Thanks for resubmitting. A few points of light feedback:
twistedchecker twisted/protocols/haproxy
emits no results.twisted.internet.endpoints
should not really contain any of this code; if we need it there to integrate with the string parser, we should have a local import in the parse function to avoid 7 additional (infrequently used) additional modules when loading the (almost always used) twisted.internet.endpoints.twisted.protocols.haproxy._exc
module. I think this might just be a conflict with a Windows-specific module named "_exc
"; there doesn't look like there's any hairy platform-specific code in there. Whatever it is, it needs to be addressed before landing though.If you were a Twisted committer, I'd say "address these to your satisfaction and merge", so, the rule is, I will address them to my satisfaction and then merge. In terms of change size the difference will probably be pretty big, but most of it will be futzing around with whitespace for coding-standard and doc-standard compliance. (If I end up having to make more substantial changes, I'll resubmit.)
Thanks again for your contribution.
@glyph set status to closed |
---|
(In [47179]) Merge haproxy-endpoint-8203-2: PROXY endpoint.
Author: kconwayatlassian
Reviewer: glyph
Fixes: #8203
twisted.protocols.haproxy.proxyEndpoint provides an endpoint that wraps any other stream server endpoint with the PROXY protocol that retains information about the original client connection handled by the proxy; this wrapper is also exposed via the string description prefix 'haproxy'; for example 'twistd web --port haproxy:tcp:8765'.
@adiroiban commented |
---|
Hi and thanks for your contribution.
I am doing just a high level review here.
Since HAProxyProtocolMixin has an init method I would argue that it is not a mixin, but a full featured class.
Why do we need it as a mixin?
The current usage looks prone to errors.. as you might forget to call
Can we have it as a protocol helper?
Maybe like this:
from twisted.words.protocols.jabber import xmlstream
class XmlStream(xmlstream.XmlStream):
def __init__(self, *args, **kwargs):
xmlstream.XmlStream.__init__(self, *args, **kwargs)
self._haProxy = HAProxyHelper()
def dataReceived(self, data):
if self._haProxy.proxyParsed:
return xmlstream.XmlStream.dataReceived(self, data)
self._haProxy.dataReceived(data)
# If the header was parsed, continue with the remaining bytes.
if self._haProxy.proxyParsed and self._haProxy.overflowBuffer:
xmlstream.XmlStream.dataReceived(self, self.overflowBuffer)
I think that a protocol wrapper would be best so we will end up something like this.
I am just using lazr.delegates for simplicity
Maybe my example will not work... I am just giving some possible unproven solutions here.
Also, I don't know much about how you would want to use the parsed header in real life... so my suggestion could be completely wrong.
HAProxyProtocolWrapper can we very similar the current implementation, but it will have a state machine which will call the wrapped protocol once the haproxy headers were parsed.
@delegate_to(IProtocolFactory, context='_wrapperdFactory')
class HAProxyWrapperFactory(Factory):
IProtocolFactory
def __init__(self, wrappedFactory):
self._wrapperdFactory = wrappedFactory
def buildProtocol(self, addr):
wrappedProtocol = self._wrapperdFactory.buildProtocol(addr)
protocol = HAProxyProtocolWrapper(wrappedProtocol)
return protocol
Does this code support both version 1 and version 2 of the PROXY protocol ?
I think that the docstring should describe the supported protocol version and provide a link to the protocol specification.
From the code I can only see version 1 handling.
The HAProxy protocol is a bit strange. How do you advertise the supported versions? I did not take the time to read the full linked documentation. It looks like version 1 is text mode while version 2 has both text and binary mode.
What would happen with this code if HAProxy add version 3?... I guess that it will just raise a generic InvalidProxyHeader ...
I think that the test coverage could be improved. For example I don't see any tests for the new public InvalidProxyHeader exception.
What is the value of having this in the core Twisted package?
Why not have it as a txhaproxy independent package?
Just asking :)
I am leaving this for review as other can send their feedback.
Many thanks for your contribution.
kconwayatlassian commented |
---|
Stepping through the review from adiroiban and glyph:
Why do we need it as a mixin?
The implementation of this as a mixin class was something that came up very quickly in IRC while talking to glyph. After talking about it for a bit it was clear that the mixin was not the right solution and the Endpoint interfaces should be leveraged instead. The latest patch removes the mixin business and works as an Endpoint/Factory/Protocol wrapper.
Does this code support both version 1 and version 2 of the PROXY protocol ?
The original patch only implemented version 1. This latest patch also supports version 2. Version 2 is a bit more complex since it is a binary format and contains a richer data set. I had originally intended to follow up with a second patch to add support for it, but I found the time to add it now.
The HAProxy protocol is a bit strange. How do you advertise the supported versions?
The protocol specification uses connection closing as the primary mechanism for advertising version support. As in, if you don't support the version you close the connection. There are currently only two versions of the protocol and the specification details the rules by which a receiver that implements both determines which parser to use. This part of the spec is implemented in this patch so services would support both versions simultaneously.
What would happen with this code if HAProxy add version 3?... I guess that it will just raise a generic InvalidProxyHeader
Currently any version higher than 2 will result in a closed connection as required by the spec.
I think that the test coverage could be improved.
Take another look if you would. I've added additional tests for the parsers and the protocol.
What is the value of having this in the core Twisted package?
We felt the PROXY protocol was wide-spread enough that Twisted could benefit from having support in core. There's a list of implementations in the PROXY spec and it includes some of the most widely used proxy/load balancing tools. Currently, running any non-HTTP Twisted service behind a load balancer means the true peer and host data are lost. Having this in core means Twisted developers can, without pulling in other deps, run any service behind a load balancer without sacrificing that information.
kconwayatlassian commented |
---|
@adiroiban
I appreciate the branch and the review. I'll take a look at the test coverage gaps and see what's missing. As far as docs go, I completely agree with your feedback. Adding more API and narrative docs is on my todo list before finalising the patchset. I wanted to put this change up early to get some arch feedback from Glyph (or others who are opinionated about Endpoint). He strongly suggested this needed to be an Endpoint instead of the mixin that existed in a previous iteration of the patchset.
Mostly I'm looking to get approval/disapproval for the _WrapperServerEndpoint
addition that is modelled after _WrapperEndpoint
, the syntax of adding via compounding ports as haproxy:tcp:80
, and the package organisation as a series of private modules exposed via the package init which is modelled after the t.logger
.
If those things are agreed on I can quickly follow up with docs and additional tests before going for a merge.
@adiroiban commented |
---|
I think that the narrative docs will help here, as we can see how this will be handled from an end-user point of view and assert if it make sense and is easy to use.
If it is easy to describe/document in a accurate way, then it might be also be easy to use and might a good design :)
Please consider breaking this patch as it should make it easier for someone to review it... it harder to do after hours review for big diffs
Thanks!
@adiroiban commented |
---|
Thanks for the patch.
Latest patch applied and sent to buildbot.
The narrative docs look ok. I would say that maybe it would help to be explicit and inform that both protocol v1 and v2 are supported.
Leaving it for review as this is a big patch and I don't have time to review it.
@glyph commented |
---|
The Windows issue is simply that AF_UNIX
doesn't exist on Windows. I feel kind of silly for not seeing it.
I'll address it by just removing the NETFAMILIES
and NETPROTOCOLS
dictionaries; the lookup is unnecessary, because "48" is already a protocol-defined integer, it just needs a human-readable name. (I might use ValueConstant
but since this is not a public API it shouldn't matter much.)
HAProxy annotates the beginning of every TCP stream with connection data. The header format and expected service behaviour are defined as the PROXY protocol (http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt).
Add support for handling the PROXY protocol in all Protocol implementations.
Attachments:
Searchable metadata
``` trac-id__8203 8203 type__enhancement enhancement reporter__kconwayatlassian kconwayatlassian priority__normal normal milestone__None None branch__branches_haproxy_endpoint_8203_2 branches/haproxy-endpoint-8203-2 branch_author__adiroiban adiroiban status__closed closed resolution__fixed fixed component__core core keywords__None None time__1455310370390963 1455310370390963 changetime__1459991483592152 1459991483592152 version__None None owner__glyph glyph ```