Open MoritzS opened 11 years ago
Indeed. While I'd like to focus on Frames first, I think both would benefit from this kind of overhaul.
We do want very cleanly separated classes for server and client where it makes sense, but for most persistent connection transports, it's wasteful not to use the same pipe where already available. The question is how to make a system that works in a satisfactory way despite the wildly varying needs and models of different transports.
Well, it took awhile, but I think I can finally answer that question.
Generic among the Jack subsystem, i.e., not subclassed by individual jacks. Initialized with two callback properties, read and write. write is called by the router to send outgoing messages. read is called by whatever jack receives an incoming message for that connection.
>>> Connection(router, read, write, address_local, address_remote)
Class that handles incoming connections. Creates a new Connection object when connected to.
Each module exposes this as reader_class
.
>>> reader = ejtp.jacks.tcp.reader_class(router, address)
Class that handles outgoing connections. Creates a new Connection object when a frame is being sent somewhere we don't already have a connection for.
Each module exposes this as writer_class
. May be equivalent to reader_class
to signal that a single class performs both duties. The writer_ignores_interface
module variable tells the router to do 2 things if True: initialize the class without a second parameter, and use (class, None)
as its key in the router's jack table. Needless to say, since reader_class requires an interface to bind to, it's a really bad idea to set both reader_class == writer_class && writer_ignores_interface
.
>>> ejtp.jacks.tcp.writer_ignores_address # Interface is ignored by some writer classes, but not this one
False
>>> writer = ejtp.jacks.tcp.writer_class(router, interface)
>>> writer.connect(address)
<ejtp.jacks.connection.Connection object at ...>
Manage jacks and connections. Connections register as they are created. Always use an existing connection where available, otherwise create a new one with the appropriate jack module's writer class.
Jack reuse is an important topic - what makes a jack distinct? Class + address truncation (2 elements long). So we can use (class, addr)
tuples as keys in the router's jack cache. This seamlessly handles the logic in situations where there are not separate Reader and Writer classes, but a single dual-purpose class.
"Create on need" is pretty straightforward with the Writer class. Clients can probably get away with just caring about setting up a Reader so they can take incoming data from the network. So we'll go with a hybrid solution along that line - create Reader on Client init, if make_jack = True
, and create Writers on the fly as needed.
Let's model it based on how we do it in frames - a global registration module in ejtp.jacks, with a registerJack decorator and a createJack(address, is_reader) function.
While we're at it, make one of the attributes of the BaseJack the "router key", AKA (class, address) tuple. Maybe BaseJack.router_key, nice and simple.
Edited to be less shaky and incorporate stuff decided in discussion. Can hopefully be worked on in parallel with Router changes (which may not be totally independent of these changes, but we can do some temporary router bandaging in the very short term while waiting for a proper router reimplementation).
These are some really well thought proposals! But I have some questions and suggestions:
writer_ignore_interface
do? Does that just mean that the reader/writer gets called without an interface argument?reader
and writer
. In my opinion that's just a bit clearer.Heh, thanks, it was really late in my timezone when I wrote them, and I wasn't totally sure which side of the straddled hazy/genius line this would ultimately fall on :)
Sorry if there's chaos in your email notifications, I had a bit of minor catastrophe trying to fix the formatting after making this reply via email.
Bounty is $100.
http://www.freedomsponsors.org/core/issue/197/improve-handling-of-jacks
(Copied from acceptance criteria)
Work with me and possibly other developers to create a registered jack system, according to designs determined in conversation in the issue tracker.
Feature may or may not ship in stable at the end of this month, depending on its readiness. If it doesn't land in 0.9.3, we'll just push it to 0.9.4.
I don't totally grasp the (old) router system, yet. Why exactly is it necessary to distinct between Clients and Jacks? Or is this just a thing that will become obsolete if we have the new Connection class with readers and writers?
Historically, it's because client addresses are longer, with that third "local identifier" element. So for outbound messages, you have to check a trimmed version of the frame address against available jack addresses.
I think we'll still want to keep the two distinct, even if we can make address comparison somewhat agnostic to the above problem, because the Connection class would be pointless overhead for local message-passing.
I have some more questions:
is_reader
in createJack
affect? Perhaps making Connections read only?Connection
class replace the Jack
classes? As in the jacks now define a reader and writer instead of a Jack classrouter_key
for? For storing the connection in the Router's cache?reader
class (if True) or its writer
class (if False).class MyWriterJack(...):
def connect(self, address):
read = lambda conn, frame: conn.router.recv(frame)
write = lambda conn, frame: self.send_with_arbitrary_function(frame)
return Connection(self.router, read, write, self.address, address)
Connections are practically glorified tuples, and are created on-the-fly by Jacks. Writer Jacks create them when you want to send a frame somewhere, and a Connection for it doesn't exist yet. Reader Jacks create them when you receive a message from the outside world, and don't have a Connection for that remote host yet.
But now I don't get what the reader and writer classes are needed for. What do they do if the Jacks create the Connections on the fly without using them?
Jacks manufacture Connections for the router to use. If the router doesn't have a connection, it has the writer jack make one. Incoming messages through the reader jack also create connections and register them with the router, so replies can use the same existing socket connection.
When you are talking about writer and reader jacks do you mean the same as writer and reader classes in jack modules?
Yes, exactly.
I think I may need to draw and scan a diagram to get this across unambiguously, but it sounds like we're approaching the same page :)
Yeah, I think a sketch will help a lot to understand this!
Okay, I ended up doing something a bit clearer. It's not done, but thanks to the nature of Google Docs, you can watch me work!
https://docs.google.com/presentation/d/1lcLTDPlcjuWvgxmYTOgX9W5xqLuPBYYA_c8vE13BGOg/edit?usp=sharing
EDIT: Done. Now I kinda want to clean up the Connection spec, since I figured out in the process of graphing everything out that the read
function is really roundabout and pointless. Not sure how much conversation will sound like nonsense if I do that, though.
Well first I have to understand it as it is know ;-) So basically we split the old jack classes in a class that handles writing connections and another that does reading? And connections are client and address dependent, just like TCP connection sockets?
Right. I mean in reality it's less a read/write paradigm, and really a server/client paradigm (respectively) for 90% of all possible jacks, so I'm okay with making that more semantically straightforward and changing the names in this design to match that reality (you'll notice in the presentation, the jack class names were TCPServerJack and TCPClientJack).
And again, you get it. Connections are address-dependent, and each address is 2-element (jack style) instead of 3-element (client style). The router can just find the first connection with the right remote_address and call its write(frame) function to ship a frame there.
I had a first go at MoritzS@f162725d703ce243b49826fa79b6dcad491ba348
Basically I did not implement the reader/writer thing but RegisterJack
checks if the class is subclass of ReaderJack
or WriterJack
. In the end it's the same but I think it's easier to understand like this.
Also the writer_ignores_interface
is now WriterJack.nobind
and WriterJack.router_key
acts accordingly.
My idea of a basic implementation of a Jack is something like this:
# module MyJack
@RegisterJack('myproto')
class MyReaderJack(ReaderJack):
def run(self):
# [...]
self.recv(data)
# [...]
@RegisterJack('myproto')
class MyWriterJack(WriterJack):
nobind = False
def send(self, data):
# [...]
some_send_method(data)
# [...]
The thing with a single class for reader and writer also works:
@RegisterJack('myproto')
class MyReaderJack(ReaderJack, WriterJack):
# all needed methods here
Please tell me if that's what you thought of or if I'm completely on the wrong track!
Brilliant! Same concepts, much clearer implementation. You're very much on the right track, @Moritzs!
@MoritzS just checking in to see how things are going. I'm totally willing to pitch in if you're burned out, or the real world has been too demanding of your time, or whatever. Just want a rough idea where we're at, and when we can expect to see a pull request, since most other paid issues block on this.
I just don't have enough time at the moment, I'll try to do this either this or next week.
That's cool. And I'll try too keep adding paid issues that are orthogonal to this one, like #118 and #102, so that there's still stuff people can contribute on.
In order to keep this project moving, I'm opening it back up to everybody. Whoever has time and inclination to do it, can do it based on the discussion here. I'd love to give @MoritzS the time he needs to work in development time around the rest of his life, but not at the expense of the issues blocking on this one.
I never wanted other people to stop working on this, it's totally fine and even encouraging to have other people do this!
@MoritzS Agreed :) Unfortunately, we haven't seen much interest in getting this done, other than yours, so I've cranked up the bounty to $200. Hopefully I won't have to go higher than that and break the bank/trigger some godawful Paypal fraud detection thing.
Maybe I can start working on this again next week, we'll see.
Sounds great to me! Let me know when/if you have something to look over. On Apr 10, 2013 1:27 AM, "Moritz Sichert" notifications@github.com wrote:
Maybe I can start working on this again next week, we'll see.
— Reply to this email directly or view it on GitHubhttps://github.com/campadrenalin/EJTP-lib-python/issues/66#issuecomment-16161443 .
I'm working on this right now. Give me a little time to go through all our thoughts! ;-)
Sweet! I've threatened to do this myself, but I've been pretty caught up in love-webplayer and getting a proof of concept for DJDNS (which turns out to be much harder than I thought, thanks to some things I didn't realize about pymds, forcing me to make a significant fork before even being able to use it in DJDNS). You'll have plenty of time, believe me, though one way or another this will be done by the end of the month.
On Tue, May 7, 2013 at 11:38 AM, Moritz Sichert notifications@github.comwrote:
I'm working on this right now. Give me a little time to go through all our thoughts! ;-)
— Reply to this email directly or view it on GitHubhttps://github.com/campadrenalin/EJTP-lib-python/issues/66#issuecomment-17561450 .
I made two new commits, that I'll explain here.
MoritzS@a390e6a7d641e16470a3bca06f4b92e7ad48f8cf:
nobind
thing to bind
and moved it to BaseJack
, therefore the property router_key
now also lays in BaseJack
BaseJack
got the methods link
and unlink
, and I changed the methods for loading jacks in the router a bit. They now use jack.router_key
for example. See the docstrings for more informationMoritzS@38776b6c6a061ecd47e059a47c43d547279c4b9f:
In this commit I basically just implemented the listening for data and threading stuff for the ReaderJack
.
Please tell me if that goes along with your ideas!
This looks really good so far. Clean and well-documented, and goes along with our design (at least as far as I've been able to refresh my memory, it's a long conversation :) ). Excellent work!
I'm currently trying to implement the new style tcp jack and got some more questions.
I suppose a WriterJack
should normally have set bind=False
, right? For example a single tcp (or tcp6) socket can connect to multiple addresses. But should these connections be (openend and) closed right (before and) after sending a frame over it or should the stay somewhat longer to make transmission faster in some cases?
Right now I don't have any idea of how to implement the persistent connection thing, so if that is what you want, I'll appreciate some input!
Jacks should hold onto the underlying sockets until they break. How they are stored and tracked is an implementation detail that's up to jack authors - in this case, you. But for the sake of example, you could just have a dict keyed on address tuple, with holds the socket objects as values.
class SomeWriterJack(...):
def __init__(self, ...):
# ... call parent init and such
self.sockets = {}
def register_socket(self, s):
self.sockets[s.getpeername()] = s
That's actually probably a common enough storage pattern to justify implementing it in some base class, rather than redoing it over and over in the subclasses.
But now it seems a bit redundant to not have the WriterJack
to bind to an address on the one hand but still managing different addresses within the jack on the other hand.
In my opinion it would be more consistent to set bind=True
in the TcpWriterJack
. The router will create a new one for each address he needs, anyway. Setting bind=False
makes more sense in protocols like UDP.
My understanding, which I admit may be rusty and wrong, is that bind should be true when there is a consistent local address to the connections, and false when it's not. So for TCP Writer, the local end is just going to be have some random port that was free (thus bind=False), whereas the TCP Reader binds to a specific addr/port as a server (therefor bind=True).
This is totally independent of whether a Jack keeps track of the sockets it owns. For TCP Reader, new sockets will come from remote processes connecting in. For TCP writer, new sockets will come from the router wanting to be connected to remote processes. Either way, you keep a dict of sockets, and provide the Router with Connection objects that use those sockets.
This might be true for ReaderJack
(it also has set bind=True
as default), but in my opinion it doesn't really make sense to talk about local addresses when sending data. Besides we (will) also have jacks that work totally different in comparison to IP protocols, so we should find an unified definition of what "bind" means.
For ReaderJack
, I assume it's kind of obvious, since the analogy to TCP server sockets makes sense even with other types of jacks. (e.g. a Sqlite jack will have to bind to a database to read incoming frames)
For WriterJack
it's a good idea to look at what the attribute bind
was initially supposed to cause. As far as I remember its main use was to distinguish whether to return the full jack address in BaseJack.router_key
or set the last field of the address to None
. This will help the router to decide whether to create a new WriterJack
on the fly with a new address or use a existing one. Therefore there already is a place where different addresses of the same jack type are handled, so I don't see why we should add this logic to the jacks themselves.
I think this might be a good motivation to rename the "bind" variable to something more unambiguous and descriptive, like "has_local_address". That'll help us work out the rest of this mess with less cognitive tangle.
Yeah that's a good idea. Apart from that, do you agree with my suggestions?
It seems like it's probably mostly right, if I understand what you're saying, but there really only needs to be one WriterJack instance, maximum, per protocol. Both kinds of jacks basically exist to create Connection objects to feed the router. While you will need a ReaderJack per address you serve on, you only ever need one WriterJack to initiate connections to the outside world. The Connection objects abstract away the transmission details for the router, so it can treat each of them as a bidirectional pipe. Does that make sense?
Also, sorry for the high latency in getting back to you on this stuff. I really want to dig into the code with you in a collaborative environment, so I can show you what I'm talking about, instead of trying to mold it into words like sun-baked playdoh, or procrastinating about replying in the hope that it'll be easier to turn the ideas into words later (which it never is).
I think the closest we could get is to both work on the same branch in parallel (moritz/new-jacks and campadrenalin/new-jacks, for example), so that when one of us is stuck or confused, the other can express the solution as mergeable code. Let me know if that sounds good to you too.
Yeah that's a good idea! It's better to talk about written code rather than just ideas!
Alright, cool! Happy to help.
I'll be in Philadelphia until Friday afternoon, though, and as a west-coaster, a lot of that will be on a plane (or at least will feel that way) so I'll get on this when I get back, but I'll be working on pymads during my small amounts of free time. Due to external, still-undiagnosed regressions in the build or test system, EJTP no longer builds at all on my laptop, so it's just as well that my head is in other projects while I'm living laptop-only! On May 22, 2013 2:48 AM, "Moritz Sichert" notifications@github.com wrote:
Yeah that's a good idea! It's better to talk about written code rather than just ideas!
— Reply to this email directly or view it on GitHubhttps://github.com/campadrenalin/EJTP-lib-python/issues/66#issuecomment-18268097 .
I finally got some of my Connection class ideas into the code, as well as a bunch of general cleanup, though it's still gonna be a godforsaken mess for awhile. Refer to #135 to see what I'm talking about.
I won't have time to look over it in detail until Sunday, but I had a quick look and I can definitely say that I understand your ideas way better now!
I know, right? Code is always clearer!
Still need to figure out how to deal with the colliding taxonomies of IPv4/IPv6, Reader/Writer, etc, especially with regards to registration. I've got half a mind to make Reader vs Writer an initialization parameter, rather than a class thing. So there's a ways to go before this is useful, and I don't think it will make this EJTP release either. But the next one? Definitely.
So I have some questions now.
Why did you move the send and recv methods out of ReaderJack
and WriterJack
into BaseJack
?
Are Connections supposed to be client-wise (like a TCP-client) only, or should they also be able to be incoming connections (like a TCP-server), or even both at once?
Besides, I really like the idea of having the router handle Connections instead of Jacks!
It's been a busy day, so I haven't had a good opportunity to give these good solid thinking, but I'll try to hammer out some thoughts now.
As I understand your suggestions, you still mix up the concept of Jacks, as a gateway to the internet (similar to an ethernet interface) but without further knowledge of addresses and such, and Connections, as a layer on top of the Jack that is aware of addresses, clients, servers, etc. Specifically, this means the following:
Jack
instance should "know" how to establish connections, so
For the router this means, that it will never instantiate a Jack class directly but use the (yet to be implemented) registration system, that will look at the address, instantiate a Jack class and return a connection.
Correct. If I recall correctly, I'm at the point in development where I'm actually trying to register and use jacks. Not sure if I pushed that or just committed it.
And also, you're right, I'm confident about the role of each piece in the system but the terminology is fucked :) At least for jacks. I don't know how to improve it though, so I'm open to suggestion/wild brainstorming. On Jun 6, 2013 1:26 PM, "Moritz Sichert" notifications@github.com wrote:
As I understand your suggestions, you still mix up the concept of Jacks, as a gateway to the internet (similar to an ethernet interface) but without further knowledge of addresses and such, and Connections, as a layer on top of the Jack that is aware of addresses, clients, servers, etc. Specifically, this means the following:
- a Jack instance should "know" how to establish connections, so
- WriterJacks know how to establish connections, that send data
- ReaderJacks know how to establish (i.e. listen for) connections, that read data
- the Connection class actually cares about addresses, senders and receivers For the router this means, that it will never instantiate a Jack class directly but use the (yet to be implemented) registration system, that will look at the address, instantiate a Jack class and return a connection.
— Reply to this email directly or view it on GitHubhttps://github.com/campadrenalin/EJTP-lib-python/issues/66#issuecomment-19071776 .
Tell me, what you think about this mock-up!
I think jacks should get a dynamic registering structure like I explained for Frames in #65.
Furthermore it would make sense to be able to make Jacks read or write only. For example a tls-jack will only need a ca-certificate if it's read only but will also need a proper certificate and key if it's acting as a server.