[RFC] cli: specify endpoints in a SSH-like way

meejah / fowl

Forward over Wormhole: streams over magic-wormhole Dilation connections

MIT License

[RFC] cli: specify endpoints in a SSH-like way #11

Closed balejk closed 9 months ago

balejk commented 1 year ago

This is more of a proof-of-concept implementation of a simple command line interface modelled on SSH. It allows the invoker to specify binding commands on the command line, which are then executed right after the connection is established. It is still possible to specify more bindings via JSON commands afterwards.

One weak spot might be that it assumes the format of the arguments to be proto:port:proto:address:port, while I believe that the JSON way allows for more formats. This was done to preserve the similarity to SSH; however, it will probably make more sense to move the format away from it, for instance by separating the endpoints with some character other than a colon, or by splitting them into two CLI flags in parallel with what the JSON interface does. Once this is resolved, help and format hints should be added to the CLI options.
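To make the weak spot concrete, here is a minimal sketch of a parser for the proto:port:proto:address:port form. It is not the code from this PR; `ForwardSpec` and `parse_forward_spec` are hypothetical names, and the meaning assigned to each field (listen side first, connect side second) is an assumption.

```python
from typing import NamedTuple


class ForwardSpec(NamedTuple):
    """One forwarding request (hypothetical type, for illustration only)."""
    listen_proto: str
    listen_port: int
    connect_proto: str
    connect_host: str
    connect_port: int


def parse_forward_spec(spec: str) -> ForwardSpec:
    """Parse 'proto:port:proto:address:port' by splitting on colons."""
    parts = spec.split(":")
    if len(parts) != 5:
        raise ValueError(f"expected proto:port:proto:address:port, got {spec!r}")
    listen_proto, listen_port, connect_proto, connect_host, connect_port = parts
    # int() raises ValueError for a non-numeric port
    return ForwardSpec(
        listen_proto, int(listen_port), connect_proto, connect_host, int(connect_port)
    )
```

Because the fields are colon-separated, an address that itself contains colons (an IPv6 literal such as ::1) no longer splits into five parts and is rejected, which is one reason a different separator or two separate flags may be preferable.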

meejah commented 1 year ago

Thanks!

Quick additional idea as I'm thinking about it: this could be structured as a separate CLI space (i.e. to keep these options from "polluting" invite/accept) and simply start the underlying fowl invite (or fowl accept) and feed it JSON commands. Maybe "fowlssh" or something? Another possible advantage here: the "forwarding" options will be the same whether you're "invite"-ing or "accept"-ing
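A rough sketch of the "start the underlying program and feed it JSON commands" idea follows. It assumes the child reads one JSON object per line on stdin, which is an assumption about the protocol framing; `run_with_commands` is a hypothetical helper, and the demo uses a stand-in child that simply echoes its input rather than fowl itself.

```python
import json
import subprocess
import sys


def run_with_commands(argv, commands):
    """Run argv as a subprocess, send each command as one JSON line, return stdout.

    Simplification: stdin is closed after the last command. A real wrapper
    would keep the pipe open for as long as the forwards should live.
    """
    payload = "".join(json.dumps(command) + "\n" for command in commands)
    result = subprocess.run(
        argv, input=payload, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in child that echoes stdin; the message keys below are hypothetical.
    echo_child = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
    print(run_with_commands(echo_child, [{"kind": "example-command"}]), end="")
```

With fowl installed, argv would be something like `["fowl", "invite"]`, with the commands built from the SSH-style options.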

Overall, I like the idea a lot: opens it up for direct human use. (The reason I wrote this was for a GUI piece I'm working on, so was concentrating on this as being used via "some other software" by subprocess).

balejk commented 1 year ago

Thank you for your feedback!

Quick additional idea as I'm thinking about it: this could be structured as a separate CLI space (i.e. to keep these options from "polluting" invite/accept) and simply start the underlying fowl invite (or fowl accept) and feed it JSON commands. Maybe "fowlssh" or something?

That was somewhat my original idea, the "feed it the JSON commands" part. What I did was the next simplest thing after having to run fowl as a subprocess (I definitely didn't want to do any hacks like have it send the commands to itself via stdin). It did not really occur to me that running it as a subprocess is an option.

I understand your point about polluting the CLI space, but I am thinking that it might be good to first decide how the program should be structured. This is just an idea, but if you wanted to have a separate executable for the user interface (such as these SSH-like flags, so for instance fowlssh as you suggest), then perhaps it would make sense to move the other options there as well and have the fowl command communicate exclusively via the JSON commands, no? That would cover even allocating the code and other operations.

I was also thinking that it might be good to drop the accept/invite distinction altogether, as you have mentioned elsewhere that it's really a symmetric operation and you have only defined these commands because it may be more intuitive to the user (the code is really the same anyway). It would then be possible to just run fowl and specify the code via the JSON interface or request its allocation.

If this hard split was indeed to happen, then it might also be worth considering whether you want to ship any human-use wrappers or just concentrate on the JSON interface and let users write wrappers fitting their needs (possibly collecting some common ones in, say, a contrib subdirectory - fowlssh should really be just a few lines of shell script) - this would definitely require the documentation to be updated first.

My thoughts were also in the direction that there may be a flag disabling the "machine" (JSON) interface and making fowl behave like a more human-friendly application (you are even mentioning a TUI in your blog post) and vice-versa. But that would definitely complicate the program a lot I think, so the "wrappers" way seems more sensible to me now.

Another possible advantage here: the "forwarding" options will be the same whether you're "invite"-ing or "accept"-ing

Are they not already?

Overall, I like the idea a lot: opens it up for direct human use. (The reason I wrote this was for a GUI piece I'm working on, so was concentrating on this as being used via "some other software" by subprocess).

Yes, my motivation was that I still struggle to put together the correct JSON command :D. But really, your feedback got me thinking about whether this really belongs here, and whether writing wrappers should rather be left to the users, or at least whether they should be shipped separately from the main program.

meejah commented 1 year ago

It's totally possible to ship two (or more) CLI programs/commands in one binary.

Your points above are good, too (i.e. perhaps only having "fowl" + JSON, or the "nicer" front-end).

It could still be "one" command (e.g. a fowl daemon is the JSON-accepting subcommand, maybe).

Let me think about this a little further and sketch out a skeleton? I'm not concerned with "backwards compatibility" at this point. I do think it makes some sense to ship the "nice" CLI with this program (although a TUI is probably a separate codebase?). I did actually start a TUI (as well as a GUI) in Haskell.

So, if you're happy to keep sketching in this PR I will merge something-like-it, the only question being how exactly it's spelled (i.e. where the options live, and if there's two entry-points in this codebase or just one).

To clarify:

Another possible advantage here: the "forwarding" options will be the same whether you're "invite"-ing or "accept"-ing

Are they not already?

Yes. All I meant was that, to me, this means the flags should live at the "top level", as in fowl -L ... accept, instead of fowl accept -L ...
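For illustration, "top level" flags shared by both subcommands could look like the following argparse sketch. This is not how fowl's CLI is actually implemented; the positional `code` argument on accept is an assumption, while the -L / --local and -R / --remote spellings are the ones mentioned at the end of this thread.

```python
import argparse


def build_parser():
    """Forwarding flags on the top-level parser, so invite and accept share them."""
    parser = argparse.ArgumentParser(prog="fowl")
    parser.add_argument("-L", "--local", action="append", default=[], metavar="SPEC")
    parser.add_argument("-R", "--remote", action="append", default=[], metavar="SPEC")
    subcommands = parser.add_subparsers(dest="command", required=True)
    subcommands.add_parser("invite")
    accept = subcommands.add_parser("accept")
    accept.add_argument("code")  # assumption: accept takes the wormhole code
    return parser
```

With this layout the forwarding options are parsed before the subcommand name, so `fowl -L SPEC accept CODE` and `fowl -L SPEC invite` go through exactly the same option handling.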

(I'm leaning towards making fowl the "user-friendly" program, and putting the JSON-accepting thing into like fowl-daemon or something? But, will think more on it today ... thanks!)

meejah commented 1 year ago

Regarding fowl accept vs. fowl invite: that's already really a "human" thing too (so I do like your suggestion to move that to a JSON command for the command-accepting thing).

balejk commented 1 year ago

It could still be "one" command (e.g. a fowl daemon is the JSON-accepting subcommand, maybe).

Yes, that's what I meant with the CLI flag, but I guess a subcommand would be cleaner as it would be a major behavioral change.

Let me think about this a little further and sketch out a skeleton? I'm not concerned with "backwards compatibility" at this point. I do think it makes some sense to ship the "nice" CLI with this program (although a TUI is probably a separate codebase?). I did actually start a TUI (as well as a GUI) in Haskell.

Sure, I will be happy to discuss it if you want to. I really opened this PR more to open a discussion rather than use the code as is, although for my needs it's currently sufficient.

Ad the TUI: I am not sure that I perceive it as necessary (I haven't looked at your Haskell code yet though). However, what is next on my wishlist is some permissions system, in the context of which I believe you originally proposed the TUI in your blog post. On the other hand, this should be doable using the plain JSON interface as well, no? And probably even necessary if the TUI would not be part of the program.

So, if you're happy to keep sketching in this PR I will merge something-like-it, the only question being how exactly it's spelled (i.e. where the options live, and if there's two entry-points in this codebase or just one).

Of course, please feel free to edit this PR as you see fit.

To clarify:

Another possible advantage here: the "forwarding" options will be the same whether you're "invite"-ing or "accept"-ing

Are they not already?

Yes. All I meant was that, to me, this means the flags should live at the "top level", as in fowl -L ... accept, instead of fowl accept -L ...

Sorry, I'm still unsure whether you were talking about your idea or my proposal. In my proposal, the flags already live at top level, so are the same for accept and invite.

(I'm leaning towards making fowl the "user-friendly" program, and putting the JSON-accepting thing into like fowl-daemon or something? But, will think more on it today ... thanks!)

I still rather have the idea with wrappers in mind and fowl being just the JSON "daemon". And personally I currently prefer having this named fowl and using different names for the wrappers, as the "daemon" would still be the heart of the project. But it's completely up to you of course.

Speaking of daemons, IPC via socket might also be a nice future feature.

Regarding fowl accept vs. fowl invite: that's already really a "human" thing too (so I do like your suggestion to move that to a JSON command for the command-accepting thing).

I will try to sketch some of my ideas regarding this in another PR if I can find the time. But basically what I originally imagined is that you could either run just fowl, optionally with the --code-length flag, which would allocate a free code, or you could run fowl <code>, which would allocate this code or try to establish a connection if the nameplate was already in use. So here I wasn't really thinking about doing it completely via JSON, because it's something that you would only do once, when running the program. Of course, if you wanted to have the ability to use multiple wormhole codes for multiple connections (with different hosts), then it might make sense to make it part of the JSON interface. But really, even in this case, I would prefer just running fowl multiple times, else it might get complicated really soon.
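The invocation described here could be sketched with argparse as follows. This only illustrates the proposal and is not existing fowl behaviour; `plan` is a hypothetical helper, and the default code length of 2 is an assumption.

```python
import argparse


def build_parser():
    """'fowl' allocates a code; 'fowl <code>' uses the given one."""
    parser = argparse.ArgumentParser(prog="fowl")
    parser.add_argument("code", nargs="?", default=None,
                        help="wormhole code to use; allocate a new one if omitted")
    parser.add_argument("--code-length", type=int, default=2,
                        help="number of words in a newly allocated code")
    return parser


def plan(args):
    """Decide what to do at startup from the parsed arguments."""
    if args.code is None:
        return ("allocate", args.code_length)
    return ("use", args.code)
```

Whether "use" ends up claiming the nameplate or joining an existing session is decided later by the connection attempt, not by the argument parser.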

meejah commented 1 year ago

However what is next on my wishlist is some permissions system [..]

Yes, I'd like permissions, and yes I imagine it would be some additional JSON messages (because if there is a GUI in front, it will likely want to interact with the user, like "X wants to listen locally on TCP:8080, okay?" or whatever).

For the CLI-based thing I would imagine that one side would include relevant arguments to set up a listener, while the other side could have args specifying what's allowed there (e.g. "--allow-listen 8080" or similar ... or even "--allow-all" which is what the situation is now).
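As a sketch of how such flags might be checked, assuming the hypothetical --allow-listen and --allow-all options floated above (neither exists at this point in the discussion) and a hypothetical `listen_allowed` helper:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="fowl")
    # Repeatable: --allow-listen 8080 --allow-listen 2222
    parser.add_argument("--allow-listen", action="append", type=int, default=[],
                        metavar="PORT")
    parser.add_argument("--allow-all", action="store_true")
    return parser


def listen_allowed(args, port):
    """True if the peer may ask us to listen on this local port."""
    return args.allow_all or port in args.allow_listen
```

In this sketch, passing neither flag denies every request, whereas the current behaviour described above corresponds to --allow-all.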

Speaking of daemons, IPC via socket might also be a nice future feature.

Yes ... although there are some weird security considerations there, and the "mission" of fowl is to let programmers from any language easily play with wormhole + Dilation (so I figured that "stdin + stdout" is easier than opening sockets). The "weird security considerations" are that it's easier to think about the permissions of stdin/out for a subprocess -- if fowl opened a localhost:1234 listener (for example), malicious local programs could race it (i.e. listen on 1234 first, possibly tricking the "parent" program).

That said, a unix-socket could work well here. (and just speak the same JSON protocol).
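A minimal sketch of that idea, assuming the protocol is one JSON object per line in each direction (an assumption about the framing); `listen_unix` and `handle_connection` are hypothetical names:

```python
import json
import os
import socket


def listen_unix(path):
    """Bind a Unix stream socket at path, created without group/other access."""
    old_umask = os.umask(0o177)  # restrict permissions on the new socket file
    try:
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
    finally:
        os.umask(old_umask)
    server.listen(1)
    return server


def handle_connection(conn, handle_command):
    """Read JSON lines from conn until EOF, answering each with one JSON line."""
    with conn, \
            conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for line in reader:
            reply = handle_command(json.loads(line))
            writer.write(json.dumps(reply) + "\n")
            writer.flush()
```

A caller would loop over `server.accept()` and pass each connection to `handle_connection`; since the same JSON lines flow either way, a socket-to-stdio translator stays thin.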

if you wanted to have the ability to use multiple wormhole codes for multiple connections (with different hosts), then it might make sense to make it part of the JSON interface. But really, even in this case, I would prefer just running fowl multiple times, else it might get complicated really soon.

Yes, :100:, on the same page here: I do have use-cases for multiple connections, but would launch multiple fowl instances. One program, one connection is easy to think about :)

p.s. if you have specific use-cases it would be great to write them down!

meejah commented 1 year ago

Hopefully https://meejah.ca/blog/wizard-gardens-vision puts this in somewhat more context?

balejk commented 1 year ago

Interesting blog post -- do you have an RSS feed set up for the blog?

For the CLI-based thing I would imagine that one side would include relevant arguments to set up a listener, while the other side could have args specifying what's allowed there (e.g. "--allow-listen 8080" or similar ... or even "--allow-all" which is what the situation is now).

Hmm, so for the CLI app you would need to know all allowed ports beforehand? If so, it might be another argument for sockets: instead of keeping fowl as a subprocess, it could print its socket path and live on its own. The CLI might then just need to be invoked with the path to the socket (which could be made into an environment variable, à la ssh-add) and it could be run multiple times, each time connecting to the same socket.
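The lookup order could be as simple as the sketch below, analogous to how ssh-add finds its agent through SSH_AUTH_SOCK. The variable name FOWL_SOCK and the helper `resolve_socket_path` are hypothetical.

```python
import os

ENV_VAR = "FOWL_SOCK"  # hypothetical name, by analogy with SSH_AUTH_SOCK


def resolve_socket_path(cli_value=None, environ=os.environ):
    """Prefer an explicit command-line path, then the environment, else None."""
    if cli_value:
        return cli_value
    return environ.get(ENV_VAR)
```

A return value of None would mean "no running instance known", at which point a wrapper could fall back to starting its own subprocess.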

Yes ... although there are some weird security considerations there, and the "mission" of fowl is to let programmers from any language easily play with wormhole + Dilation (so I figured that "stdin + stdout" is easier than opening sockets). The "weird security considerations" are that it's easier to think about the permissions of stdin/out for a subprocess -- if fowl opened a localhost:1234 listener (for example), malicious local programs could race it (i.e. listen on 1234 first, possibly tricking the "parent" program).

That said, a unix-socket could work well here. (and just speak the same JSON protocol).

I in no way meant a network socket -- that I would be paranoid about too. I really meant just a unix socket. And indeed, that it would speak the same JSON protocol was also what I had in mind. It would solve, for instance, what I mentioned above, plus it's way more natural to have a wrapper translating the socket communication to stdin/out if needed than the other way around.

On the other hand, this could introduce some portability restrictions which would not be desirable. Does fowl run on Windows at the moment, actually?

p.s. if you have specific use-cases it would be great to write them down!

I would eventually like to use fowl to interconnect my devices which do not have a public IP and some of them are constantly on the move. In particular, I would mainly like to tunnel SSH through fowl. I am hoping that some NAT traversal/hole punching might be implemented into Magic Wormhole eventually, as SSH traffic routed through the transit relay does not perform optimally even with compression enabled. At the same time, I used the Rust implementation (for file transfer, not port forwarding) which does this and it was really successful in getting a direct connection.

meejah commented 1 year ago

Interesting blog post -- do you have an RSS feed set up for the blog?

Yes, https://meejah.ca/atom.xml -- should link it somewhere! (When I did that, Firefox would show an RSS icon in the url-bar .. sadly that is gone now).

meejah commented 1 year ago

Hmm, so for the CLI app you would need to know all allowed ports beforehand

Yeah, in my mind there it was like "human CLI runs the daemon" (and communicates via stdin / stdout).

I don't particularly care about Windows myself, but it would be best if it worked on more systems -- so that points to having the daemon as the "lowest level", with a unix-socket wrapper on top. (This could instead simply be an option to the daemon: listen on unix-socket, or on stdin/stdout -- most general for unix-y OSes I guess would be to enable passing filedescriptors to it).

So, I actually like both of these ideas!

Let's think out loud, then, and assign some provisional names (edit: switched program names!)

fowl-daemon: aka "the daemon", the thing we have now basically that "does" the wormhole stuff and speaks a JSON protocol on stdin/stdout (but minus "fowl accept" etc -- it's JUST the daemon). Might grow a "listen on unix-socket path" option to facilitate the below.

fowl: runs fowl-daemon as a subprocess and listens itself on a unix socket. Could accept a --daemon /path/to/unix/socket option, in which case it speaks through a copy of fowl-daemon. Otherwise, it runs its own fowl-daemon subprocess on stdin/out. This is where the "ssh-style" port options and stuff that you're interested in live. So, if you give it "--daemon ..." then you can have the "oops, I changed my mind and want another port opened" features. If you don't, then it's one-shot (and yes, you'd have to decide which ports are okay to open etc).

Off the top of my head, I don't have a good way to do the "permissions" side for a CLI like this. I suppose a naive take would be to simply ask "y/n" style questions on the fowl stdin (e.g. "other side wants to listen on TCP port 8080, okay (y/n)?").
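The naive take might look like this sketch; `ask_permission` is a hypothetical helper, and the `ask` parameter exists only so the prompt can be replaced by something other than the terminal.

```python
def ask_permission(question, ask=input):
    """Keep asking until the user gives a clear yes or no; return it as a bool."""
    while True:
        answer = ask(f"{question} (y/n)? ").strip().lower()
        if answer in ("y", "yes"):
            return True
        if answer in ("n", "no"):
            return False
        # anything else: ask again


if __name__ == "__main__":
    if ask_permission("other side wants to listen on TCP port 8080, okay"):
        print("allowed")
    else:
        print("denied")
```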

What about GUI support? Originally I was thinking a GUI would just run "fowl" and interact via stdin/out -- but with the above tools, it could instead be given the path to the unix socket --- which of course opens up a bit of a can of worms we glossed over above: if you have a socket, more than one client can connect (which isn't really true of stdin/out).

For most commands, that's fine. Maybe it's also fine for "permissions" (you basically are just racing then: whichever client answers the question first "wins"?) Still, may take some thinking when finalizing the JSON API if it can have >1 clients.
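The "whichever client answers first wins" rule can be sketched as a small piece of state per outstanding question. `PermissionRequest` is a hypothetical class, and the lock is only needed if clients are served from different threads.

```python
import threading


class PermissionRequest:
    """One outstanding question; only the first answer is recorded."""

    def __init__(self, question):
        self.question = question
        self._lock = threading.Lock()
        self._decision = None  # becomes (client_id, allowed) once answered

    def answer(self, client_id, allowed):
        """Record the answer if none exists yet; return whether this one won."""
        with self._lock:
            if self._decision is not None:
                return False
            self._decision = (client_id, allowed)
            return True

    @property
    def decision(self):
        return self._decision
```

Later answers are ignored rather than treated as errors, so a slower client could simply be told that the question has already been settled.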

meejah commented 1 year ago

Use-case: "I would eventually like to use fowl to interconnect my devices which do not have a public IP and some of them are constantly on the move. "

This sounds like a great concrete use-case! Want to highlight it here, so it's not lost in the noise as much :)

meejah commented 1 year ago

NAT traversal/hole punching ...

I would certainly take PRs attempting this! (Fun fact: I actually used to work on a VoIP appliance that did this -- we thought IPv6 would be long since here by this point, making it obsolete -- sadly not the case!)

It seems like the 'state of the art' here is maybe WebRTC .. although unfortunately that suffers from being pretty browser+js specific (and complex, overall: several RFCs worth!). That said, for simple cone NATs there's some easy tricks that work well (I believe this is what the Rust implementation does?)

Anyway, definitely punching NAT pinholes to increase the chances of a peer-to-peer connection would be a good "win" for users (when it works) -- they can always fall back to transit-relay so no worse than currently!

balejk commented 1 year ago

NAT traversal/hole punching ...

I would certainly take PRs attempting this! (Fun fact: I actually used to work on a VoIP appliance that did this -- we thought IPv6 would be long since here by this point, making it obsolete -- sadly not the case!)

It's on my mind that I would like to try to implement it some time, but sadly I don't think it will be any time soon.

It seems like the 'state of the art' here is maybe WebRTC .. although unfortunately that suffers from being pretty browser+js specific (and complex, overall: several RFCs worth!). That said, for simple cone NATs there's some easy tricks that work well (I believe this is what the Rust implementation does?)

I would definitely want to avoid WebRTC for the reasons that you mention. I believe that indeed the Rust implementation does something simpler (there is even some discussion in the MW docs) and I would like to follow its lead if I was to implement it.

Anyway, definitely punching NAT pinholes to increase the chances of a peer-to-peer connection would be a good "win" for users (when it works) -- they can always fall back to transit-relay so no worse than currently!

Definitely. Plus less load on the relay and higher transfer speeds.

balejk commented 1 year ago

I don't particularly care about Windows myself, but it would be best if it worked on more systems -- so that points to having the daemon as the "lowest level", with a unix-socket wrapper on top. (This could instead simply be an option to the daemon: listen on unix-socket, or on stdin/stdout -- most general for unix-y OSes I guess would be to enable passing filedescriptors to it).

I pretty much only care about Windows for the situations where the durability fails (which has been the case for me almost always so far), because then I could ask someone to patch me back through for at least long enough to re-run fowl. This is also the reason why I would like to implement at least some basic permissions soon.

Let's think out loud, then, and assign some provisional names:

fowl: aka "the daemon", the thing we have now basically that "does" the wormhole stuff and speaks a JSON protocol on stdin/stdout

fowl-daemon: runs fowl as a subprocess and listens itself on a unix socket.

fowl-tunnel: accepts a --daemon /path/to/unix/socket option, in which case it speaks through a copy of fowl-daemon. Otherwise, it runs its own fowl. This is where the "ssh-style" port options and stuff that you're interested in live. So, if you give it "--daemon ..." then you can have the "oops, I changed my mind and want another port opened" features. If you don't, then it's one-shot (and yes, you'd have to decide which ports are okay to open etc).

Personally, I would probably go the flags way for enabling the socket, rather than a different command, simply because I don't like having multiple executables for similar things. But the way you suggest may be far more straightforward to implement given how fowl is built (i.e. feeding it stdio rather than propagating CLI options to the program's core, as we have already discussed above).

I agree with fowl-tunnel, but I think we should focus on the other two for now, as anybody can throw this one together rather quickly to fit their specific needs (at least once the documentation is updated) -- I myself plan to write a bash wrapper around fowl to emulate the SSH options suggested in this PR (I will then probably change this PR to add this wrapper to contrib, if you agree).

Off the top of my head, I don't have a good way to do the "permissions" side for a CLI like this. I suppose a naive take would be to simply ask "y/n" style questions on the fowl-tunnel stdin (e.g. "other side wants to listen on TCP port 8080, okay (y/n)?").

Yes, that's something along the lines of what I was thinking for now and I will probably try to do a proof of concept of that.

What about GUI support? Originally I was thinking a GUI would just run "fowl" and interact via stdin/out -- but with the above tools, it could instead be given the path to the unix socket --- which of course opens up a bit of a can of worms we glossed over above: if you have a socket, more than one client can connect (which isn't really true of stdin/out).

For most commands, that's fine. Maybe it's also fine for "permissions" (you basically are just racing then: whichever client answers the question first "wins"?) Still, may take some thinking when finalizing the JSON API if it can have >1 clients.

I think GUI could behave in the same way that you mention for fowl-tunnel.

Regarding the multiple connections to the socket, it also occurred to me that this might be a problem, and my thoughts are exactly what you write.

Do you imagine one fowl instance could be used by multiple users? Like, say, a system daemon to which users of a given machine could connect via a shared socket and request forwards to some other machine? I do not consider that a good idea; it seems way better if, again, everybody just runs their own instance. In that case, I think we would not necessarily have to try to generalize the JSON API for more clients, and could probably even rely on the user not attempting to make multiple connections (or, if they do, on them knowing what they are doing and what implications it has; this would be good to mention in the docs). Better still, we could try to somehow ensure that there is only one connection at a time. That could add some complexity unrelated to fowl's main purpose, which would then point in the direction of having the socket-stdio translation in a separate codebase, as you suggest.

balejk commented 1 year ago

Use-case: "I would eventually like to use fowl to interconnect my devices which do not have a public IP and some of them are constantly on the move. "

This sounds like a great concrete use-case! Want to highlight it here, so it's not lost in the noise as much :)

Let me add that being able to reboot the devices in the meantime (or carry two of them offline for some time during which the mailbox server would drop the session) would be a very welcome addition as well, so I'm really looking forward to Seeds.

meejah commented 1 year ago

Do you imagine one fowl instance could be used by multiple users?

Yes, but moreso by "multiple softwares run by one user" [*].

I brought this up in the context of "why not listen on a socket?" and one reason is "because then multiple things can connect" (you could always reject >1 connections, I suppose, but ... then what if the "intended" software is out-raced by some other). Anyway, lots more issues to consider here -- even if it's a user-owned Unix socket.

[*] -- (To expand on the above) per https://meejah.ca/blog/wizard-gardens-vision it would make sense to have "something like" fowl running (once) by a user, and other applications ("the glue code" from the blog post) use it to set up (or tear down) forwarded streams.

That is, there's ONE "trusted" application that "knows how to wormhole" (including Seeds) and other things can integrate with it via local connections (unix, tcp, whatever) so that the user can place their "connect to peers" trust in this one software. Maybe this model isn't perfect, but I think it's a step better than having every application try to integrate magic-wormhole (for example).

meejah commented 10 months ago

Sorry for the giant delay, I'm taking a stab here https://github.com/meejah/fowl/pull/16 and starting from the "other side", i.e. writing the documentation first...

balejk commented 10 months ago

Sorry for the giant delay

No problem!

I'm taking a stab here https://github.com/meejah/fowl/pull/16

I will take a look.

and starting from the "other side", i.e. writing the documentation first...

I think this is a great approach -- I was initially quite confused by the documentation when I first started experimenting with this project as it matched the actual code (such as the JSON messages) very little.

meejah commented 9 months ago

okay, i believe this is represented in the latest release via the --local / -L and --remote / -R options for the human version.

Thanks for the discussion and ideas!

balejk commented 9 months ago

Thanks for the discussion and ideas!

It was my pleasure, thank you too :-)