Open ewels opened 3 years ago
This could work, but I fear it would introduce too many moving parts in NF execution, especially considering that we need to run in a portable manner across different platforms.
Above all, why would allowing port 22 be safer than port 80?
I mean, I think it should be possible to allow inbound traffic only when it targets your tower hostname, and to check that the HTTP request contains the expected auth header. I don't see how this could be weaker than allowing only ssh traffic.
For me personally, the selfish reason is that this would make installation easier. Opening custom ports and configuring firewalls requires lots of back and forth with sysadmins, additions to our existing security validations and all kinds of other overheads. This becomes even more extreme when working on a shared HPC system, as many are.
I appreciate that on your side, the simpler things are, the better. But it sounded like this would be a relatively simple tweak that could be opt-in? But of course I have no idea really 😄
Regarding the security aspect, maybe @pontus has some views on this? I am not really qualified to speculate.
Hrrm, now that I'm rereading this, I might not have completely understood the original query from discussing with Phil - I read it at the time as reaching a server on the remote system (the one being sshed to), whereas it now seems the thing desired is feedback from the remote system to the local (where the outgoing ssh is run).
If this is the case, -R is the one to use if one uses OpenSSH.
Nowadays, that also supports listening on UNIX-sockets remotely, which would be useful as one can use a random socket name (and get away from collision risk).
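To make the idea concrete, here is a rough sketch of such a reverse forward, written as a small Python helper that only builds the OpenSSH command line and does not run it. The login, host, port and socket directory are made-up placeholders, and the random socket naming is just one possible scheme. It assumes an OpenSSH version that accepts the `-R remote_socket:host:hostport` form.

```python
import uuid


def reverse_forward_args(login, tower_host, tower_port, socket_dir="/tmp"):
    """Build an ssh argv that exposes tower_host:tower_port on the remote
    side as a randomly named UNIX socket (hypothetical naming scheme)."""
    # A random name avoids collisions between users on a shared login node.
    remote_socket = f"{socket_dir}/tower-{uuid.uuid4().hex}.sock"
    # OpenSSH reverse forward: -R remote_socket:host:hostport
    spec = f"{remote_socket}:{tower_host}:{tower_port}"
    return ["ssh", "-R", spec, login], remote_socket


# Placeholder values, for illustration only.
args, sock = reverse_forward_args("user@hpc.example.org", "localhost", 8000)
print(" ".join(args))
```

Whether the client on the remote side can speak HTTP over a UNIX socket is a separate question; the TCP form `-R port:host:hostport` is built the same way.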
As for the security aspects, I agree that in theory, opening for egress shouldn't be as big of a problem. Administratively, it can still be quite an issue (and I know of at least one place where, while possible, one would certainly try to avoid the administration involved in getting that opened).
But I also note that the proposed tunneling (as I understand it now, i.e. primarily for connections from where nextflow is run to where the tower is run) makes it possible to run the tower server behind NAT/filter rules without specific passthroughs/openings. That would seem a big plus to me.
One downside of this of course is that any monitoring of pipelines that are manually launched on the HPC system would be impossible - workflows would have to be launched via Tower to work with Tower.. 🤔
I don't see why - supporting this would reasonably be possible by providing an alternate URL/address to connect to.
But Tower would need to have an active SSH session at all times, right? I'm assuming that it only opens these when running a workflow, so if all workflows ended then there would be no active SSH sessions to tunnel.
It would need an active connection for feedback to work, yes. I don't see any reason it couldn't keep an active connection, though. (I'm not sure if ssh in tower is implemented on libraries or using externals like OpenSSH, but it's perfectly possible to have just a forwarding without any shell process, OpenSSH has e.g. -N for this).
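As a sketch of what a long-lived, forwarding-only connection could look like with the OpenSSH client, the helper below builds a command line using -N plus two standard client options (ServerAliveInterval and ExitOnForwardFailure). The function name, the login and the forward spec are placeholders, and nothing here says how Tower actually implements its ssh connection.

```python
def forward_only_args(login, forward_spec, keepalive_seconds=30):
    """Build an ssh argv that keeps a reverse forward open without
    starting a shell or remote command (illustrative helper)."""
    return [
        "ssh",
        "-N",  # do not execute a remote command, forwarding only
        "-o", f"ServerAliveInterval={keepalive_seconds}",  # detect dead links
        "-o", "ExitOnForwardFailure=yes",  # fail fast if the forward is refused
        "-R", forward_spec,
        login,
    ]


# Placeholder values, for illustration only.
print(" ".join(forward_only_args("user@hpc.example.org", "8000:localhost:8000")))
```

A supervising process would still be needed to restart the command if the connection drops.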
This feature can be useful, however at this time Tower does not maintain a state or an active agent on the computing node(s), therefore it's not an easy task to implement this. Not sure we are providing this in the short term.
Ok, thanks @pditommaso 👍🏻
Does all of this web traffic go to a single Tower URL? I'm thinking that we could open up the IP range but only allow it to hit a single URL. This doesn't help with needing to configure the HPC for outgoing traffic, but does help negotiations with the receiving end.
Phil
Yes, NF only uses https://your-host/api/trace/*. I think you can also filter requests by checking that they contain the header Authorization: Bearer xyz.
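For what it's worth, the kind of check described here could be sketched roughly as follows. This is only an illustration of the filtering logic in Python, not how any particular proxy or firewall is configured; the function name is made up, and the expected token would be whatever access token the deployment uses.

```python
import hmac

# Path prefix taken from the comment above; everything else is illustrative.
TRACE_PREFIX = "/api/trace/"


def is_allowed(path, headers, expected_token):
    """Return True only for trace requests carrying the expected bearer token."""
    if not path.startswith(TRACE_PREFIX):
        return False
    # Assumes the caller has normalised header names to this exact casing.
    auth = headers.get("Authorization", "")
    scheme, _, token = auth.partition(" ")
    if scheme != "Bearer" or not token:
        return False
    # Constant-time comparison to avoid leaking token contents via timing.
    return hmac.compare_digest(token.encode(), expected_token.encode())


print(is_allowed("/api/trace/abc/progress", {"Authorization": "Bearer xyz"}, "xyz"))
print(is_allowed("/login", {"Authorization": "Bearer xyz"}, "xyz"))
```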
Background
When using Nextflow Tower to launch runs, it uses an ssh login to access the server. Once the workflow is running, status updates are sent back to Tower via HTTP requests.
To either host Tower behind a strict firewall or use an HPC system that is locked down, two things have to be done: setting up an ssh account and opening ports.
Suggestion
As we already have an active ssh session, we could tunnel web traffic for Tower over specific ports using the ssh -L or ssh -D options. To quote @boulund and @pontus:
My hope would be that Tower could use these options in its SSH connection to tunnel the nextflow monitoring traffic and remove the need to open the tower server / cluster to general web traffic.