JustinTimperio / warp-cli

A CLI tool designed to make interacting with Facebook's Open Source Library "Warp Speed Data Transfer" fast and pain-free.
MIT License

Some Observations #6

Open Teque5 opened 4 years ago

Teque5 commented 4 years ago

I've been trying to decide if this is ready for production use, and I have some thoughts.

If I were doing this, I would lay out warp-cli as a Python package like this, and in setup.py you would add an entry point:

from setuptools import setup

setup(
    # other stuff
    entry_points={
        # installs a `warp` command that calls main() in the warp module
        'console_scripts': [
            'warp = warp:main',
        ],
    },
)

Then you could simply run pip install warp-cli and exit early from the installation if the libraries you needed weren't found. Otherwise it would install warp normally in the user's ~/.local/bin or wherever pip places console scripts.
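A minimal sketch of the "exit early" idea, assuming the only hard requirement is an executable named `wdt` on the PATH. The helper names `missing_tools` and `check_or_exit` are hypothetical and not part of warp-cli. Note that pip only runs setup.py when it builds from a source distribution, so the same check may be more dependable if it is also called at the top of `main()`.

```python
import shutil
import sys


def missing_tools(required, which=shutil.which):
    """Return the names in `required` that cannot be found on PATH."""
    return [name for name in required if which(name) is None]


def check_or_exit(required=("wdt",)):
    """Abort with a readable message when a required executable is absent.

    The default ("wdt",) is an assumption about the WDT binary name.
    """
    missing = missing_tools(required)
    if missing:
        # sys.exit with a string prints it to stderr and exits with status 1
        sys.exit("warp-cli needs these tools on PATH first: " + ", ".join(missing))
```

Calling `check_or_exit()` before `setup(...)` would stop a source install before anything is copied; the `which` parameter exists only so the lookup can be replaced in tests.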

If you'd like I can create a PR that starts going in this direction, but I didn't want to attempt if you weren't interested.

JustinTimperio commented 4 years ago

Hi Teque5,

Thanks for the detailed observations. You bring up some interesting points which I think are worth considering. I find the idea of 'production-ready' interesting, but ultimately dangerous for some reasons I'll discuss below.

  1. I have only ever used Python 3, so I hadn't considered this. I agree that calling /usr/bin/env python3 in the shebang is a much better idea.

  2. I have not experienced this yet, but this is most likely an issue with the WDT-CLI itself. I designed Warp to gracefully exit during an internal error but it can't detect when an os.system() call is stuck. I would guess that Warp is building the command correctly but WDT itself hangs during an error. This could only be fixed by modifying the underlying WDT-CLI. I'll cover why this is a bad idea in 7.

  3. Basically the same issue as 2 but slightly more fixable. It would be pretty simple to loop over a target dir and look for any permission conflicts before a transfer. I'll look into this but I think there may be some unavoidable errors when transplanting directories between machines.

  4. This is an extremely good idea given how much version control is happening to make this work. I'll add some functionality for this.

  5. Same issue as 2 and 3. While it is possible to suppress the heartbeat and transfer updates, the WDT-CLI will always output logs during the transfer startup and close sequence.

  6. While I agree that it is quite painful, I do see this as a fundamental feature of Warp. Automating the build process across distros makes the transfer protocol far more accessible and easily deployable. I think asking for user input if they want to attempt an automatic install would be a reasonable compromise here.
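Points 2 and 3 above could be approached roughly as follows. This is a sketch under stated assumptions, not how Warp currently works: `find_unreadable` and `run_with_timeout` are hypothetical helpers, and the sketch swaps `os.system()` for `subprocess.run()`, whose `timeout` argument kills the child process and raises `subprocess.TimeoutExpired` when the limit is exceeded.

```python
import os
import subprocess


def find_unreadable(root):
    """Walk `root` and return the paths the current user cannot read.

    Directories also need execute permission to be traversed.
    """
    blocked = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.R_OK | os.X_OK if os.path.isdir(path) else os.R_OK
            if not os.access(path, mode):
                blocked.append(path)
    return blocked


def run_with_timeout(argv, timeout_s):
    """Run a command and return its exit code.

    Returns None if the command was killed for exceeding timeout_s.
    """
    try:
        return subprocess.run(argv, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None
```

A single wall-clock limit is a blunt tool for large transfers, so in practice the limit would probably need to scale with the payload size or be user-configurable. The permission scan only reflects the sending side; it cannot predict conflicts on the destination machine.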

On Proxies and Custom Firewalls in Production Environments

A few months ago I messaged one of the main contributors to WDT with some questions about how routes were negotiated. My machines all have multiple NICs and my plan was to build a solution to multi-cast my transfers over each NIC. Unfortunately, because WDT is its own protocol, they did some hacky things to negotiate the path automatically. If certain port ranges are blocked or WDT fails to negotiate a path after receiving a WDT URL, the process just hangs until the thread pool merge exits on a failure. Laurent Demailly messaged me the following:

it’s interesting yes i would have to look at the code again but believe this case didn’t come up before and for client socket we just use the default and let the OS route to the destination. you could setup some iptables or let the os continue to distribute client sockets across your physical interface however it does today (I expect it’s done using a hash of the source port, like SDN) or add an option in wdt client code to bind specifically

Because these routes are auto-negotiated, I have run into extensive issues transferring over and between production environments with network security. This issue can only be resolved by building a real WDT interface to the C++ lib. (See 7.)
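To illustrate what "bind specifically" means for a client socket, here is a minimal Python sketch. It is not WDT code and WDT exposes no such option according to the message above; `connect_from` is a hypothetical helper that relies on the standard `source_address` parameter of `socket.create_connection`.

```python
import socket


def connect_from(source_ip, dest_host, dest_port, timeout_s=5.0):
    """Open a TCP connection whose local end is bound to source_ip.

    Port 0 lets the OS pick any free source port on that address.
    """
    return socket.create_connection(
        (dest_host, dest_port),
        timeout=timeout_s,
        source_address=(source_ip, 0),
    )
```

Binding the source address alone may not be enough to force traffic out of a particular NIC, since the OS routing table still chooses the outgoing interface; policy routing or iptables rules, as the message suggests, would likely be needed alongside it.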

  7. This finally brings up the point that ultimately decides 2, 3, and 5. Warp, as I see it, is not a generic interface to WDT. Warp is built around generating commands for a relatively barebones CLI. Warp-CLI is a hacky wrapper around another hacky wrapper making calls to a C++ lib written in "moderately structured and encapsulated C". Even Facebook's docs note:

    While WDT is primarily a library, we also have a small command line tool which we use for tests and which is useful by itself.

To properly turn this into a pip package, a proper generic interface would need to be written to connect the C++ WDT lib to Python. This would expose the full functionality of WDT and would fundamentally solve all the issues you bring up. In this context, error handling, file permissions, routing, logging, and versioning could all be handled in a unified way. This would also add a dramatic degree of customization and functionality.

I would be very interested in developing a tool like this but I think some serious work would need to be done to understand how to properly use the C++ lib.

JustinTimperio commented 4 years ago

So I just released a new version that addresses some of the issues you brought up. https://github.com/JustinTimperio/warp-cli/tree/v2.1.3

Addressed so Far

Working On