Closed takluyver closed 8 years ago
I agree that using filenames without tab completion can be really awkward in some situations, and I thought of using tokens, too. I went with only filenames for now because:
- […] `Ask your friend` line might be annoying)
- […] `cp`/`wget`-like command line interface
- […] `aaaa`, potentially compromising security

However, I would definitely be willing to implement a token system if it played nicely with these principles and were an optional additional feature.
My first idea was to allow a second `zput` argument: `zput my_pics.zip AFGD`, which would then be used with `zget AFGD` or `zget AFGD foo.zip`. That would have a nice CLI API (both commands work like `cp`), but it could be a bit nasty because […] `AAAA` […]
Just to give you a broad overview of how the Zeroconf peer finding works: there is no initiator. Both parties initiate their part of the procedure, no matter whether the other has already done theirs. After both have done the Zeroconf procedure, the recipient will at some point pick up the broadcast from the sender.
From this broadcast the recipient will extract the peer's IP address and port and simply do an `http://<ip>:<port>/<filename>` request.
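To make the last step concrete, here is a minimal sketch of how the recipient could build that request URL once it has the peer's IP address and port. The function name `build_request_url` and the percent-encoding of the filename are my assumptions for illustration, not necessarily what zget does internally.

```python
from urllib.parse import quote


def build_request_url(ip: str, port: int, filename: str) -> str:
    """Build the URL the recipient requests after discovering the sender.

    Hypothetical helper: percent-encoding the filename is an assumption,
    made so that spaces and other special characters survive the request.
    """
    return f"http://{ip}:{port}/{quote(filename)}"


# Example with made-up address and port values:
print(build_request_url("192.168.0.17", 8080, "my holiday pics.zip"))
# http://192.168.0.17:8080/my%20holiday%20pics.zip
```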
This means that when you give either party the possibility to create a transfer token, you will potentially have a filename and two tokens to deal with (no problem, just something to be aware of).
I agree that four characters (base 32, case insensitive) should be sufficiently safe and seem a pretty good fit for our use case. Generally, though, I would prefer to advertise the SHA1 of filenames and tokens on the network. When a match is found, the recipient may then choose to request either a token (like `http://<ip>:<port>/<token>`) or a filename (like `http://<ip>:<port>/<filename>`) from the sender and then be handed the file.
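A rough sketch of the hashed-advertisement idea, assuming the sender publishes the SHA1 hex digests of both the filename and the token, and the recipient hashes whatever the user typed to look for a match. The helper names are hypothetical; only `hashlib.sha1` is a real library call.

```python
import hashlib


def sha1_hex(name: str) -> str:
    """SHA1 hex digest of a filename or token (UTF-8 encoded)."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()


def advertised_hashes(filename: str, token: str) -> set:
    """What the sender would broadcast: hashes only, never the plain names."""
    return {sha1_hex(filename), sha1_hex(token)}


def is_match(wanted: str, advertised: set) -> bool:
    """Recipient side: hash the name the user typed and compare."""
    return sha1_hex(wanted) in advertised


broadcast = advertised_hashes("my_pics.zip", "AFGD")
print(is_match("AFGD", broadcast))         # True
print(is_match("my_pics.zip", broadcast))  # True
print(is_match("other.zip", broadcast))    # False
```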
Thanks, that makes sense. A couple of refinements to the idea:
- […] `--porcelain`, for instance, and Jupyter has `--json` for some commands. […] `ls` does it), though I prefer having an explicit flag for machine readable output.
- […] `my_holiday_pics_1.zip` […]
My mistake on the implementation - looking at the examples, I assumed that whichever one you started first was creating an 'advertisement' of some kind which the second one responded to. On looking at the code some more, I realise that it's always the sender that advertises and the recipient that responds (by making an HTTP request). I like this design better than what I thought was going on :-).
I'm not convinced about using sha1 hashes as a security measure. A malicious actor on the network could easily generate a mapping of sha1 hashes to possible tokens or common filenames. When they see a broadcast of a sha1 hash, they could then quickly look up the corresponding token/filename and make a request for that. Security probably isn't a major design goal, as it's for trusted networks, but if there's a shared secret in the architecture, I'd rather it had no relationship to the advertised data at all.
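To illustrate the concern, here is a small sketch of how little work it takes to recover a short token from its advertised SHA1. It assumes 4-character tokens drawn from the RFC 4648 base 32 alphabet (my choice for the example; the thread does not fix an alphabet) and simply tries every candidate, which is at most 32^4 = 1,048,576 hashes.

```python
import hashlib
from itertools import product
from typing import Optional

# Assumed alphabet: RFC 4648 base 32 (A-Z and 2-7).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def sha1_hex(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def recover_token(advertised_hash: str, length: int = 4) -> Optional[str]:
    """Try every possible token until one hashes to the advertised value."""
    for chars in product(ALPHABET, repeat=length):
        candidate = "".join(chars)
        if sha1_hex(candidate) == advertised_hash:
            return candidate
    return None


# An observer who only sees the hash can still get the token back:
print(recover_token(sha1_hex("AFGD")))  # AFGD
```

The same loop works over a list of common filenames instead of generated tokens, which is why a secret that is unrelated to the advertised data seems safer.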
Would you be happy for me to have a go at implementing tokens, and make a pull request for it?
I've started working on a first draft that does the following:
- `zput file.jpeg asd` will allow you to download using `zget file.jpeg` or `zget asd`. In case of `zget asd` it will still create a file `file.jpeg` (or `file_n.jpeg` if `file.jpeg` or `file_n-1.jpeg` already exist).
- `zget file.jpeg` will not overwrite an existing `file.jpeg`, however `zget file.jpeg file.jpeg` will.

What do […]
- […] `zput file.jpeg a` […].
- […] `zget file.jpeg -o file.jpeg`, or by writing to stdout: `zget file.jpeg > file.jpeg`.

I'll have a go at implementing my idea, so we can have a play with it.
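For reference, the non-overwriting `file_n.jpeg` naming from the first draft could look roughly like this. It is only a sketch of the rule as described in the thread; `unique_filename` and its `exists` parameter are hypothetical names, and the real implementation may differ.

```python
import os


def unique_filename(filename: str, exists=os.path.exists) -> str:
    """Return `filename` if it is free, else the first free `stem_n.ext`.

    `exists` is injectable (an assumption for this sketch) so the rule can
    be tried out without touching the filesystem.
    """
    if not exists(filename):
        return filename
    stem, ext = os.path.splitext(filename)
    n = 1
    while exists(f"{stem}_{n}{ext}"):
        n += 1
    return f"{stem}_{n}{ext}"


# Pretend these files are already present in the download directory:
taken = {"file.jpeg", "file_1.jpeg"}
print(unique_filename("file.jpeg", exists=taken.__contains__))  # file_2.jpeg
```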
In case of a simple/short token I thought about showing a warning "insecure upload token, your transfer may be hijacked" but still allowing it. If the user doesn't set a token, we can provide one for them.
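A possible shape for that check, as a sketch only. What counts as "simple/short" is not defined in the thread, so the threshold of 4 characters, the "all the same character" rule, and the base 32 alphabet used for generated tokens are all assumptions here.

```python
import secrets

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # assumed: RFC 4648 base 32
MIN_TOKEN_LENGTH = 4                           # assumed threshold
WARNING = "insecure upload token, your transfer may be hijacked"


def resolve_token(user_token=None):
    """Return (token, warning); warning is None when nothing is wrong."""
    if user_token is None:
        # No token given: generate a random one for the user.
        token = "".join(secrets.choice(ALPHABET) for _ in range(MIN_TOKEN_LENGTH))
        return token, None
    too_short = len(user_token) < MIN_TOKEN_LENGTH
    one_repeated_char = len(set(user_token.lower())) == 1
    if too_short or one_repeated_char:
        # Weak tokens are still allowed, but the user gets a warning.
        return user_token, WARNING
    return user_token, None


print(resolve_token("a"))     # ('a', 'insecure upload token, your transfer may be hijacked')
print(resolve_token("K7QX"))  # ('K7QX', None)
```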
I was playing with the idea to allow `zget file.txt -` to print the contents to stdout. I guess the `-o` option is more `curl`-style syntax, the second filename more `cp` style.
I have implemented a simple version of this in the latest release; you can upgrade using `pip install -U zget`.
We will think about security in a later release and a separate issue.
I just saw a talk about a similar tool called magic wormhole - it creates a code made of randomly selected words from a list, and then it uses a neat technique called PAKE (password-authenticated key exchange) to turn that weak code into a strong encryption key which it can use to transfer data.
Very interesting, and not infeasible to implement here.
I've looked at zget a few times, and I think it's a really neat idea. But it's always struck me as a bit awkward to read out filenames:
It's also not obvious whether it's case sensitive, and if you have a filename with spaces in, the recipient has to mess around with quoting/escaping it, with no help from tab completion.
How about producing a short random token instead? E.g.
Or, with the receiver initiating:
Using one of the base 32 alphabets, you can avoid similar looking characters like O/0 or I/1. With 4 characters, that gives you ~1 million possible codes.
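A minimal sketch of generating such a token, assuming the RFC 4648 base 32 alphabet (A-Z plus 2-7), which contains neither 0 nor 1 and so sidesteps the O/0 and I/1 confusion. The function name is made up for the example.

```python
import secrets

# Assumed alphabet: RFC 4648 base 32. It has no "0" or "1",
# so "O" and "I" cannot be mistaken for digits.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def random_token(length: int = 4) -> str:
    """Pick `length` characters uniformly at random from the alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_token())      # a random 4-character token
print(len(ALPHABET) ** 4)  # 1048576 possible 4-character codes
```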
To preserve the security model, I imagine that half the token would be advertised on the network, and the other half would be the authentication token that the second party (whichever didn't advertise) needs to send.
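Roughly what I have in mind for the split, as a sketch: the first half of the token goes into the advertisement, and the second half is only ever compared against what the other party sends. The function names and the case-insensitive comparison are assumptions for illustration.

```python
import hmac


def split_token(token: str):
    """Split a token into (advertised half, secret half)."""
    half = len(token) // 2
    return token[:half], token[half:]


def authenticate(secret_half: str, presented: str) -> bool:
    """Compare the secret half with what the other party sent.

    Case-insensitive (an assumption), and using a constant-time comparison.
    """
    return hmac.compare_digest(
        secret_half.upper().encode("utf-8"),
        presented.upper().encode("utf-8"),
    )


advertised, secret = split_token("AFGD")
print(advertised, secret)          # AF GD
print(authenticate(secret, "gd"))  # True
print(authenticate(secret, "GE"))  # False
```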
I think this would also reduce the risk of collisions - although the space of possible filenames is large, people don't sample that space at random, and in a busy office it's possible that two people could be zput-ing the same filename at the same time. With random codes, the risk of collisions is a function of the number of transfers being advertised. Advertising 2 base 32 digits as I propose means collisions are expected when 38 files are being advertised at once (if I've understood the birthday paradox correctly). If that's insufficient in large networks, the number of digits advertised could easily be increased by a config setting.
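To check the birthday-paradox figure, here is a small calculation. It takes "collisions are expected" to mean "the probability of at least one collision exceeds 50%" (my reading), and assumes every advertised code is chosen uniformly at random from the 32^2 = 1024 possibilities.

```python
def collision_probability(n_transfers: int, n_codes: int) -> float:
    """Probability that at least two of n_transfers random codes coincide."""
    p_all_unique = 1.0
    for i in range(n_transfers):
        p_all_unique *= (n_codes - i) / n_codes
    return 1.0 - p_all_unique


def transfers_for_likely_collision(n_codes: int) -> int:
    """Smallest number of simultaneous transfers with collision chance > 50%."""
    n = 1
    while collision_probability(n, n_codes) <= 0.5:
        n += 1
    return n


print(transfers_for_likely_collision(32 ** 2))  # 38
print(transfers_for_likely_collision(32 ** 3))  # with 3 advertised digits
```

So 38 simultaneous advertisements is the point where a collision becomes more likely than not with two advertised digits; adding a third digit raises that threshold considerably.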
Thanks for creating zget :-)