NebulousLabs / python-skynet

Library for integrating Skynet with Python applications
MIT License
19 stars 11 forks source link

Fix up chunked uploading #36

Closed xloem closed 3 years ago

xloem commented 4 years ago

Somehow in transitions it looks like the chunked uploading feature broke.

This change fixes it up again.

Chunked uploading is done by passing an iterator as the body data; see https://requests.readthedocs.io/en/master/user/advanced/#chunk-encoded-requests

I believe you can also pass a file-like object to stream with known size for #30

mrcnski commented 4 years ago

Hmm should we keep the path parameter? If custom_filename is not passed then the filename is empty. It would be consistent with the Go SDK where we require a filename when uploading generic data.

xloem commented 4 years ago

The existing tests are uploading a 5B file and verifying the upload matches the skylink for a 156KB pdf ... this does not seem correct ... EDIT: learning about responses.activate

xloem commented 4 years ago

Whew! Confused myself with tests

Hmm should we keep the path parameter? If custom_filename is not passed then the filename is empty. It would be consistent with the Go SDK where we require a filename when uploading generic data.

I've made the interface change you requested, but I'm noting that this changes the interface from how chunked uploading used to be here.

I like the way the go library takes a map of filenames to content: do you think that would work here?

mrcnski commented 4 years ago

I like the way the go library takes a map of filenames to content: do you think that would work here?

That would be excellent! Ideally the SDKs are consistent with each other, especially the APIs, but I am short on time. So these kinds of fixes are really appreciated!

xloem commented 4 years ago

Looking at this some, I'm noticing that chunked uploading (which wonderfully supports live streaming uploads) works for single file uploads only with the requests library. Additionally the go SDK does not have a feature for chunked uploads. It seems like supporting generic upload, which the go SDK has normalized, is a separate beast from supporting chunked upload. It might not be hard to upload an iterator as chunked if only one file is provided.

xloem commented 4 years ago

@m-cat I've squashed and rebased and simplified to addition of a method comparable to the generic one used by the go sdk. It takes a map of filenames to file-like objects or bytes data, and if there is only one file can do chunked uploading via the requests library's feature, if the file data is an iterator. Does this look good?

mrcnski commented 3 years ago

Sorry for the delay in reviewing, we were super busy. Taking a look now.

mrcnski commented 3 years ago

Just ran into an issue with the Go SDK that I think applies here too. If we upload a directory with a single file, it will be treated as a file upload instead of a directory upload. This can be unexpected for the user, and a directory should be uploaded even if it's only a single file.

Edit: I'm fixing this in the Go SDK by checking if opts.CustomDirname is set -- if it is, we upload as a directory.

xloem commented 3 years ago

@m-cat I believe I've added all the changes you requested.

I'm testing this locally and I discovered that python requests doesn't stream file objects when they are passed as multipart data (it reads the whole content of each file in one go). If you pass a single file as the body of a post request, it does stream it, reading it in a configurable chunk size defaulting to 8K.

Would it be reasonable to change the behavior of python-skynet to pass all single files directly as body content? This would simplify the implementation and provide streaming for all single files. The tests would change to expect this.

mrcnski commented 3 years ago

Published version 2.1.0 which includes these changes 👍