nstrydom2 / anonfile-api

An unofficial Python Anonfiles.com API
MIT License

Big files (2GB+) can't be uploaded #48

Closed: Siege-Wizard closed this issue 3 years ago

Siege-Wizard commented 3 years ago

Describe the bug: Uploading files bigger than 2GB fails, because requests.post does not support a request body of that size the way it is currently called.

To Reproduce: Steps to reproduce the behavior:

  1. Try to upload a file bigger than 2GB.
  2. Use upload = AnonFile().upload("big.zip", True).
  3. upload is None.
  4. The following output is printed:
    Upload: big.zip:   0%|          | 0.00/2.06G [00:00<?, ?B/s]
    Upload: big.zip: 100%|○○○○○○○○○○| 2.06G/2.06G [00:01<00:00, 1.71GB/s]
    Upload: big.zip: 100%|○○○○○○○○○○| 2.06G/2.06G [00:03<00:00, 664MB/s] 
    string longer than 2147483647 bytes

Expected behavior: Files bigger than 2GB should be supported, as anonfiles supports up to 20GB per file.
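
For context, the number in the error message is the largest signed 32-bit integer, which suggests that the whole request body was handed over as one in-memory bytes object instead of being streamed. A quick check of the arithmetic, where the helper name `exceeds_int32_limit` is made up for illustration:

```python
INT32_MAX = 2**31 - 1  # 2147483647, the number quoted in the error message
GIB = 1024**3


def exceeds_int32_limit(size_in_bytes: int) -> bool:
    """Hypothetical helper: True if the size is larger than a signed 32-bit integer can hold."""
    return size_in_bytes > INT32_MAX


print(INT32_MAX)
print(exceeds_int32_limit(2 * GIB - 1))  # a body exactly at the limit
print(exceeds_int32_limit(2 * GIB))      # a body one byte over the limit
```

Any body of 2 GiB or more crosses that boundary, which matches the point at which the upload starts failing.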

Siege-Wizard commented 3 years ago

I think this can be fixed by replacing the line at https://github.com/nstrydom2/anonfile-api/blob/8632c88df84bc1cd188ecbd0c0f2f4a9a2f3fdaa/src/anonfile/anonfile.py#L230 with:

                    data=MultipartEncoder(fields={'file': CallbackIOWrapper(tqdm_handler.update, file_handler, 'read')}),

and adding the import:

from requests_toolbelt import MultipartEncoder

I did it in a single line, but feel free to adapt the style to your own. Source
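
The idea behind a streaming encoder is that the multipart body is produced in small pieces, so no single bytes object ever approaches the 2GB mark. The following is a hand-rolled, standard-library-only illustration of that idea. It is not how requests_toolbelt is implemented, and the function name `iter_multipart`, the chunk size, and the in-memory stand-in file are assumptions made for the sketch:

```python
import io
import uuid
from typing import BinaryIO, Iterator

CHUNK_SIZE = 64 * 1024  # assumed chunk size, chosen only for this sketch


def iter_multipart(field: str, filename: str, fileobj: BinaryIO,
                   boundary: str, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield a multipart/form-data body piece by piece (illustration only)."""
    # Part header: boundary line, disposition, content type, blank line.
    yield (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    # File content, read in bounded chunks instead of all at once.
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk
    # Closing boundary.
    yield f"\r\n--{boundary}--\r\n".encode("utf-8")


boundary = uuid.uuid4().hex
payload = io.BytesIO(b"x" * 200_000)  # stand-in for a large file on disk
chunks = list(iter_multipart("file", "big.zip", payload, boundary))
largest = max(len(chunk) for chunk in chunks)
print(len(chunks), largest)
```

However large the file is, the largest piece handed to the transport stays at the chunk size.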

hentai-chan commented 3 years ago

Thanks for raising awareness of this issue (and for the subsequent proposal to resolve it)! Let me know if you want to make a PR to address it; otherwise I'll test and integrate your solution tomorrow. The release to PyPI might take a few more days. What do you say, @nstrydom2?

PS: Note that the max. file size is capped at 5GB. The docstrings currently state that

Uploads cannot exceed a file size of 20G

but I think this was either 1) a mistake on my part or 2) changed recently. See also the screenshot attached below, taken from the API docs.

[Screenshot from the API docs]

Siege-Wizard commented 3 years ago

I actually think that it is the API docs that are outdated, and that the limit is 20GB.

Regarding the solution, feel free to make the PR yourself. I haven't tested the integration with tqdm. There is also a MultipartEncoderMonitor, which accepts a MultipartEncoder as its first argument and a callback of the form def callback(monitor: MultipartEncoderMonitor) as its second; this callback is called every time the read method is called. So it should probably replace the CallbackIOWrapper, together with a callback function that uses tqdm_handler.update internally.
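
One detail to watch when bridging the two: the monitor exposes a cumulative `bytes_read`, while tqdm's `update` expects an increment. A minimal sketch of such a callback follows. The factory name `make_progress_callback` is made up, and a `SimpleNamespace` stands in for the real monitor so the snippet runs without any third-party package:

```python
from types import SimpleNamespace
from typing import Callable


def make_progress_callback(update: Callable[[int], object]) -> Callable[[object], None]:
    """Return a callback that turns cumulative byte counts into increments.

    `update` is expected to behave like tqdm's `update(n)`.
    """
    last = 0  # cumulative total seen on the previous call

    def callback(monitor) -> None:
        nonlocal last
        update(monitor.bytes_read - last)
        last = monitor.bytes_read

    return callback


# Stand-in for a real monitor: only the `bytes_read` attribute is used.
fake_monitor = SimpleNamespace(bytes_read=0)
increments = []
callback = make_progress_callback(increments.append)
for total in (8192, 16384, 20000):
    fake_monitor.bytes_read = total
    callback(fake_monitor)
print(increments)
```

With the real classes this would presumably be wired up as `MultipartEncoderMonitor(encoder, make_progress_callback(tqdm_handler.update))`; that combination is untested here.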

By the way, I found that passing the file handler directly to the encoder doesn't seem to work with anonfiles, but using the tuple version seems to work:

MultipartEncoder(fields={'file': (file_name, file_handler)}),

hentai-chan commented 3 years ago

Thanks for the suggestion. I will submit a PR tomorrow; something urgent just came up that requires my attention now.

hentai-chan commented 3 years ago

I think I am close to a tqdm-compatible solution, but it's not quite right yet:

@authenticated
def upload(self, path: str, progressbar: bool=False) -> ParseResponse:
    size = os.stat(path).st_size
    options = AnonFile.__progressbar_options(None, f"Upload: {Path(path).name}", unit='B', total=size, disable=progressbar)
    with open(path, mode='rb') as file_handler:
        encoder = MultipartEncoder(fields={'file': (Path(path).name, file_handler)})
        with tqdm(**options) as tqdm_handler:
            encoder_monitor = MultipartEncoderMonitor(encoder, callback=lambda monitor: tqdm_handler.update(monitor.bytes_read - tqdm_handler.n))
            response = self.session.post(
                urljoin(AnonFile.API, 'upload'),
                data=encoder_monitor,
                params={'token': self.token},
                headers={'ContentType': encoder_monitor.content_type},
                timeout=self.timeout,
                proxies=getproxies(),
                verify=True
            )

            return ParseResponse(response, Path(path))

That's the current error message:

Upload: LICENSE: |                                                                    | 1.24k/? [00:00<00:00, 7.02kB/s]
400 Client Error: Bad Request for url: https://api.anonfiles.com/upload?token=secret
'NoneType' object has no attribute 'url'
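
One likely contributor to the 400 response is the header key in the snippet: it is spelled `ContentType`, whereas the HTTP header is `Content-Type`, so the line would read `headers={'Content-Type': encoder_monitor.content_type}`. Without that header the receiver never learns the multipart boundary. The sketch below illustrates the effect with the standard library's email parser as a stand-in for the server; how anonfiles actually parses requests is an assumption, and the boundary and file content are made up:

```python
from email.parser import BytesParser

BOUNDARY = "example-boundary"  # hypothetical; an encoder normally picks a random one

# A tiny multipart/form-data body with a single file part.
body = (
    f"--{BOUNDARY}\r\n"
    'Content-Disposition: form-data; name="file"; filename="LICENSE"\r\n'
    "\r\n"
    "MIT License\r\n"
    f"--{BOUNDARY}--\r\n"
).encode("utf-8")


def parse_as_receiver(header_name: str):
    """Parse the body the way a header-driven parser would (stand-in for a server)."""
    head = f"{header_name}: multipart/form-data; boundary={BOUNDARY}\r\n\r\n"
    return BytesParser().parsebytes(head.encode("utf-8") + body)


correct = parse_as_receiver("Content-Type")
misspelled = parse_as_receiver("ContentType")

print(correct.is_multipart())     # the boundary is known, so the file part is found
print(misspelled.is_multipart())  # unknown header, so the body is one opaque blob
```

Whether this alone clears the error has not been verified against the live API.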