IBM / python-sdk-core

The python-sdk-core repository contains core functionality required by Python code generated by the IBM OpenAPI SDK Generator.
Apache License 2.0

fix: add correct support for compressing file-like objects #174

Closed pyrooka closed 11 months ago

pyrooka commented 1 year ago

This PR fixes a bug in the request preparation step. Previously, file-like objects couldn't be compressed, and the code threw an exception if the user tried to do so. The fix is a helper class called GzipStream, which acts as an intermediary between the reader and the writer and compresses the data on the fly. More details can be found in the comments.
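
As a rough illustration of the idea (not the PR's GzipStream itself), the sketch below wraps a readable source and compresses it lazily as the consumer calls `read()`. The class name `OnTheFlyGzipReader`, the `chunk_size` parameter, and the choice of `zlib.compressobj` are assumptions made for this example only.

```python
import gzip
import io
import zlib


class OnTheFlyGzipReader:
    """Hypothetical illustration; not the GzipStream class from this PR."""

    def __init__(self, source, chunk_size=8192):
        self._source = source          # any object with a read(n) method
        self._chunk_size = chunk_size
        # wbits=31 asks zlib to emit a gzip header and trailer.
        self._compressor = zlib.compressobj(wbits=31)
        self._buffer = b""
        self._finished = False

    def _fill(self, size):
        # Pull raw data until enough compressed bytes are buffered
        # (or until the source is exhausted).
        while not self._finished and (size < 0 or len(self._buffer) < size):
            raw = self._source.read(self._chunk_size)
            if not raw:
                # End of input: flush the compressor to write the trailer.
                self._buffer += self._compressor.flush()
                self._finished = True
                break
            if isinstance(raw, str):
                # Text streams yield str, so encode before compressing.
                raw = raw.encode()
            self._buffer += self._compressor.compress(raw)

    def read(self, size=-1):
        self._fill(size)
        if size < 0:
            out, self._buffer = self._buffer, b""
        else:
            out, self._buffer = self._buffer[:size], self._buffer[size:]
        return out


# Usage: read the compressed stream in small pieces, then verify it.
source = io.BytesIO(b"example payload " * 1000)
reader = OnTheFlyGzipReader(source)
compressed = b"".join(iter(lambda: reader.read(256), b""))
assert gzip.decompress(compressed) == source.getvalue()
```

The point of the design is that the uncompressed input is never held in memory in full; only the compressed bytes not yet consumed are buffered.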

pyrooka commented 12 months ago

I have updated the PR based on @padamstx's suggestion. Let me know what you think. I am open to changing it if you have major concerns or better ideas!

ricellis commented 12 months ago

Do you by chance have a scenario that would make a good test prior to merging?

I'll try and run it through our suites tomorrow.

ricellis commented 12 months ago

No good, I'm afraid: 167 failures. I didn't check each one individually, but they seem to be of two types. Firstly,

[2023-09-27T11:11:00.357Z]     response = self.send(request, **kwargs)
[2023-09-27T11:11:00.357Z] ../../../pythonvenv/lib64/python3.11/site-packages/ibm_cloud_sdk_core/base_service.py:313: in send
[2023-09-27T11:11:00.357Z]     response = self.http_client.request(**request, cookies=self.jar, **kwargs)
[2023-09-27T11:11:00.357Z] ../../../pythonvenv/lib64/python3.11/site-packages/requests/sessions.py:589: in request
[2023-09-27T11:11:00.357Z]     resp = self.send(prep, **send_kwargs)
[2023-09-27T11:11:00.357Z] ../../../pythonvenv/lib64/python3.11/site-packages/requests/sessions.py:703: in send
[2023-09-27T11:11:00.357Z]     r = adapter.send(request, **kwargs)
[2023-09-27T11:11:00.357Z] ../../../pythonvenv/lib64/python3.11/site-packages/requests/adapters.py:486: in send
[2023-09-27T11:11:00.357Z]     resp = conn.urlopen(
[2023-09-27T11:11:00.357Z] ../../../pythonvenv/lib64/python3.11/site-packages/urllib3/connectionpool.py:714: in urlopen
[2023-09-27T11:11:00.357Z]     httplib_response = self._make_request(
[2023-09-27T11:11:00.357Z] ../../../pythonvenv/lib64/python3.11/site-packages/urllib3/connectionpool.py:413: in _make_request
[2023-09-27T11:11:00.357Z]     conn.request_chunked(method, url, **httplib_request_kw)
[2023-09-27T11:11:00.357Z] ../../../pythonvenv/lib64/python3.11/site-packages/urllib3/connection.py:270: in request_chunked
[2023-09-27T11:11:00.357Z]     for chunk in body:
[2023-09-27T11:11:00.357Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2023-09-27T11:11:00.357Z] 
[2023-09-27T11:11:00.357Z] self = <ibm_cloud_sdk_core.utils.GzipStream object at 0x7f4cf581eb60>, size = 1
[2023-09-27T11:11:00.358Z] 
[2023-09-27T11:11:00.358Z]     def read(self, size: int = -1):
[2023-09-27T11:11:00.358Z]         """Compresses and returns the requested size of data.
[2023-09-27T11:11:00.358Z]     
[2023-09-27T11:11:00.358Z]         Args:
[2023-09-27T11:11:00.358Z]             size: how many bytes to return. -1 to read and compress the whole file
[2023-09-27T11:11:00.358Z]         """
[2023-09-27T11:11:00.358Z]         if (size < 0) or (len(self.buffer) < size):
[2023-09-27T11:11:00.358Z]             for raw in self.uncompressed:
[2023-09-27T11:11:00.358Z]                 # We need to encode text like streams (e.g. TextIOWrapper) to bytes.
[2023-09-27T11:11:00.358Z]                 if isinstance(raw, str):
[2023-09-27T11:11:00.358Z]                     raw = raw.encode()
[2023-09-27T11:11:00.358Z]     
[2023-09-27T11:11:00.358Z]                 self.compressor.write(raw)
[2023-09-27T11:11:00.358Z]     
[2023-09-27T11:11:00.358Z]                 # Stop compressing if we reached the max allowed size.
[2023-09-27T11:11:00.358Z]                 if 0 < size < len(self.buffer):
[2023-09-27T11:11:00.358Z]                     self.compressor.flush()
[2023-09-27T11:11:00.358Z]                     break
[2023-09-27T11:11:00.358Z]             else:
[2023-09-27T11:11:00.358Z]                 self.compressor.close()
[2023-09-27T11:11:00.358Z]     
[2023-09-27T11:11:00.358Z]             if size < 0:
[2023-09-27T11:11:00.358Z]                 # Return all data from the buffer.
[2023-09-27T11:11:00.358Z]                 compressed = self.buffer
[2023-09-27T11:11:00.358Z]                 self.buffer = b''
[2023-09-27T11:11:00.358Z]         else:
[2023-09-27T11:11:00.358Z]             # If we already have enough data in our buffer
[2023-09-27T11:11:00.358Z]             # return the desired chunk of bytes
[2023-09-27T11:11:00.358Z]             compressed = self.buffer[:size]
[2023-09-27T11:11:00.359Z]             # then remove them from the buffer.
[2023-09-27T11:11:00.359Z]             self.buffer = self.buffer[size:]
[2023-09-27T11:11:00.359Z]     
[2023-09-27T11:11:00.359Z] >       return compressed
[2023-09-27T11:11:00.359Z] E       UnboundLocalError: cannot access local variable 'compressed' where it is not associated with a value

and secondly:

[2023-09-27T11:11:01.043Z] ../../../pythonvenv/lib64/python3.11/site-packages/responses/__init__.py:229: in wrapper
[2023-09-27T11:11:01.043Z]     return func(*args, **kwargs)
[2023-09-27T11:11:01.043Z] test/unit/test_cloudant_v1.py:7932: in test_post_explain_all_params
[2023-09-27T11:11:01.043Z]     responses.calls[0].request.body = gzip.decompress(responses.calls[0].request.body)
[2023-09-27T11:11:01.043Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2023-09-27T11:11:01.043Z] 
[2023-09-27T11:11:01.043Z] data = <ibm_cloud_sdk_core.utils.GzipStream object at 0x7f4cf2b7c9d0>
[2023-09-27T11:11:01.043Z] 
[2023-09-27T11:11:01.043Z]     def decompress(data):
[2023-09-27T11:11:01.043Z]         """Decompress a gzip compressed string in one shot.
[2023-09-27T11:11:01.043Z]         Return the decompressed string.
[2023-09-27T11:11:01.043Z]         """
[2023-09-27T11:11:01.043Z]         decompressed_members = []
[2023-09-27T11:11:01.043Z]         while True:
[2023-09-27T11:11:01.043Z] >           fp = io.BytesIO(data)
[2023-09-27T11:11:01.043Z] E           TypeError: a bytes-like object is required, not 'GzipStream'
[2023-09-27T11:11:01.044Z] 
[2023-09-27T11:11:01.044Z] /usr/lib64/python3.11/gzip.py:600: TypeError

ricellis commented 12 months ago

FWIW the first is also what I saw testing locally when I tried to validate that this change solved the issue reported in https://github.com/IBM/cloudant-python-sdk/issues/554. At first glance the second might be related to the generated test code's expectations about the request body.
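
For reference, the first failure can be reproduced without any compression at all. In the `read` method shown in the traceback, `compressed` is assigned only when `size < 0` or when the buffer already holds at least `size` bytes, so a call such as `read(1)` on an empty buffer reaches `return compressed` with nothing bound. The sketch below mirrors that branch structure; `take_buggy` and `take_fixed` are hypothetical names, and the second function is just one possible way to avoid the problem, not necessarily the change made in this PR.

```python
def take_buggy(buffer: bytes, size: int) -> bytes:
    """Mirrors the branch structure in the traceback, minus the compression."""
    if size < 0 or len(buffer) < size:
        # (the real method refills the buffer here)
        if size < 0:
            compressed = buffer
    else:
        compressed = buffer[:size]
    # Unbound when size >= 0 and len(buffer) < size.
    return compressed


def take_fixed(buffer: bytes, size: int) -> bytes:
    """Slice after the refill step so that every path returns a value."""
    if size < 0:
        return buffer
    return buffer[:size]


try:
    take_buggy(b"", 1)
except UnboundLocalError as exc:
    print("reproduced:", type(exc).__name__)

print(take_fixed(b"", 1))
```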

pyrooka commented 12 months ago

Thanks a lot @ricellis, I will take a look! At first glance these issues should be reproducible in unit tests, but we'll see.

pyrooka commented 11 months ago

@ricellis Could you do another test run when you have some time? I've fixed the issues I found and tweaked the generated unit tests to handle GzipStream bodies, so please use the generator built from the main branch. (I've tested the changes with all our APIs and found no issues; hopefully you will get the same result.)
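
To illustrate the kind of adjustment involved (a sketch only, not the actual change made in the generator): `gzip.decompress` requires a bytes-like object, so a test that receives a stream as the request body has to read it first. The helper name `decompress_request_body` is hypothetical.

```python
import gzip
import io


def decompress_request_body(body):
    """Hypothetical test helper: accepts bytes or a readable stream of gzip data."""
    if hasattr(body, "read"):
        # Stream-like bodies (such as GzipStream) must be drained to bytes first.
        body = body.read()
    return gzip.decompress(body)


# Works for both plain bytes and file-like bodies.
payload = gzip.compress(b'{"selector": {}}')
assert decompress_request_body(payload) == b'{"selector": {}}'
assert decompress_request_body(io.BytesIO(payload)) == b'{"selector": {}}'
```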

ricellis commented 11 months ago

Re-tested with new generated test fixes, all passing now. Thanks!

pyrooka commented 11 months ago

@ricellis Thanks for the testing (and finding the bugs)!

ibm-devx-sdk commented 11 months ago

:tada: This PR is included in version 3.17.1 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket: