googleapis / google-cloud-python

Google Cloud Client Library for Python
https://googleapis.github.io/google-cloud-python/
Apache License 2.0
4.83k stars 1.53k forks source link

Client library should not fail when encountering unexpected response bodies #2930

Closed gguuss closed 7 years ago

gguuss commented 7 years ago
  1. OS type and version OSX 10.11

  2. Python version and virtual environment information python --version Python 2.7.12

  3. google-cloud-python version pip show google-cloud, pip show google-<service> or pip freeze google-cloud-vision==0.22.0

  4. Stacktrace if available

cls = <class 'google.cloud.vision.annotations.Annotations'> response = {'fullTextAnnotation': {'pages': [{'blocks': [{'blockType': 'TEXT', 'boundingBox': {...}, 'paragraphs': [...], 'proper... [{'x': 317, 'y': 506}, {'x': 337, 'y': 507}, {'x': 335, 'y': 536}, {'x': 315, 'y': 535}]}, 'description': 'to'}, ...]}

@classmethod
def from_api_repr(cls, response):
    """Factory: construct an instance of ``Annotations`` from a response.

        :type response: dict
        :param response: Vision API response object.

        :rtype: :class:`~google.cloud.vision.annotations.Annotations`
        :returns: An instance of ``Annotations`` with detection types loaded.
        """
    annotations = {}
    for feature_type, annotation in response.items():
      curr_feature = annotations.setdefault(_KEY_MAP[feature_type], [])

E KeyError: u'fullTextAnnotation'

env/lib/python2.7/site-packages/google/cloud/vision/annotations.py:91: KeyError


5. Steps to reproduce
* Enable API backend with extra data in response to detect_text
* Call detect_text
* Extra content in response body crashes the client library

6. Code example

The following code exercises the crash:

def detect_text(path): """Detects text in the file.""" vision_client = vision.Client()

with io.open(path, 'rb') as image_file:
    content = image_file.read()

image = vision_client.image(content=content)

texts = image.detect_text()
print('Texts:')
for text in texts:
    print(text.description)
daspecster commented 7 years ago

I can start to fix this by changing _KEY_MAP[feature_type] to _KEY_MAP.get(feature_type), but there are fairly large implications.

This change will just ignore that annotation type for now. Is that the right path to take?

WDYT @dhermes @tseaver?

dhermes commented 7 years ago

@daspecster I don't know enough about the API to say. Are these features we don't support?

theacodes commented 7 years ago

In general the API should be able to add additional fields to responses without the client library barfing.

daspecster commented 7 years ago

@jonparrott should the library just return nothing in those cases?

In this case @gguuss is having fullTextAnnotation returned, I presume, along with textAnnotations? There could also be faceAnnotations or whatever else was requested.

fulltextAnnotation isn't documented anywhere that I could find.

dhermes commented 7 years ago

In general the API should be able to add additional fields to responses without the client library barfing.

Is that true? Shouldn't the response be well-defined?

gguuss commented 7 years ago

fulltextAnnotation isn't documented anywhere that I could find. I don't know enough about the API to say. Are these features we don't support?

This is not part of the public API response, let's consider it an experiment.

Is that true? Shouldn't the response be well-defined?

Yes, but changes in the response break the client so if someone doesn't update their client when changes are made to the API, their system will break. This seems specific to the Python client. Anecdotally, the Node client is not adversely affected by changes in the response body.

daspecster commented 7 years ago

@gguuss, AFAICT the only change I could make, would just ignore the result. Is that ok for what you're doing?

dhermes commented 7 years ago

I am :-1: on this. I'd guess support in node is just an artifact of the language. The API is versioned. Why should we strive to support things that aren't in the publicly documented interface of the version we are supporting?

gguuss commented 7 years ago

@daspecster Yes, I'm happy with the client library ignoring undocumented parts of the response.

@dhermes I believe versioned could be interpreted as no "breaking" changes - existing members will not be removed or renamed. I do not think that it means, "response bodies for this API will remain the same ad infinitum".

I'm not sure who the authority on this is for Google APIs, but it would not surprise me if additional fields are added to response bodies on a versioned API endpoint without changing the endpoint version.

theacodes commented 7 years ago

So, some context:

Let's say we have an API at v1 and we know one of the methods returns the following:

{
    'field1': 'meep'
}

So, let's say they decide this method should return more info. They decide to add another field to the return value. This does not necessitate a major version bump. The new response is:

{
    'field1': 'meep',
    'field2': 'moop'
}

Our argument here is that this client library should be robust against this happening. The API should be able to add additional fields without the client blowing up. We don't have the expectation that this extra data will be immediately available without an update, just that it doesn't choke on the response.

There is precedence to this. Compute has kept API v1 for years despite adding new resources, parameters, and response fields. APIs generally only bump versions when they remove or change an existing thing.

dhermes commented 7 years ago

@jonparrott SGTM. We in google-cloud-python have witnessed this happening in BigQuery, but it has been not very pleasant (i.e. the newly added "non-breaking" features have been our richest source of bug reports).

theacodes commented 7 years ago

Bigquery adding a completely new SQL dialect and result types is a pretty extreme case. :)