googleapis / python-vision

This library has moved to https://github.com/googleapis/google-cloud-python/tree/main/packages/google-cloud-vision
Apache License 2.0

How to serialize to json or dict that response from vision api? #70

Closed inska closed 4 years ago

inska commented 7 years ago

Hi,

The Google Cloud SDK works well for getting annotation responses from the Vision API, and I would like to know how to serialize the response. Is there a helper or serializer for Google response objects? Otherwise, should I parse the response object and build dictionaries manually?

  1. OS type and version: OS X El Capitan
  2. Python version and virtual environment information (python --version): 2.7.13
  3. google-cloud-python version (pip show google-cloud, pip show google-<service> or pip freeze): 157.0.0
  4. Stacktrace if available
  5. Steps to reproduce
  6. Code example

dhermes commented 7 years ago

@inska Can you provide a code snippet that produces the thing you'd like to serialize?

lukesneeringer commented 7 years ago

@inska Sadly, the objects in the current Vision library lack serialization functions (although this is a good idea).

It is worth noting that we are about to release a substantially different library for Vision (it is on master of this repo now, although not released to PyPI yet) where this will be possible. Note that it is a backwards-incompatible upgrade, so there will be some (hopefully not too much) conversion effort.

That library returns plain protobuf objects, which can be serialized to JSON using:

from google.protobuf.json_format import MessageToJson
serialized = MessageToJson(original)

You can also go to dictionaries using something like protobuf3-to-dict.
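As a minimal sketch of both conversions: protobuf's own json_format module also provides MessageToDict, so a separate package is not strictly needed for dictionaries. The example below uses FileDescriptorProto purely as a stand-in, because it ships with the protobuf package; it is not a Vision type, and a real response message would be passed in its place.

```python
import json

from google.protobuf import descriptor_pb2
from google.protobuf.json_format import MessageToDict, MessageToJson

# Stand-in for an API response (assumption: any plain protobuf message
# is handled the same way by json_format).
message = descriptor_pb2.FileDescriptorProto(name="example.proto")
message.message_type.add(name="Label")

# JSON string; field names are converted to lowerCamelCase by default.
as_json = MessageToJson(message)

# Plain dict; keep the original snake_case field names.
as_dict = MessageToDict(message, preserving_proto_field_name=True)

print(as_json)
print(as_dict)
```

Note the difference in key names: the JSON output uses messageType, while the dict keeps message_type because preserving_proto_field_name=True was passed.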

P. S. I am going to go ahead and close this issue, since I hope I have answered it, but please feel free to reopen if needed!

arycloud commented 6 years ago

Hi @lukesneeringer, I tried it, but it returns an error like: AttributeError: 'google.protobuf.pyext._message.RepeatedCompositeCo' object has no attribute 'DESCRIPTOR'. How can I resolve this error?

u2takey commented 6 years ago

@lukesneeringer Same problem as @arycloud: AttributeError: 'google.protobuf.pyext._message.RepeatedCompositeCo' object has no attribute 'DESCRIPTOR'

rootVIII commented 6 years ago

Thanks @lukesneeringer

I can't believe how many google searches it took me to find the answer you posted.

zyfang commented 6 years ago

@arycloud @u2takey I had the same problem, and I realized it happens because I "unpacked" the response before trying to convert it to JSON or a dict. For example, given:

response = client.annotate_image({'image': {'source': {'image_uri': image_url}}, 'features': features})

my program did the equivalent of MessageToJson(response.label_annotations), which results in:

AttributeError: 'google.protobuf.pyext._message.RepeatedCompositeCo' object has no attribute 'DESCRIPTOR'

Now I do:

from google.protobuf.json_format import MessageToDict

response = MessageToDict(response, preserving_proto_field_name=True)
desired_res = response["label_annotations"]
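
The difference between the failing and the working call can be reproduced without the Vision client. This is a sketch under assumptions: FileDescriptorProto (bundled with protobuf) stands in for the response, and its repeated message_type field stands in for label_annotations. A repeated field is a container, not a message, so it has no DESCRIPTOR.

```python
from google.protobuf import descriptor_pb2
from google.protobuf.json_format import MessageToDict

# Stand-in message with a repeated composite field (not a Vision type).
file_proto = descriptor_pb2.FileDescriptorProto(name="example.proto")
file_proto.message_type.add(name="Label")
file_proto.message_type.add(name="Face")

# Passing the repeated field itself fails: the container is not a message.
try:
    MessageToDict(file_proto.message_type)
    error = None
except AttributeError as exc:
    error = exc

# Option 1: convert the whole message, then index into the dict.
whole = MessageToDict(file_proto, preserving_proto_field_name=True)
from_whole = whole["message_type"]

# Option 2: convert each element of the repeated field separately.
per_element = [MessageToDict(item) for item in file_proto.message_type]

print(type(error).__name__)
print(from_whole == per_element)
```

Both options produce the same list of dicts; which one to use depends on whether the rest of the response is needed as well.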

cybercser commented 5 years ago

Thanks @lukesneeringer. I was befuddled by the same problem when working with the Google Cloud Speech-to-Text API. I tried to serialize google.cloud.speech.v1.SpeechRecognitionResult, only to get:

TypeError: Object of type RepeatedCompositeFieldContainer is not JSON serializable.

I tried @lukesneeringer 's solution. It works!

digglife commented 5 years ago

Cool. Thank you @lukesneeringer !

henrihe1 commented 4 years ago

Thanks @lukesneeringer

I can't believe how many google searches it took me to find the answer you posted.

Thanks a lot @lukesneeringer ! @rootVIII: I can sooo relate.

piratezoro commented 4 years ago

Hi @lukesneeringer, I am using ImageAnnotatorClient with dask for parallel computation. Until now I was creating a new ImageAnnotatorClient for each image, which causes a memory overflow issue. So I was planning to create a single ImageAnnotatorClient instance and use it for all the work. But with dask's client.map method I am unable to serialize the ImageAnnotatorClient object. Is there a way I can serialize it and then deserialize it? I tried the above method but it does not work.

akshat-khare commented 4 years ago

It is not working again. I am still facing this Descriptor error.

software-dov commented 4 years ago

All the python client libraries have been given their own repositories and no longer live under google-cloud-python. Transferring to Vision API repo.

There has been another backwards incompatible change with the Vision client library. All message types are now defined using proto-plus, which uses different methods for serialization and deserialization.

In order to get json from a message, do the following:

my_message = MyMessageType(attribute=value)

json_string = MyMessageType.to_json(my_message)
# Also works
json_string = type(my_message).to_json(my_message)
# Also also works
import proto
json_string = proto.Message.to_json(my_message)

Note that to_json is not a method on the instance. It is defined on the metaclass, so it is called on the message class with the instance passed as the argument.

zyfang commented 4 years ago

@software-dov where does MyMessageType come from?

uditmindf007 commented 4 years ago

This worked for me, thanks @software-dov:

import proto
json_string = proto.Message.to_json(my_message)

software-dov commented 4 years ago

@zyfang It's an example type from one of the Cloud APIs. Let's tie the example back into Vision:

from google.cloud.vision import AnnotateFileRequest

request = AnnotateFileRequest()
# Do things with the fields in request.
json_string = AnnotateFileRequest.to_json(request)

jaihonikhil commented 3 years ago

I am still getting the error even after following all the above commands. Isn't there any way to extract components from "google.protobuf.pyext._message.RepeatedCompositeCo" object?

WGribaa commented 3 years ago

@jaihonikhil Once you get your response, you can get the wanted attribute, iterate over it and use the method MessageToDict on each of the elements.

It seems we now have to access the elements via their attribute "_pb" to make the method work. Here is an example :

from google.protobuf.json_format import MessageToDict
response = my_google_vision_client.label_detection(image=my_image)
tags = response.label_annotations
serializable_tags = [MessageToDict(tag._pb) for tag in tags]

software-dov commented 3 years ago

The Proto Plus documentation describes this in a more idiomatic way: https://proto-plus-python.readthedocs.io/en/stable/messages.html#serialization The following should work:

import proto
response = client.label_detection(image=my_image)
serializable_tags = [proto.Message.to_dict(tag) for tag in response.label_annotations]

wenbinf commented 3 years ago

Just in case 1) you come to this closed issue after Nov 30, 2021, and 2) you have tried every code snippet you can find on the internet but still couldn't get it to work...

This works for me:


import json

from google.cloud import speech_v1 as speech

config = ...
audio = ...
client = speech.SpeechClient()
operation = client.long_running_recognize(config=config, audio=audio)
op_result = operation.result()

##############
# Okay, fun part -
##############
result_in_dict = json.loads(type(op_result).to_json(op_result))

Morgan-Gicheha commented 2 years ago

This works for me

import proto
response = client.label_detection(image=my_image)
serializable_tags = [proto.Message.to_dict(tag) for tag in response.label_annotations]

AngelDimov commented 1 year ago

I recently ran into the same issue for the Google Ads API but I used the MessageToDict method instead and extended it using a generator:

from google.protobuf.json_format import MessageToDict

def get_data():
    response = ...  # Make the request to the API here
    for row in response.results:
        yield MessageToDict(row)

for data in get_data():
    print(data)  # Do something with the data

I find it a lot easier to work with the records in a dictionary format further down the line.