I get an error thrown from inside azure.cosmos.cosmos_client at line 2880:
File "...\env-3.5.4\lib\site-packages\azure\cosmos\cosmos_client.py", line 2880, in _AddPartitionKey
partitionKeyDefinition = collection.get('partitionKey')
AttributeError: 'str' object has no attribute 'get'
This error had not been encountered in previous uses of the library.
Investigation
Upon printing the value of collection just above that line, I saw that it was a JSON string, instead of a dict, as the code must have expected.
So I followed the calls backwards: each time, the raw return value of the callee was returned:
At last, in the final synchronized_request._Request function, I see at the end:
if is_media:
result = data
else:
if len(data) > 0:
try:
result = json.loads(data)
except:
raise errors.JSONParseFailure(data)
return (result, headers)
So this tells me that if is_media is set, the response is not parsed from JSON, but instead returned as-is. Injecting a little print(is_media) in there told me that the flag was indeed set, so I went up to see how it was set:
def _Request(global_endpoint_manager, request, connection_policy, requests_session, path, request_options, request_body):
"""Makes one http request using the requests module.
...
:return:
tuple of (result, headers)
:rtype:
tuple of (dict, dict)
"""
is_media = request_options['path'].find('media') > -1
(note that the return type is said to be (dict, dict), but we now know that to not always be the case)
So is_media is sniffed from request_options['path']. Intrigued, I print(request_options['path']) only to be greeted by a familiar sight:
/dbs/campaigns/colls/digitalmedia/
Conclusion
If the database id, collection id or document id (I assume) contains the substring media, then the result of the API GET request in synchronized_request._Request will not be parsed as JSON, and the azure-cosmos library will fail further down the line when expecting the returned value to be a dict rather than a str.
The obvious workaround at this point would be to just use a container with a different name. I can do this during the testing phase, but we already expect it to be called what it is, so just changing the name is not really viable. Is there a safer way of determining whether a request is_media?
The second workaround is for us to fork this library, remove the is_media check and use our fork instead (whether by putting it locally in the project or hosting it somewhere for pip to pick up). This is more effort than is desirable, but may be necessary to not be affected by this bug.
(see Conclusion for TL;DR)
Package version (pip):
azure_cosmos-3.1.2-py3-none-any.whl
Observed error
When trying to upsert a document to a collection:
I get an error thrown from inside
azure.cosmos.cosmos_client
at line 2880:This error had not been encountered in previous uses of the library.
Investigation
Upon printing the value of
collection
just above that line, I saw that it was a JSON string, instead of a dict, as the code must have expected.So I followed the calls backwards: each time, the raw return value of the callee was returned:
cosmos_client.ReadContainer(self, collection_link, options=None)
cosmos_client.Read(self, path, type, id, initial_headers, options=None)
cosmos_client.__Get(self, path, request, headers)
synchronized_request.SynchronizedRequest(client, request, global_endpoint_manager, connection_policy, requests_session, method, path, request_data, query_params, headers)
synchronized_request._Request(global_endpoint_manager, request, connection_policy, requests_session, path, request_options, request_body)
At last, in the final
synchronized_request._Request
function, I see at the end:So this tells me that if
is_media
is set, the response is not parsed from JSON, but instead returned as-is. Injecting a littleprint(is_media)
in there told me that the flag was indeed set, so I went up to see how it was set:(note that the return type is said to be
(dict, dict)
, but we now know that to not always be the case)So
is_media
is sniffed fromrequest_options['path']
. Intrigued, Iprint(request_options['path'])
only to be greeted by a familiar sight:Conclusion
If the database id, collection id or document id (I assume) contains the substring
media
, then the result of the API GET request insynchronized_request._Request
will not be parsed as JSON, and theazure-cosmos
library will fail further down the line when expecting the returned value to be adict
rather than astr
.The obvious workaround at this point would be to just use a container with a different name. I can do this during the testing phase, but we already expect it to be called what it is, so just changing the name is not really viable. Is there a safer way of determining whether a request
is_media
?The second workaround is for us to fork this library, remove the
is_media
check and use our fork instead (whether by putting it locally in the project or hosting it somewhere for pip to pick up). This is more effort than is desirable, but may be necessary to not be affected by this bug.