Azure / azure-cosmos-js

@azure/cosmos has moved to a new repo https://github.com/Azure/azure-sdk-for-js
MIT License

SDK not respecting default account consistency #159

Closed andijakl closed 5 years ago

andijakl commented 5 years ago

Summary:

Creating an item using the create() method and then immediately retrieving it using an SQL statement works 90% of the time. In the other 10%, the query doesn't return the item, even though it was indeed created.

Steps to reproduce:

For full details and the code, see: Article, Code sample. Note: these don't mention the error, as they are part of materials I created for a university course I'm teaching.

The relevant code snippet:

`const newItemId = Math.floor(Math.random() * 1000 + 10).toString();
let documentDefinition = { "id": newItemId, "name": "Angus MacGyver", "state": "Building stuff" };

// Add a new item to the container
const createResponse = await container.items.create(documentDefinition);
console.log(createResponse.body);

// Execute SQL query to retrieve the new item
const querySpec = {
  query: "SELECT * FROM c WHERE c.id=@id",
  parameters: [
    { name: "@id", value: newItemId }
  ]
};
const queryResponse = await container.items.query(querySpec).toArray();
console.log(queryResponse.result[0].name);`

In most cases, the SQL query returns the item. From time to time, though, it doesn't, even though the Data Explorer in Azure shows that the item was indeed created. The Cosmos DB response doesn't indicate any error:

{
  result: [],
  headers: {
    'cache-control': 'no-store, no-cache',
    pragma: 'no-cache',
    'transfer-encoding': 'chunked',
    'content-type': 'application/json',
    server: 'Microsoft-HTTPAPI/2.0',
    'strict-transport-security': 'max-age=31536000',
    'x-ms-last-state-change-utc': 'Thu, 11 Oct 2018 01:27:52.965 GMT',
    'x-ms-resource-quota': 'documentSize=10240;documentsSize=10485760;documentsCount=-1;collectionSize=10485760;',
    'x-ms-resource-usage': 'documentSize=0;documentsSize=0;documentsCount=14;collectionSize=0;',
    lsn: '145',
    'x-ms-item-count': '0',
    'x-ms-schemaversion': '1.6',
    'x-ms-alt-content-path': 'dbs/ToDoList/colls/Items',
    'x-ms-content-path': 'fvlLALM+HAA=',
    'x-ms-xp-role': '2',
    'x-ms-global-committed-lsn': '145',
    'x-ms-number-of-read-regions': '1',
    'x-ms-transport-request-id': '150741',
    'x-ms-cosmos-llsn': '145',
    'x-ms-session-token': '0:-1#145',
    'x-ms-request-charge': '1.97',
    'x-ms-serviceversion': 'version=2.1.0.0',
    'x-ms-activity-id': '2b20d905-3801-4248-9852-95ebcca08d08',
    'x-ms-gatewayversion': 'version=2.1.0.0',
    date: 'Thu, 11 Oct 2018 13:01:22 GMT',
    'x-ms-throttle-retry-count': 0,
    'x-ms-throttle-retry-wait-time-ms': 0
  }
}
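Note that the snippet above indexes `queryResponse.result[0]` directly, so an empty `result` like this one surfaces as a TypeError rather than as a "not found" condition. A minimal sketch of a guard, assuming only the `{ result, headers }` response shape shown above; `firstResult` is a hypothetical helper name and not part of @azure/cosmos:

```javascript
// Hypothetical helper (not part of @azure/cosmos): returns the first query
// result, or null when the response contains no items.
function firstResult(queryResponse) {
  const items = Array.isArray(queryResponse && queryResponse.result)
    ? queryResponse.result
    : [];
  return items.length > 0 ? items[0] : null;
}

// Usage with the two response shapes seen in this issue.
const emptyResponse = { result: [], headers: { "x-ms-item-count": "0" } };
const foundResponse = { result: [{ id: "42", name: "Angus MacGyver" }], headers: {} };

console.log(firstResult(emptyResponse)); // null
console.log(firstResult(foundResponse).name); // Angus MacGyver
```

This only makes the miss explicit; it does not explain why the item is missing from the result.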

Now, I suspect the error could be related to some throughput / throttling issue. However, it's then strange that the code works most of the time and fails at random intervals, without the API informing me about an issue. Also, if the JavaScript API to create the item returns with success and all the item details (including _rid, _self, ...), I'd expect the item to be retrievable.

I searched the documentation for any hints or best practices, but didn't find any. Inserting a sleep statement into the Node.js code isn't really a great workaround either.
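For anyone who needs a stopgap in the meantime, a bounded retry is at least more predictable than a fixed sleep. The following is only a sketch under assumptions: `retryUntilFound`, its option names, and the delay values are hypothetical, and it papers over the symptom rather than fixing the cause.

```javascript
// Hypothetical helper: retries an async lookup until it yields a value other
// than null/undefined, or until the attempts run out. Returns null on failure.
async function retryUntilFound(lookup, { attempts = 5, delayMs = 100 } = {}) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    const value = await lookup();
    if (value !== null && value !== undefined) {
      return value;
    }
    if (attempt < attempts) {
      // Linear backoff: wait a little longer after each miss.
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
  return null;
}

// Demo with a fake lookup that only succeeds on its third call.
let demoCalls = 0;
const flakyLookup = async () => (++demoCalls < 3 ? null : { id: "42" });

retryUntilFound(flakyLookup, { attempts: 5, delayMs: 10 }).then((item) => {
  console.log("found:", item, "after", demoCalls, "calls");
});
```

In the code from this issue, the `lookup` callback would wrap the `container.items.query(querySpec).toArray()` call and return the first result or `null`.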

I'm therefore reporting this as a potential issue: the API reports that the item was created, but it is not (yet) retrievable.

Environment

Cosmos DB JS library version: 2.0.2
Node.js version: 8.12.0 (latest LTS)

southpolesteve commented 5 years ago

@andijakl Could you see if you observe the same results with session-level consistency? You can set it during client construction:

const client = new CosmosClient({
  endpoint,
  auth: { masterKey },
  consistencyLevel: "Session"
});

Edit: Removed incorrect comment about default consistency

tony-gutierrez commented 5 years ago

What was the default consistency of the original SDK? We would not want to adopt this one without understanding what is changing.

andijakl commented 5 years ago

Thanks for the tip! Actually, it did seem to help and I couldn't reproduce the issue anymore with the consistency level set to "session".

However, I still can't explain the behavior.

The default consistency of my Cosmos DB account in Azure is already set to "Session", so I'd assume the SDK doesn't silently override that setting. I do have geo-redundancy enabled with two regions, but as both requests are made in a single session, I was surprised that "read-your-writes" didn't work.

If I don't explicitly specify the consistency level when constructing the CosmosClient in Node.js, the following logs "undefined" after I construct the client:

console.log("Consistency level: " + cosmosClient.options.consistencyLevel);

That would hint at the SDK not setting its own consistency level and keeping the "Session" default from the Azure configuration. I couldn't analyze what the Node.js app then actually sends to the Azure server with the request, though.
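As background for what would need to go over the wire: at the REST level, session consistency relies on the client echoing the `x-ms-session-token` from a write back on later reads, alongside an `x-ms-consistency-level` header. The sketch below only illustrates that mechanism. `sessionReadHeaders` is a hypothetical helper, it says nothing about how the SDK builds requests internally, and the token value is the one from the response posted earlier in this issue.

```javascript
// Illustrative only: NOT the SDK's internal implementation.
// Builds the request headers a read would carry to get read-your-writes
// behavior, given the response headers of a preceding write.
function sessionReadHeaders(writeResponseHeaders) {
  const token = writeResponseHeaders["x-ms-session-token"];
  if (!token) {
    throw new Error("Write response carried no x-ms-session-token header");
  }
  return {
    "x-ms-consistency-level": "Session",
    "x-ms-session-token": token,
  };
}

// Token value taken from the response headers shown earlier in this thread.
const readHeaders = sessionReadHeaders({ "x-ms-session-token": "0:-1#145" });
console.log(readHeaders);
```

If a read goes out without these headers, the service has no session to honor, which would be consistent with the behavior described here.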

southpolesteve commented 5 years ago

@tony-gutierrez There are no changes to this behavior between the old SDK and the new one. I was mistaken in the above comment. The SDK should use your account level consistency setting.

@andijakl Thanks for the additional info. I agree that the behavior still sounds strange. I'll do some investigating. One additional question: what is your indexing policy set to? I wonder if you'll get the same results doing a read instead of a query.

southpolesteve commented 5 years ago

@andijakl I'm able to replicate this bug. We're not respecting the default account consistency setting in the SDK. The workaround is to set the consistency explicitly as you did above. We'll look into a fix. I changed the title to reflect the underlying issue.
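A minimal sketch of that workaround, assuming the client options shape from the earlier comment (`endpoint`, `auth: { masterKey }`, `consistencyLevel`). `buildClientOptions` is a hypothetical helper, and the endpoint and key below are placeholders:

```javascript
// Consistency levels offered by Cosmos DB.
const CONSISTENCY_LEVELS = [
  "Strong",
  "BoundedStaleness",
  "Session",
  "ConsistentPrefix",
  "Eventual",
];

// Hypothetical helper: builds the options object for `new CosmosClient(...)`
// so that the consistency level is always set explicitly.
function buildClientOptions(endpoint, masterKey, consistencyLevel = "Session") {
  if (!CONSISTENCY_LEVELS.includes(consistencyLevel)) {
    throw new Error(`Unknown consistency level: ${consistencyLevel}`);
  }
  return { endpoint, auth: { masterKey }, consistencyLevel };
}

// Placeholder endpoint and key, for illustration only.
const clientOptions = buildClientOptions(
  "https://example.documents.azure.com:443/",
  "<master-key>"
);
console.log(clientOptions.consistencyLevel); // Session

// With the SDK installed, the options would be passed on like this:
// const client = new CosmosClient(clientOptions);
```

Passing the level explicitly should match the account default rather than rely on the SDK to pick it up.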

andijakl commented 5 years ago

The indexing policy for the collection is set to the following in the Azure portal: "indexingMode": "consistent"

Retrieving the item through a read call instead of a query results in the same behavior:

const { body } = await container.item(newItemId).read();

Great to hear you could reproduce the issue!

christopheranderson commented 5 years ago

I'll pick this up. @southpolesteve, catch me up on this issue on Teams/whatever.

github-actions[bot] commented 5 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days