Invalid ResourceId's, cannot use with Java SDK

badeball commented 5 years ago

Hello, members of Zeit

With the official CosmosDB emulator not running on non-windows, this is more or less exactly what I'm looking for. I was however not able to use it together with the Java SDK.

It seems like the Node SDK is quite lenient and performs little to no validation on the resource ids*. Hence, generating RID's like you do probably work (I haven't actually tested your implementation with the Node SDK, but I assume it does given the readme etc).

The Java SDK however is a bit more strict and assigns much more meaning to them, as shown here. In addition to be accepted by the parser, they also need to successfully serialize back to the original string.

Since your implementation only generates a quite random ID (of 14 bytes?), the Java SDK will fail on inserting documents, as shown below.

java.lang.IllegalArgumentException: Invalid resource id PEtvD0--C30lC30lC30lC0--
    at com.microsoft.azure.documentdb.internal.ResourceId.parse(ResourceId.java:57)
    at com.microsoft.azure.documentdb.internal.routing.CollectionCache.resolveByRid(CollectionCache.java:116)
    at com.microsoft.azure.documentdb.internal.routing.CollectionCache.resolveCollection(CollectionCache.java:49)
    at com.microsoft.azure.documentdb.internal.SessionContainer.resolvePartitionKeyRange(SessionContainer.java:217)
    at com.microsoft.azure.documentdb.internal.SessionContainer.resolveSessionToken(SessionContainer.java:123)
    at com.microsoft.azure.documentdb.DocumentClient.applySessionToken(DocumentClient.java:3222)
    at com.microsoft.azure.documentdb.DocumentClient.doQuery(DocumentClient.java:3143)
    at com.microsoft.azure.documentdb.DocumentQueryClientInternal.doQuery(DocumentQueryClientInternal.java:47)
    at com.microsoft.azure.documentdb.internal.query.AbstractQueryExecutionContext.executeRequest(AbstractQueryExecutionContext.java:219)
    at com.microsoft.azure.documentdb.internal.query.DefaultQueryExecutionContext.executeOnce(DefaultQueryExecutionContext.java:159)
    at com.microsoft.azure.documentdb.internal.query.DefaultQueryExecutionContext.fillBuffer(DefaultQueryExecutionContext.java:99)
    at com.microsoft.azure.documentdb.internal.query.DefaultQueryExecutionContext.hasNext(DefaultQueryExecutionContext.java:71)
    at com.microsoft.azure.documentdb.internal.query.ProxyQueryExecutionContext.<init>(ProxyQueryExecutionContext.java:67)
    at com.microsoft.azure.documentdb.internal.query.QueryExecutionContextFactory.createQueryExecutionContext(QueryExecutionContextFactory.java:23)
    at com.microsoft.azure.documentdb.QueryIterable.createQueryExecutionContext(QueryIterable.java:70)
    at com.microsoft.azure.documentdb.QueryIterable.reset(QueryIterable.java:115)
    at com.microsoft.azure.documentdb.QueryIterable.<init>(QueryIterable.java:57)
    at com.microsoft.azure.documentdb.DocumentClient.queryDocuments(DocumentClient.java:1167)
    at com.microsoft.azure.documentdb.DocumentClient.queryDocuments(DocumentClient.java:1138)

* The source linked is of a quite old version. I can't seem to find the source for the most recent version of the Java SDK anywhere, reported here. This is obviously quite unfortunate, who knows what the SDK is doing these days.

badeball commented 5 years ago

I found that there's another Java SDK, where source is provided and the RID parse implementation is similar: https://github.com/Azure/azure-cosmosdb-java/blob/v2.6.0/commons/src/main/java/com/microsoft/azure/cosmosdb/internal/ResourceId.java.

nkzawa commented 5 years ago

Thanks for the report! Actually, we just create random strings with a few rules for now, since I was not sure how they should be constructed. The tryParse code is very informative and we will be able to fix id generation according to it.

southpolesteve commented 5 years ago

@nkzawa FYI there is a JS version of this file as well which may help https://github.com/Azure/azure-cosmos-js/blob/c1be5cc0fcc80670ae6c2a6b5f133530e7889db9/src/common/resourceId.ts

In practice, the JS SDK doesn't use any of this information which is why it was later removed

nkzawa commented 5 years ago

@southpolesteve It seems we can almost just copy and apply the file, thank you.

Btw, do you know how exactly each "id" is assigned on database? As far as I observed, _rid is incremented as base64 when new documents are created like the following.

1: "wkc8APV7cAAFAAAAAAAAAA=="
2: "wkc8APV7cAAGAAAAAAAAAA==" // "F" to "G"
3: "wkc8APV7cAAHAAAAAAAAAA==" // "G" to "H"
...

If you increment document id as an integer, it doesn't become like that 🤔


const id = new ResouceId();
id.database = "1";
id.documentCollection = "1" 
id.document = "1";
console.log(id.toString()); // 'AAAAAQAAAAEAAAAAAAAAAQ=='

// for the next document (wrong)
id.document = "2";
console.log(id.toString());  // 'AAAAAQAAAAEAAAAAAAAAAg=='

southpolesteve commented 5 years ago

I am not sure but I think you'll be fine with how you are doing it in the PR. The ResourceId just cares about the structure to determine the type of resource it is talking to. I don't think you need to worry about the exact behavior of writing new docs on _rid

nkzawa commented 5 years ago

@badeball sorry for the delay. The problem should be fixed on the latest version 0.2.10. Let me know it didn't work for you.

badeball commented 5 years ago

Hmm, I can't seem to perform any query using the Java SDK. It fails on the read-collection-endpoint, where it is provided with dbId=AAAAAQ== and collId=AAAAAQ==. This happens when I create the database and collection using the Node SDK and attempting to query it using the Java SDK.

Here's the stack trace, if it's of any interest.

java.lang.IllegalStateException: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: com.microsoft.azure.documentdb.DocumentClientException: null, StatusCode: NotFound
    at com.microsoft.azure.documentdb.internal.routing.CollectionCache.resolveByRid(CollectionCache.java:132)
    at com.microsoft.azure.documentdb.internal.routing.CollectionCache.resolveCollection(CollectionCache.java:49)
    at com.microsoft.azure.documentdb.internal.SessionContainer.resolvePartitionKeyRange(SessionContainer.java:217)
    at com.microsoft.azure.documentdb.internal.SessionContainer.resolveSessionToken(SessionContainer.java:123)
    at com.microsoft.azure.documentdb.DocumentClient.applySessionToken(DocumentClient.java:3222)
    at com.microsoft.azure.documentdb.DocumentClient.doQuery(DocumentClient.java:3143)
    at com.microsoft.azure.documentdb.DocumentQueryClientInternal.doQuery(DocumentQueryClientInternal.java:47)
    at com.microsoft.azure.documentdb.internal.query.AbstractQueryExecutionContext.executeRequest(AbstractQueryExecutionContext.java:219)
    at com.microsoft.azure.documentdb.internal.query.DefaultQueryExecutionContext.executeOnce(DefaultQueryExecutionContext.java:159)
    at com.microsoft.azure.documentdb.internal.query.DefaultQueryExecutionContext.fillBuffer(DefaultQueryExecutionContext.java:99)
    at com.microsoft.azure.documentdb.internal.query.DefaultQueryExecutionContext.hasNext(DefaultQueryExecutionContext.java:71)
    at com.microsoft.azure.documentdb.internal.query.ProxyQueryExecutionContext.<init>(ProxyQueryExecutionContext.java:67)
    at com.microsoft.azure.documentdb.internal.query.QueryExecutionContextFactory.createQueryExecutionContext(QueryExecutionContextFactory.java:23)
    at com.microsoft.azure.documentdb.QueryIterable.createQueryExecutionContext(QueryIterable.java:70)
    at com.microsoft.azure.documentdb.QueryIterable.reset(QueryIterable.java:115)
    at com.microsoft.azure.documentdb.QueryIterable.<init>(QueryIterable.java:57)
    at com.microsoft.azure.documentdb.DocumentClient.queryDocuments(DocumentClient.java:1167)
    at com.microsoft.azure.documentdb.DocumentClient.queryDocuments(DocumentClient.java:1138)

nkzawa commented 5 years ago

@badeball Thank you again for the report. Not sure why but maybe Java client is stricter or something on other parts too. I created the issue for it.

vercel / cosmosdb-server

Invalid ResourceId's, cannot use with Java SDK #7