Azure / azure-cosmosdb-node

We recently announced deprecation of the JS v1 SDK and this repo. Starting September 2020, Microsoft will not provide support for this library. Existing applications using the library will continue to work as-is. We strongly recommend upgrading to the @azure/cosmos library.
https://github.com/Azure/azure-sdk-for-js
MIT License

Support for Direct mode #78

Closed xtazz closed 5 years ago

xtazz commented 8 years ago

It would be nice for DocumentDB Node.js SDK to support Direct mode via TCP to speed up communication with the DB. Thank you!

ghost commented 8 years ago

This is planned in a future version

danbucholtz commented 8 years ago

Any updates on this? We are seeing very poor performance from Document DB when used via Node.

ghost commented 8 years ago

We have started work on enabling this. Don't have a committed date as yet though.

Could you elaborate on "very poor performance"? Is it only when using Node? What are the performance tiers for your collections? Are all operations performing really badly, or just some? And the obvious question: is your application deployed in Azure, in the same region as the database?

Jacob-McKay commented 8 years ago

I work with @danbucholtz and can answer to how we're using documentDb.

The main killer we have is that it takes upwards of 4 seconds for some web requests to make it through our web app api to docDb and back. The web calls that take this long are ones where we are retrieving full document lists, like "select * from root". For instance it took 4.5 seconds to retrieve 1200 records (all the records in one of our collections).

Our collections are S2. Operations that return a large dataset are the ones I would say are performing badly. Our application is deployed via Azure Web App service. Previously, the DocDb and Web App were NOT in the same region. Since moving them to the same region, we have seen significant performance improvements (our ~4s calls dropped to ~3s).

We prototyped a .NET client making the same calls via direct mode (TCP, not HTTP) and saw further improvements in performance, which is why we would like to see TCP support in our node.js web app as soon as possible.

We also fiddled around with the page size in the documentDb client, increasing it to 1000 gave us a slight performance increase as well.

I have also noticed that queries like "select r.id from root" perform much better than "select * from root", so that is another tweak I'm thinking about making to our application, as we typically retrieve and act on whole documents for flexibility.
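The two tweaks described above (a larger page size and projecting only the needed fields) could be combined roughly as follows. This is a minimal sketch, not code from the application discussed here: `buildProjectionQuery`, `buildFeedOptions`, and the field names are hypothetical, and `maxItemCount` is assumed to be the feed option the v1 `documentdb` package uses for page size.

```js
'use strict'

// Build a query spec that projects only the listed fields instead of SELECT *.
// `fields` is a hypothetical list of property names chosen by the caller.
function buildProjectionQuery (fields) {
  const projection = fields.map((name) => 'r.' + name).join(', ')
  return { query: 'SELECT ' + projection + ' FROM root r', parameters: [] }
}

// Feed options asking for up to `pageSize` items per round trip.
// Assumption: `maxItemCount` is the page-size option of the v1 SDK.
function buildFeedOptions (pageSize) {
  return { maxItemCount: pageSize }
}

const querySpec = buildProjectionQuery(['id', 'name'])
const feedOptions = buildFeedOptions(1000)

// With a real client these would be passed along as, for example:
//   client.queryDocuments(collectionLink, querySpec, feedOptions)
console.log(querySpec.query, feedOptions)
```

Projecting fewer fields shrinks each response, while a larger page size reduces the number of round trips; which one matters more depends on document size and count.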

danbucholtz commented 8 years ago

Any update/timeline on this?

ghost commented 8 years ago

Nothing as yet, no. I'd still like to explore under what circumstances you believe the node code is slower than .net.

coreybutler commented 7 years ago

I'm seeing this too.

I wrote a generic wrapper with some benchmarks thrown in:

Code:

```js
'use strict'

// Create the DocumentDB client
const DocumentDbClient = require('documentdb').DocumentClient

class DatabaseClient extends NGN.EventEmitter {
  constructor (config) {
    super()

    let me = this

    // Basic DocumentDB configuration
    Object.defineProperties(this, {
      database: NGN.const(config.database),
      collection: NGN.const(config.collection),
      dbclient: NGN.privateconst(new DocumentDbClient(config.host, {
        masterKey: config.key
      })),
      rawDatabaseLink: NGN.private('dbs/' + config.database),
      rawCollectionLink: NGN.private('dbs/' + config.database + '/' + config.collection)
    })

    // Retrieve DB Connection Data
    let start = process.hrtime()

    this.dbclient.queryDatabases({
      query: 'SELECT * FROM root r WHERE r.id=@id',
      parameters: [{
        name: '@id',
        value: this.database
      }]
    }).toArray((err, results) => {
      let diff = process.hrtime(start)
      let queryDuration = ((diff[0] * 1e9 + diff[1])/1000000000)
      start = process.hrtime()

      console.log('Query took', queryDuration, 'seconds')

      this.rawDatabaseLink = results[0]._self

      this.dbclient.readCollections(this.rawDatabaseLink).toArray((err, collections) => {
        let diff = process.hrtime(start)
        let opDuration = ((diff[0] * 1e9 + diff[1])/1000000000)

        console.log('Operation took', opDuration, 'seconds')

        this.rawCollectionLink = collections[0]._self
        this.emit('ready')
      })
    })
  }

  get databaseLink () {
    return this.rawDatabaseLink
  }

  get collectionLink () {
    return this.rawCollectionLink
  }

  get client () {
    return this.dbclient
  }

  create () {

  }
}

module.exports = DatabaseClient
```

Response:

[image: console output showing the "Query" and "Operation" timings]

"Query" refers to the time elapsed while running this method:

this.dbclient.queryDatabases({
  query: 'SELECT * FROM root r WHERE r.id=@id',
  parameters: [{
    name: '@id',
    value: this.database
  }]
})...

"Operation" refers to the time elapsed to run this method:

this.dbclient.readCollections(this.rawDatabaseLink)....

This is a brand new DocumentDB setup with a single collection and no documents.

I'm running with Node 6.9.1 on a minimal Alpine Linux instance (in a Docker container). I've also used the exact same environment with an old remote MongoDB (hosted on mlab). The Mongo instance has sub-second responses. I have no reason to believe there is anything interfering with the network (I get the same results when running this container in an Azure VM).

I'm just testing DocumentDB at this point, and only with Node.
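The timing arithmetic in the wrapper above can be factored into a small helper so every call is measured the same way. This is only a sketch: `toSeconds` and `timed` are hypothetical names, and the operation being timed here is a stand-in rather than a real DocumentDB call.

```js
'use strict'

// Convert a process.hrtime() diff ([seconds, nanoseconds]) into seconds.
function toSeconds (diff) {
  return diff[0] + diff[1] / 1e9
}

// Time a Node-style callback operation and report the elapsed seconds.
// `label` and `operation` are illustrative parameters, not SDK names.
function timed (label, operation, done) {
  const start = process.hrtime()
  operation((err, result) => {
    const seconds = toSeconds(process.hrtime(start))
    console.log(label, 'took', seconds, 'seconds')
    done(err, result, seconds)
  })
}

// Usage with a stand-in operation; a real call would wrap something like
// client.readCollections(databaseLink).toArray(cb).
timed('Operation', (cb) => setImmediate(cb, null, []), (err) => {
  if (err) throw err
})
```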

ralphtheninja commented 7 years ago

> Operations that return a large dataset are the ones I would say are performing badly

I have also noticed the same problems with collections containing 1000+ documents. At first I thought .toArray() took a long time since it's buffering the content, but if I run the same query using .nextItem() to avoid buffering, the data starts arriving straight away, yet in total it still takes the same time to get all the elements.

ralphtheninja commented 7 years ago

> I'm running with Node 6.9.1 on a minimal Alpine Linux instance (in a Docker container). I've also used the exact same environment with an old remote MongoDB (hosted on mlab). The Mongo instance has sub-second responses. I have no reason to believe there is anything interfering with the network (I get the same results when running this container in an Azure VM).

I'm running a similar setup. Using node 6.8.0 and docker containers.

ralphtheninja commented 7 years ago

As a side note, I've played around with consistency levels. Changing from "Strong" to "Session" improved our read operations quite a bit. It took down latency by at least 50% and gave a 3x improvement in requests per second to our API. (Used https://github.com/mcollina/autocannon#readme for this)
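For readers who want to try the same change, a consistency level can be requested when the client is created. The sketch below is an assumption-laden illustration, not the setup used for the numbers above: it assumes the v1 `documentdb` constructor takes `(endpoint, auth, connectionPolicy, consistencyLevel)`, and it injects the constructor so the example runs without the package installed. `createSessionClient`, `RecordingClient`, the host, and the key are all hypothetical.

```js
'use strict'

// Create a client with an explicit consistency level.
// `DocumentClient` is injected so the sketch runs without the package
// installed; in an application it would come from
// require('documentdb').DocumentClient.
// Assumption: the constructor takes (endpoint, auth, connectionPolicy,
// consistencyLevel), as in the v1 documentdb package.
function createSessionClient (DocumentClient, config) {
  const connectionPolicy = undefined // keep the SDK default policy
  return new DocumentClient(
    config.host,
    { masterKey: config.key },
    connectionPolicy,
    'Session'
  )
}

// Stand-in constructor that only records its arguments (hypothetical).
class RecordingClient {
  constructor (endpoint, auth, connectionPolicy, consistencyLevel) {
    this.endpoint = endpoint
    this.auth = auth
    this.consistencyLevel = consistencyLevel
  }
}

const client = createSessionClient(RecordingClient, {
  host: 'https://example.documents.azure.com:443/',
  key: 'not-a-real-key'
})
console.log(client.consistencyLevel)
```

Whether a weaker level is acceptable depends on the application's read-your-writes requirements, so the latency gain should be weighed against that.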

ddolheguy commented 7 years ago

Do we have any planned date on this? It's a pretty important feature and let's just say DocumentDB isn't cheap so can we please get some kind of timeline on when this will be available?

moderakh commented 7 years ago

@ddolheguy we don't have a fixed plan for supporting direct connectivity in our nodejs sdk.

Our .NET and Java SDKs have support for direct connectivity and can be used for better performance.

Also, please try to avoid using .toArray() if you know there are a large number of matching results. toArray(.) retrieves all results and buffers them in memory, so it is slow and may also put pressure on memory. The right approach is to use either nextItem(.) or executeNext(.), which retrieve the results in a streaming manner. Using nextItem(.) (or executeNext(.)) should improve your app's responsiveness.
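The streaming approach recommended above might look roughly like this. It is a sketch under assumptions: `streamPages` and `fakeIterator` are hypothetical helpers, and the stand-in iterator only mimics the `hasMoreResults()` / `executeNext(callback)` shape of the SDK's query iterator so the example runs without a database.

```js
'use strict'

// Drain a query iterator page by page, handing each page to `onPage`
// as soon as it arrives instead of buffering everything like toArray().
function streamPages (iterator, onPage, done) {
  function next () {
    if (!iterator.hasMoreResults()) return done(null)
    iterator.executeNext((err, page) => {
      if (err) return done(err)
      onPage(page || [])
      next()
    })
  }
  next()
}

// Stand-in iterator (hypothetical) that mimics the callback shape of the
// SDK's query iterator so the sketch runs without a database.
function fakeIterator (pages) {
  let index = 0
  return {
    hasMoreResults: () => index < pages.length,
    executeNext: (callback) => callback(null, pages[index++])
  }
}

const seen = []
let finished = false
streamPages(
  fakeIterator([[{ id: 'a' }, { id: 'b' }], [{ id: 'c' }]]),
  (page) => page.forEach((doc) => seen.push(doc.id)),
  (err) => {
    if (err) throw err
    finished = true
  }
)
console.log(seen)
```

As reported earlier in this thread, streaming mainly changes when the first results become available; the total time to fetch every document may stay about the same.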

DDeme commented 6 years ago

Can this be done by the community? It's hard not to have an advertised and paid feature. Where should we start?

lilyS123 commented 6 years ago

Is there any limitation that prevents supporting direct connectivity in the Node SDK? Why is it not being added to the Node SDK as well? Our application's interactions with DocumentDB have been coded using the Node SDK, and asking us to rewrite in Java or .NET is not a reasonable ask given that Azure advertises Node.js support as well.

tony-gutierrez commented 6 years ago

MS doesn't put any effort into making their Node SDKs have feature parity with .NET. It's an old dinosaur company.

chhantyal commented 6 years ago

Has anyone tried concurrent queries? DocumentDB is a joke when it comes to performance.

southpolesteve commented 5 years ago

We have recently announced the deprecation of version 1.x of the Azure Cosmos JavaScript SDK. We will end support for the documentdb package and this repo on August 30, 2020. Please update to our new package @azure/cosmos as soon as possible. If you encounter any issues, you can raise them in the Azure central JS SDK repo. If something is preventing you from upgrading to the latest version of the SDK, you can always email me directly: stfaul@microsoft.com