Closed xtazz closed 5 years ago
This is planned in a future version
Any updates on this? We are seeing very poor performance from Document DB when used via Node.
We have started work on enabling this. Don't have a committed date as yet though.
Could you elaborate on "very poor performance"? Is it only when using Node? What are the performance tiers for your collections? Are all operations performing really badly, or just some? And the obvious question: is your application deployed in Azure, in the same region as the database?
I work with @danbucholtz and can answer to how we're using documentDb.
The main killer we have is that it takes upwards of 4 seconds for some web requests to make it through our web app api to docDb and back. The web calls that take this long are ones where we are retrieving full document lists, like "select * from root". For instance it took 4.5 seconds to retrieve 1200 records (all the records in one of our collections).
Our collections are S2. Operations that return a large dataset are the ones I would say are performing badly. Our application is deployed via the Azure Web App service. Previously, the DocDB and Web App were NOT in the same region; since moving them to the same region, we have seen significant performance improvements (our ~4s calls dropped to ~3s).
We prototyped a .NET client making the same calls via direct mode (TCP, not HTTP) and saw further performance improvements, which is why we would like to see TCP support for our Node.js web app as soon as possible.
We also fiddled around with the page size in the documentDb client; increasing it to 1000 gave us a slight performance increase as well.
I have also noticed that queries like "select r.id from root" perform much better than "select * from root", so that is another tweak I'm thinking about making to our application, even though we typically retrieve and act on whole documents for flexibility.
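The two tweaks above (a larger page size and projecting only the fields you need) can be sketched with the legacy `documentdb` SDK. This is a rough illustration, not code from the thread: the collection link is a placeholder, and `fetchIds` is a hypothetical helper name.

```javascript
// Sketch, assuming the legacy `documentdb` npm package.
// The collection link below is a placeholder, not a real value.
const collectionLink = 'dbs/mydb/colls/mycoll';

// Larger pages mean fewer round trips when pulling big result sets.
const feedOptions = { maxItemCount: 1000 };

// Projecting just the fields you need ("select r.id ...") moves far
// less data over the wire than "select * ...".
const querySpec = {
  query: 'SELECT r.id FROM root r'
};

// Hypothetical helper, not invoked here: `client` would be a
// DocumentClient instance.
function fetchIds(client, callback) {
  client
    .queryDocuments(collectionLink, querySpec, feedOptions)
    .toArray(callback);
}
```

Whether the larger page size helps depends on how many documents each query actually returns; for small result sets it makes no difference.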
Any update/timeline on this?
Nothing as yet, no. I'd still like to explore under what circumstances you believe the node code is slower than .net.
I'm seeing this too.
I wrote a generic wrapper with some benchmarks thrown in:
Code:
Response:
"Query" refers to the time elapsed while running this method:
this.dbclient.queryDatabases({
  query: 'SELECT * FROM root r WHERE r.id=@id',
  parameters: [{
    name: '@id',
    value: this.database
  }]
})...
"Operation" refers to the time elapsed to run this method:
this.dbclient.readCollections(this.rawDatabaseLink)....
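A minimal version of that kind of benchmark wrapper might look like the sketch below. The names (`timeCall`, `label`) are illustrative, and the wrapped call is whatever callback-style SDK method you are measuring:

```javascript
// Sketch of a generic benchmark wrapper: times any callback-style
// SDK call and reports elapsed milliseconds.
function timeCall(label, run, done) {
  const start = process.hrtime();
  run((err, result) => {
    const [s, ns] = process.hrtime(start);
    const elapsedMs = s * 1000 + ns / 1e6;
    console.log(`${label}: ${elapsedMs.toFixed(1)} ms`);
    done(err, result, elapsedMs);
  });
}

// Usage (not executed here): wrap the "Query" call from above.
// timeCall('Query', cb =>
//   dbclient.queryDatabases(querySpec).toArray(cb), (err, dbs) => {});
```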
This is a brand new DocumentDB setup with a single collection and no documents.
I'm running with Node 6.9.1 on a minimal Alpine Linux instance (in a Docker container). I've also used the exact same environment with an old remote MongoDB (hosted on mlab); the Mongo instance has sub-second responses. I have no reason to believe there is anything interfering with the network (I get the same results when running this container in an Azure VM).
I'm just testing DocumentDB at this point, and only with Node.
> Operations that return a large dataset are the ones I would say are performing badly
I have also noticed the same problems with collections containing 1000+ documents. At first I thought .toArray() took a long time since it buffers the content, but if I do the same query using .nextItem() to avoid buffering, the data starts arriving straight away, yet in total it still takes the same time to get all the elements.
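That streaming pattern over the SDK's query iterator looks roughly like the sketch below. `iterator` would come from e.g. `client.queryDocuments(...)`, and `drain`, `onItem`, and `onDone` are illustrative names, not SDK APIs:

```javascript
// Sketch: consume a query iterator item by item instead of buffering
// everything with .toArray(). The iterator's nextItem() callback
// receives `undefined` once the results are exhausted.
function drain(iterator, onItem, onDone) {
  iterator.nextItem((err, item) => {
    if (err) return onDone(err);
    if (item === undefined) return onDone(null); // iterator exhausted
    onItem(item); // first items arrive without waiting for the rest
    drain(iterator, onItem, onDone);
  });
}
```

As noted above, this improves time-to-first-item but not total query time, since the same pages still have to come over the wire.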
> I'm running with Node 6.9.1 on a minimal Alpine Linux instance (in a Docker container).
I'm running a similar setup. Using node 6.8.0 and docker containers.
As a side note, I've played around with consistency levels. Changing from "Strong" to "Session" improved our read operations quite a bit: it cut latency by at least 50% and gave a 3x improvement in requests per second to our API. (We used https://github.com/mcollina/autocannon#readme for this.)
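With the legacy `documentdb` SDK, the consistency level can be passed as the fourth DocumentClient constructor argument. A minimal sketch, assuming that package; the endpoint and key are placeholders, and `makeSessionClient` is a hypothetical helper:

```javascript
// Sketch, assuming the legacy `documentdb` package's DocumentClient
// signature: (endpoint, auth, connectionPolicy, consistencyLevel).
// DocumentClient is passed in here so the sketch stays self-contained.
function makeSessionClient(DocumentClient, endpoint, masterKey) {
  // null connection policy = SDK defaults; 'Session' relaxes reads
  // compared to 'Strong', which is where the latency win came from.
  return new DocumentClient(endpoint, { masterKey }, null, 'Session');
}
```

Session consistency still guarantees read-your-own-writes within a session, which is often all a web API needs.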
Do we have any planned date on this? It's a pretty important feature and let's just say DocumentDB isn't cheap so can we please get some kind of timeline on when this will be available?
@ddolheguy we don't have a fixed plan for supporting direct connectivity in our nodejs sdk.
Our .NET and Java SDKs have support for direct connectivity and can be used for better performance.
Also, please try to avoid using .toArray() if you know there are a large number of matching results. toArray() retrieves all results and buffers them in memory, so it is slow and may also put pressure on memory. The right approach is to use nextItem() (or executeNext()), which retrieves results in a streaming manner. Using nextItem() (or executeNext()) should improve your app's responsiveness.
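The executeNext() variant of that advice processes one page at a time. A rough sketch; `iterator` would come from `client.queryDocuments(...)`, and `forEachPage`, `onPage`, and `onDone` are illustrative names:

```javascript
// Sketch: page through results with executeNext() so only one page
// (up to maxItemCount items) is held in memory at a time. The
// iterator exposes hasMoreResults() and executeNext(callback).
function forEachPage(iterator, onPage, onDone) {
  if (!iterator.hasMoreResults()) return onDone(null);
  iterator.executeNext((err, page) => {
    if (err) return onDone(err);
    onPage(page); // process this batch, then fetch the next
    forEachPage(iterator, onPage, onDone);
  });
}
```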
Can this be done by the community? It's hard not to have a feature that is advertised and paid for. Where should we start?
Is there any limitation preventing direct connectivity support in the Node SDK? Why is it not being added to the Node SDK as well? Our application's interactions with DocumentDB have been coded using the Node SDK, and asking us to rewrite in Java or .NET is not a reasonable ask given that Azure advertises Node.js support as well.
MS doesn't put any effort into making their Node SDKs have feature parity with .NET. It's an old dinosaur company.
Has anyone tried concurrent queries? DocumentDB is a joke when it comes to performance.
We have recently announced the deprecation of version 1.x of the Azure Cosmos JavaScript SDK. We will end support for the documentdb package and this repo on August 30, 2020. Please update to our new package @azure/cosmos as soon as possible. If you encounter any issues, you can raise them in the Azure central JS SDK repo. If something is preventing you from upgrading to the latest version of the SDK, you can always email me directly: stfaul@microsoft.com
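For anyone migrating, the equivalent of the queries discussed above under the replacement `@azure/cosmos` package looks roughly like this sketch (promise-based API; the endpoint, key, and ids are placeholders, and `fetchIds` is a hypothetical helper):

```javascript
// Sketch, assuming the replacement @azure/cosmos package.
// The require is inside the function so the sketch loads standalone.
async function fetchIds(endpoint, key, dbId, containerId) {
  const { CosmosClient } = require('@azure/cosmos');
  const client = new CosmosClient({ endpoint, key });
  const container = client.database(dbId).container(containerId);
  // fetchAll() buffers; container.items.query(...) also supports
  // async iteration for streaming large result sets.
  const { resources } = await container.items
    .query('SELECT c.id FROM c')
    .fetchAll();
  return resources;
}
```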
It would be nice for DocumentDB Node.js SDK to support Direct mode via TCP to speed up communication with the DB. Thank you!