PouchDB: evalFunctionInVm performance issue.

enricogior commented 7 years ago

The map reduce code invokes evalFunctionInVm even for non user defined functions.

https://github.com/pouchdb/pouchdb/blob/master/packages/node_modules/pouchdb-mapreduce/src/index.js#L477

https://github.com/pouchdb/pouchdb/blob/master/packages/node_modules/pouchdb-mapreduce/src/evalFunctionInVm.js

Test results using https://github.com/czyzm/TestViewsJx with JXcore and V8:

using evalFunctionInVm, the query is executed in 1.5s
replacing evalFunctionInVm with a direct call to view.mapFun, the query is executed in 0.5s

enricogior commented 7 years ago

updateViewInQueue(view) calls processNextBatch()

processNextBatch() calls tryCode(view.sourceDB, mapFun, [doc])

tryCode(db, fun, args) calls fun.apply(null, args) where fun maps to mapFun that in our case is mapFun = evalFunction(view.mapFun.toString(), emit);

evalFunction maps to evalFunctionInVm

The sandbox (aka VM) is created for every item in the result list and that causes a significant performance issue.

yaronyg commented 7 years ago

Note that this bug started life as thaliproject/jxcore#82, which itself started life from thaliproject/leveldown-mobile#4 (where there is sample code that repro's this issue).

yaronyg commented 7 years ago

So it seems like the core of our problem is that vm.runInNewContext runs 90x too slow on JXcore versus anywhere else. So we want to get rid of it.

I sent mail to Nolan looking at three options:

Don't use persistent indexes - If we hit Rockwell's code hard enough I'm sure we could get rid of the persistent views but probably only if we do really ugly things.
Put in a switch to get rid of evalFunction - I'm assuming that PouchDB runs the views through evalFunction (and hence vm.runInNewContext) for security reasons. We could add a switch saying 'for this DB just translate MAP into a function and run it directly'
Switch to pouchdb-find - This is the 'future' of indexes for PouchDB. If I'm following it correctly it doesn't need vm.runInNewContext or evalFunction because it doesn't use Javascript but a more restricted language that doesn't have the security issues of map/reduce. But I'm not sure if it's stable enough yet for us to use.

yaronyg commented 7 years ago

I talked with Nolan and there are a couple of options for us here. Using the list above:

If we can do this, this will be fastest. But it is tricky since the key lookup is pretty dumb.
We talked about the existing perf of the view code and there is a possibility here for two PRs. One PR is to store the eval output of a design doc so we don't have to do it too often. It's not clear how big a win that is. But the other PR is a switch to effectively disable vm.runInNewContext. But that is a scary switch since it has security implications. In Thali's case we would need to put a filter on our pull replications so we don't ever pull design docs. Honestly, we should be doing that anyway.
This is by far the best bet if we can make it work. Nolan said that the existing code is solid but it only implements about 80% of the Mango language which itself can only do about 90% of what you can do with map/reduce. So if we can get the index to work with pouchdb-find that is the easiest and most performant solution.

czyzm commented 7 years ago

I modified our test app to add pouchdb-find tests: TestViewsMobile

The results of query/find execution are as follow for 6000 documents: WebView with view: 20s WebView with index (find): 20s Jxcore with view: 300s Jxcore with index (find): 20s

As you can see the for WebView there is no difference - building the index for the first usage is similar like building the view. However, for jxcore we are getting significant improvement - we are getting the same result as in WebView.

czyzm commented 7 years ago

Quick update: Using pouchdb-find instead of views works for us so we are fine at this moment.

thaliproject / Thali_CordovaPlugin

PouchDB: evalFunctionInVm performance issue. #1627