Open SCdF opened 5 years ago
Converting a set of indexes to Mango was evaluated for feasibility and impact. The first round of evaluation covered the six MapReduce views that are warmed during bootstrap.
MapReduce View | Feasibility |
---|---|
medic-user/read | We use this query with a reducer. Converting this to Mango has blocking bandwidth impacts for online users. This would be a good index to lazy load (#5859). |
medic-client/contacts_by_type | The sorting logic in this index cannot be reproduced in Mango. Users would need to fetch all documents and perform an in-memory sort. This has blocking bandwidth impacts for online users and poor (unmeasured) characteristics for offline users. |
medic-client/data_records_by_type | We use this query with a reducer. Converting this to Mango has blocking bandwidth impacts for online users. This would be a good index to lazy load (#5859). |
medic-client/reports_by_validity | I believe it is a bug that we build this index during startup (#5866) |
medic-client/forms | Feasible |
medic-client/docs_by_id_lineage | This MapReduce view allows for selecting a document based on the content of another document. This is not possible via Mango queries. |
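The `docs_by_id_lineage` limitation comes from CouchDB's linked-document feature: when a view row's value contains an `_id`, querying with `include_docs: true` returns that other document rather than the one that was mapped. A Mango selector can only match on, and return, the document's own fields. A minimal sketch of such a map function is below. The nested `parent` structure and the key layout are assumptions for illustration, not the actual medic view source, and `emit` is passed as a parameter here only so the sketch runs outside CouchDB.

```javascript
// Hypothetical lineage-style map function (not the real medic-client view).
// Assumes each doc nests its ancestors as doc.parent.parent...
function lineageMap(doc, emit) {
  let current = doc;
  let depth = 0;
  while (current && current._id) {
    // Emitting { _id } as the value makes include_docs: true return the
    // linked document (the ancestor), not the document being mapped.
    emit([doc._id, depth], { _id: current._id });
    current = current.parent;
    depth += 1;
  }
}

// Collect the rows emitted for one sample document.
const rows = [];
lineageMap(
  { _id: 'patient', parent: { _id: 'clinic', parent: { _id: 'district' } } },
  (key, value) => rows.push({ key, value })
);
console.log(rows);
```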
Execution Times - Measured via 100x tight loop
Index | Device | MapReduce (ms) | Mango (ms) | Delta |
---|---|---|---|---|
'type' field only | Tecno F1 | 12,403 | 14,158 | +1,755 (+14.1%) |
'type' field only | Desktop | 872 | 944 | +72 (+8.3%) |
'type' + '_attachments.xml' | Desktop | 872 | 959 | +83 (+9.5%) |
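For reference, the Delta column is the Mango figure minus the MapReduce figure, with the percentage taken relative to MapReduce. A small sketch of that arithmetic follows; the helper name `formatDelta` is made up for this example.

```javascript
// Format the change from a baseline measurement (MapReduce) to a new one
// (Mango) as "+delta (+percent%)", with the percentage relative to the baseline.
function formatDelta(baseline, measured) {
  const delta = measured - baseline;
  const percent = (Math.abs(delta) / baseline) * 100;
  const sign = delta >= 0 ? '+' : '-';
  const abs = Math.abs(delta).toLocaleString('en-US');
  return `${sign}${abs} (${sign}${percent.toFixed(1)}%)`;
}

console.log(formatDelta(12403, 14158)); // +1,755 (+14.1%)
console.log(formatDelta(872, 944));     // +72 (+8.3%)
```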
Scripts:
// MapReduce: query the forms view 100 times in sequence and log the total time (ms).
(() => {
  const start = performance.now();
  let chain = Promise.resolve();
  for (let i = 0; i < 100; i++) {
    chain = chain.then(() => PouchDB('medic-user-ac1').query('medic-client/forms', { include_docs: true }));
  }
  chain.then(() => console.log('MapReduce Execution Time', performance.now() - start));
})();
// Mango: run the equivalent find() 100 times in sequence and log the total time (ms).
(() => {
  const start = performance.now();
  let chain = Promise.resolve();
  for (let i = 0; i < 100; i++) {
    chain = chain.then(() => PouchDB('medic-user-ac1').find({ selector: { type: 'form', '_attachments.xml': { $exists: true } } }));
  }
  chain.then(() => console.log('Mango Execution Time', performance.now() - start));
})();
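The two loops above differ only in the call being timed, so they could be folded into one helper. This is a sketch rather than the code that produced the numbers: `timeSequential` is a made-up name, and unlike the original scripts the commented usage reuses a single database handle.

```javascript
// Run an async operation `iterations` times, one after another,
// and resolve with the total elapsed time in milliseconds.
async function timeSequential(label, iterations, operation) {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    await operation(i);
  }
  const elapsed = performance.now() - start;
  console.log(`${label} Execution Time`, elapsed);
  return elapsed;
}

// Usage in the browser console, assuming the same database as above:
// const db = new PouchDB('medic-user-ac1');
// timeSequential('MapReduce', 100, () => db.query('medic-client/forms', { include_docs: true }));
// timeSequential('Mango', 100, () => db.find({ selector: { type: 'form' } }));
```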
Build PouchDB Index - Based on sample of 3 measures
Device | MapReduce (ms) | Mango (ms) | Delta |
---|---|---|---|
Tecno F1 | 13,895 | 18,022 | +4,127 (+29.7%) |
Desktop | 1,124 | 2,096 | +972 (+86%) |
Scripts:
// Mango: time index creation, then delete the index so the run can be repeated.
(() => {
  const start = performance.now();
  PouchDB('medic-user-ac1').createIndex({ index: { fields: ['type'] } })
    .then(idx => {
      console.log('Index', idx, performance.now() - start);
      return PouchDB('medic-user-ac1').deleteIndex({ ddoc: idx.id, name: idx.name });
    }).then(console.log);
})();
// MapReduce: the first query builds the view index; delete its IndexedDB database afterwards.
(function() {
  const start = performance.now();
  let chain = Promise.resolve();
  chain = chain.then(() => PouchDB('medic-user-ac1').query('medic-client/forms', { limit: 0 }));
  chain.then(() => {
    console.log('MapReduce', performance.now() - start);
    window.indexedDB.deleteDatabase('_pouch_medic-user-ac1-mrview-bc4e9efc3baf76a2da15c82a700c0908');
  });
})();
Other Metrics
Metric | MapReduce | Mango | Delta |
---|---|---|---|
Index disk use PouchDB | 712 | 324 | -388 (-54%) |
Index heap use PouchDB | 0 | 0 | 0 |
Bandwidth for online users CouchDB | 11469 | 10058 | -1411 (-12%) |
Inbox.js script size with pouchdb-find | 2,962,350 bytes | 3,013,512 bytes | +51.1 kB (+1.7%) |
Measure IndexedDB Disk Use (Chrome 70 only): await window.navigator.storage.estimate()
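A sketch of how the estimate can be turned into a before/after measurement around an index build. `measureStorageDelta` is a made-up helper name; note that `storage.estimate()` reports an approximate figure for the whole origin, not only for IndexedDB, so the result is indicative rather than exact.

```javascript
// Measure how much origin storage an action consumes, e.g. building an index.
// `estimate` defaults to the browser StorageManager API; it is injectable
// so the helper can also be exercised outside a browser.
async function measureStorageDelta(action, estimate = () => navigator.storage.estimate()) {
  const before = await estimate();
  await action();
  const after = await estimate();
  return after.usage - before.usage; // bytes, as reported by the browser
}

// Usage in the browser console, assuming the same database as above:
// const db = new PouchDB('medic-user-ac1');
// measureStorageDelta(() => db.createIndex({ index: { fields: ['type'] } })).then(console.log);
```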
Bandwidth Scripts:
curl 'http://admin:pass@localhost:5984/medic/_design/medic-client/_view/forms?include_docs=true' -w '%{size_download}'
curl 'http://admin:pass@127.0.0.1:5984/medic/_find' -X POST -H 'Content-Type: application/json' --data '{"selector": {"type": "form", "_attachments.xml":{"$exists":true}}}' -w '%{size_download}'
Based on these findings, Mango is not particularly well suited to helping with the WebApp's bootstrapping. It is likely better suited for use outside of the WebApp's performance hot paths. One potentially fruitful option is to use it in API/Sentinel. Within the WebApp, the filtered search indexes should be investigated, as well as less hot code paths like editing user settings (`doc_by_type`).
To help scope down 3.7, and because IDBNext is believed likely to change the performance characteristics of Mango and indexing, it was recommended that this investigation continue after the IDBNext work.
Chatted to Kenn. Things to try in the future:
`doc_by_type` maybe) and see if this is an aberration or not
Deferring to 3.9.0
Let's wait for IDBNext to land and then run this again.
We currently query for data records in PouchDB / CouchDB via MapReduce queries.
Each MapReduce query creates its own index that needs to be kept up to date as data changes.
Mango splits the creation of the index from the use of it in a query. So, if we can convert many of our MapReduce queries to Mango, we can reduce the number of indexes that need to be generated.
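To illustrate the split, here is a rough sketch in which one Mango index on `type` serves several different queries. It assumes `db` is a PouchDB instance with the pouchdb-find plugin loaded; `queryByTypes` is a made-up helper name.

```javascript
// Create a single Mango index on `type`, then reuse it for several queries.
// `db` is assumed to be a PouchDB instance with pouchdb-find available.
async function queryByTypes(db, types) {
  // The index is declared once, independently of any particular query.
  // Calling createIndex again for an existing index reports result: 'exists'.
  await db.createIndex({ index: { fields: ['type'] } });
  const results = {};
  for (const type of types) {
    // Each selector below can be served by the same index.
    const { docs } = await db.find({ selector: { type } });
    results[type] = docs;
  }
  return results;
}

// Usage, with document types chosen purely for illustration:
// queryByTypes(new PouchDB('medic-user-ac1'), ['form', 'person']).then(console.log);
```

With MapReduce, each of those lookups would typically be its own view with its own index to build and keep current.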
An example of looking at this for startup is here: https://github.com/medic/medic/pull/5264 (along with some other changes)
We need to be careful and measure performance, in both offline and online situations, both for query speed and individual index generation time, as well as network costs in the online scenario, and balance that against the fact that Mango allows for fewer indexes. It's complicated!
One core difference is that Mango queries are always equivalent to `include_docs: true` in relation to how CouchDB pulls data off disk into memory. One example of how this affects things is that a MapReduce view which doesn't use `include_docs` will query much faster than the equivalent Mango query when running in CouchDB (PouchDB should be equivalent).
The core goal is to reduce the count of indexes generated locally, while making sure we do not accidentally degrade performance elsewhere.