Related to #2692 ?
Yeah, similar, but #2692 is being solved by exporting fewer logs, which won't work here.
I'm going to take a crack at solving this with streaming
@estellecomment I've created https://github.com/medic/medic-api/pull/99 as a POC for streaming XML results. It would be good if you could test it locally to make sure that reduces the memory usage enough for that DB you have.
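To make the idea in that PR concrete: instead of building the whole XML document in memory and sending it in one shot, each record gets written to the HTTP response as soon as it's serialised, so memory stays roughly flat regardless of row count. A minimal sketch of the approach (rowEmitter and toXmlRow are hypothetical stand-ins, not the actual medic-api code):

function exportStreamed(res, rowEmitter) {
  res.setHeader('Content-Type', 'text/xml');
  res.write('<?xml version="1.0"?>\n<messages>\n');
  rowEmitter.on('row', function(row) {
    // serialise and flush one row at a time instead of accumulating
    res.write(toXmlRow(row) + '\n');
  });
  rowEmitter.on('end', function() {
    res.write('</messages>\n');
    res.end();
  });
}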
Also, how exactly were you testing it? Through a UI, or curling api?
Through curl:

curl -X POST http://admin:pass@localhost:5988/api/v1/export/messages
-> works, both on master and on your branch.

curl -X POST http://admin:pass@localhost:5988/api/v1/export/messages?format=xml
-> out-of-memory on master; on your branch, Server error:

UNCAUGHT EXCEPTION!
Server error:
{ [TypeError: this._implicitHeader is not a function]
  domain:
   Domain {
     domain: null,
     _events: { error: [Function] },
     _eventsCount: 1,
     _maxListeners: undefined,
     members: [] },
  domainThrown: true }
undefined

So you'll need to fix format=xml to get my before-after input!
(I'm getting 91301 messages, that's a lot of messages)
Edit : same result with GET. I'm using POST because that's what webapp's Export page is using.
Interesting, so that's a bug in the XML framework that isn't happening when we run the tests, even though it's going through the same code paths…
Server error:
{ TypeError: this._implicitHeader is not a function
    at XMLDocumentCB.OutgoingMessage.write [as onDataCallback] (_http_outgoing.js:433:10)
    at XMLDocumentCB.module.exports.XMLDocumentCB.onData (/Users/scdf/Code/Medic/medic-api/node_modules/xmlbuilder/lib/XMLDocumentCB.js:302:19)
    at XMLDocumentCB.module.exports.XMLDocumentCB.declaration (/Users/scdf/Code/Medic/medic-api/node_modules/xmlbuilder/lib/XMLDocumentCB.js:190:12)
    at XMLDocumentCB.module.exports.XMLDocumentCB.dec (/Users/scdf/Code/Medic/medic-api/node_modules/xmlbuilder/lib/XMLDocumentCB.js:335:19)
    at /Users/scdf/Code/Medic/medic-api/controllers/export-data.js:454:8
I'll dig into it, seems weird
OK wow check this out: https://github.com/medic/medic-api/pull/99/commits/37c2de1eb7496b58aa15b6da9fd6217ad05d93b3
One of those times I realise I don't really understand JavaScript at all… anyway @estellecomment give that another crack, it now works for me locally.
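For anyone hitting the same error: the stack trace shows res.write being invoked as xmlbuilder's onDataCallback, so inside write() `this` is the XMLDocumentCB rather than the HTTP response, and this._implicitHeader doesn't exist on it. It's the classic unbound-method bug. I haven't inlined the commit here, but the fix is presumably of this shape (res is the response from the route handler):

var builder = require('xmlbuilder');

// Broken: passing the methods directly detaches them from `res`,
// so xmlbuilder calls them with the wrong `this`:
//   var doc = builder.begin(res.write, res.end);

// Fixed: wrap (or .bind()) so `this` stays pointed at the response.
var doc = builder.begin(
  function(chunk) { res.write(chunk); },
  function() { res.end(); }
);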
Booooom! But much further along in the process this time: got 59500 / 91301 rows.
::1 - admin [02/Sep/2016:13:14:13 +0000] "GET /api/v1/export/messages?format=xml HTTP/1.1" - - "-" "curl/7.43.0"
<--- Last few GCs --->
224276 ms: Scavenge 1398.0 (1455.3) -> 1398.0 (1455.3) MB, 0.8 / 0 ms (+ 2.3 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
225320 ms: Mark-sweep 1398.0 (1455.3) -> 1398.0 (1455.3) MB, 1044.4 / 0 ms (+ 3.6 ms in 2 steps since start of marking, biggest step 2.3 ms) [last resort gc].
226365 ms: Mark-sweep 1398.0 (1455.3) -> 1398.0 (1455.3) MB, 1045.1 / 0 ms [last resort gc].
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x206170a44a49 <JS Object>
1: replace(aka replace) [native string.js:~152] [pc=0x1d88423c0e18] (this=0x26470193b691 <String[29]: 30, Dec 2015, 19:56:23 +01:00>,P=0x2a53d7cb6469 <JS RegExp>,Q=0x1f3851620d99 <String[5]: &>)
2: text [/Users/estellecomment/dev/medic/medic-webapp-2/api/node_modules/xmlbuilder/lib/XMLDocumentCB.js:~121] [pc=0x1d8842374ae3] (this=0x1a2f5d3377c9 <an XMLDocumentCB with map 0x388456048...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Abort trap: 6
OK so I'll move on to the next optimisation, which will be batching view calls to CouchDB instead of getting everything in one big go.
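The usual shape of that batching (a sketch with hypothetical design/view names, assuming a nano-style db.view(design, view, params, callback)): fetch a fixed-size page, process it, then recurse for the next page, so only one page of rows is ever held at once.

function exportInBatches(db, onRow, onDone) {
  var BATCH = 1000;
  var fetchPage = function(skip) {
    db.view('medic', 'data_records', { limit: BATCH, skip: skip }, function(err, body) {
      if (err) {
        return onDone(err);
      }
      body.rows.forEach(onRow);       // only this page is in memory
      if (body.rows.length < BATCH) {
        return onDone();              // last page reached
      }
      fetchPage(skip + BATCH);
    });
  };
  fetchPage(0);
}

(Paging with startkey/startkey_docid instead of skip scales better, since CouchDB's skip is linear in the number of rows skipped, but the idea is the same.)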
Hrm, it should be streaming from the couchdb list function and we can just proxy that?
I might be hallucinating about old code...
Yeah I think you might be. The existing stuff in master very linearly does this: query everything from the view in one big go, build the whole export in memory, and then respond.
Batching the queries is going to be interesting, because Lucene is a potential option there as well.
Hrm, this might be a job for a couchdb list function that streams results, where each row can be parsed as it arrives... as opposed to querying a view directly?
Maybe add a list function to stream CSV from couchdb and transform it in medic-api.
Why? As in, what is the benefit of doing this over simply calling the view again and again with a different offset?
Might be more straightforward code-wise and more efficient; it seems like something the database should provide.
Just a suggestion I thought worth exploring if batching is difficult.
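For reference, a list function runs inside CouchDB over a view and streams rows out one at a time via getRow()/send(), so the database never materialises the whole result. A minimal CSV-flavoured sketch (design doc name and fields are illustrative):

// In a design doc, invoked as
// GET /medic/_design/medic/_list/messages-csv/data_records
function(head, req) {
  start({ headers: { 'Content-Type': 'text/csv' } });
  send('id,reported_date,message\n');
  var row;
  while ((row = getRow())) {
    // one row at a time; nothing is buffered server-side
    send([row.id, row.value.reported_date, row.value.message].join(',') + '\n');
  }
}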
FWIW this is basically exactly what we want: https://github.com/dominictarr/JSONStream
The downside is that some of our export generation is more complicated than I thought, and I'm not actually convinced they can all be streamed, at least not without quite a bit of re-writing. Still investigating.
i.e. the forms export has an overview tab that requires mapping over the whole view result first, and it has to be the "first" XML result to be exported (i.e. you're streaming out tab by tab, and the summary tab comes first).
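For the simple cases, JSONStream would let api pipe CouchDB's view response through a streaming parser and handle one row at a time, along these lines (URL is illustrative; request is any stream-returning HTTP client):

var request = require('request');
var JSONStream = require('JSONStream');

request('http://localhost:5984/medic/_design/medic/_view/data_records')
  .pipe(JSONStream.parse('rows.*'))   // emit each element of the "rows" array
  .on('data', function(row) {
    // each view row arrives individually; the full payload is never held
    console.log(row.id);
  });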
@estellecomment can you pull that branch and give your test command another crack? I very much doubt I've fixed it but actually fixing it looks like a larger piece of work than I was expecting, so I thought I'd try a silly random hack (that language optimisers hate!)
Nope! Exploded at 62.6MB again. 59496 / 91301 messages.
$ curl -X GET http://admin:pass@localhost:5988/api/v1/export/messages?format=xml > tmp3.xml
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 62.6M    0 62.6M    0     0   274k      0 --:--:--  0:03:53 --:--:-- 10290
curl: (18) transfer closed with outstanding read data remaining
API says:
::1 - admin [05/Sep/2016:15:53:22 +0000] "GET /api/v1/export/messages?format=xml HTTP/1.1" - - "-" "curl/7.43.0"
<--- Last few GCs --->
295196 ms: Scavenge 1398.1 (1455.3) -> 1398.1 (1455.3) MB, 0.8 / 0 ms (+ 2.3 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
296182 ms: Mark-sweep 1398.1 (1455.3) -> 1398.1 (1455.3) MB, 986.0 / 0 ms (+ 3.6 ms in 2 steps since start of marking, biggest step 2.3 ms) [last resort gc].
297161 ms: Mark-sweep 1398.1 (1455.3) -> 1398.0 (1455.3) MB, 979.5 / 0 ms [last resort gc].
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x249aeb244a49 <JS Object>
1: /* anonymous */(aka /* anonymous */) [/Users/estellecomment/dev/medic/medic-webapp-2/api/controllers/export-data.js:485] [pc=0x152e26a3c30e] (this=0x249aeb204139 <undefined>,cell=0x22b7226dd129 <String[32]: 4d803109b977d8a88350eadee42d56da>)
2: arguments adaptor frame: 3->1
3: InnerArrayForEach(aka InnerArrayForEach) [native array.js:~942] [pc=0x152e26ad5cc5] (this=0x249aeb204139...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Abort trap: 6
So I talked to Alex and Gareth about this one: it's quite a lot of effort to re-write the API cleanly to stream from A to Z, which is really the only reliable way of fixing this issue without caveats and hacks.
These APIs are very "inherited", in the sense that it's not clear how many of the various file formats and output types are being actively used. No client has complained about these exports crashing, so you could conclude that in the field they work well enough, and where they wouldn't, they simply aren't used.
So I'm going to merge what I did, since it will help with reducing memory usage in general, and we're going to leave it at that for now. If a client has trouble in the field we can mitigate it temporarily by increasing the allowed Node.js memory usage, and use their issue as an opportunity to look at this in more detail. I'd recommend building a "2.0" API that supports streaming from the start, and moving over to that model the exports that are provably used.
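(For reference, raising the Node heap ceiling is just a process flag, e.g. node --max-old-space-size=4096 server.js for roughly 4 GB instead of the ~1.4 GB default; server.js here stands in for however the api process is actually launched.)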
hi frand @browndav, U is coot
Please close or schedule before the end of this sprint. See triaging old issues.
Nice Snek
Closing.
I still don't understand why medic-api can't simply stream the output from the list function in couchdb, which is already streaming. Originally this worked just fine because we called the couchdb list function directly; it only started to fail when we added the medic-api layer incorrectly (not streaming).
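In that model medic-api would act as a thin proxy and pipe CouchDB's already-streaming response straight through, something like this (URL illustrative, auth and error handling omitted):

var http = require('http');

module.exports = function(req, res) {
  http.get('http://localhost:5984/medic/_design/medic/_list/messages-csv/data_records',
    function(couchRes) {
      res.writeHead(couchRes.statusCode, couchRes.headers);
      couchRes.pipe(res);   // backpressure-aware, constant memory
    });
};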
Tried for a couple of minutes to export 878 messages, then the API ran out of memory and crashed. (I can export forms and contacts just fine. This is on a clone of CA2, upgraded.)