Closed. GoogleCodeExporter closed this issue 9 years ago.
Could you provide some more information on how this is handled in the source?
I might be able to contribute improvements. From a brief inspection of the
source, it looks like "obtainXMLResult" does most of the work. If it builds a
DOM object for this, then a streaming parser should be faster and use less
memory.
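To illustrate the DOM-versus-stream point, here is a minimal sketch in Python. It is not the project's code: the function names, the `run` tag, and the sample document are all made up for illustration. The streaming version discards each element once it has been handled, so memory use does not grow with the size of the document in the way a full tree does.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real result XML is not shown in this thread.
SAMPLE = b"<results><run id='1'/><run id='2'/><run id='3'/></results>"


def count_elements_dom(source, tag):
    """Build the whole tree in memory first, then walk it."""
    root = ET.parse(source).getroot()
    return sum(1 for _ in root.iter(tag))


def count_elements_streaming(source, tag):
    """Handle each element as soon as its end tag is read, then discard it."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Drop the element's text, attributes and children once processed.
            elem.clear()
    return count


dom_count = count_elements_dom(io.BytesIO(SAMPLE), "run")
stream_count = count_elements_streaming(io.BytesIO(SAMPLE), "run")
```

Both functions return the same count on the sample. Any difference would only show up in memory use and speed on large inputs, which would need to be measured on real data.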
Original comment by charlie....@clark-consulting.eu
on 4 Jan 2013 at 6:18
Parsing is now 100x faster.
The issue was in how jobs were divided among multiple parsing scripts. That is
done using a modulo that maps each job to a parse task number. Unfortunately,
the original code did NOT zero-base the task number, so two processes were
handling the same jobs. This caused duplicate records, which required a REPLACE
(rather than INSERT). The REPLACE caused all the other processes to lock for
~6 seconds. The fix was to just do INSERT and ignore the second duplicate.
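For anyone unfamiliar with the modulo scheme, here is a minimal Python sketch of the zero-based mapping, plus a sanity check that every job is claimed by exactly one task. The original code is not shown in this thread, so the function names and the job ids are hypothetical, and the one-based case below only shows that the mapping stops being one-to-one. It is not a reproduction of the exact bug described above.

```python
def jobs_for_task(job_ids, task_index, num_tasks):
    """Return the jobs owned by one parse task; task_index must be zero-based."""
    if not 0 <= task_index < num_tasks:
        raise ValueError("task_index must be in range(num_tasks)")
    return [job for job in job_ids if job % num_tasks == task_index]


def ownership_counts(job_ids, task_indexes, num_tasks):
    """Count how many tasks claim each job; a correct split gives 1 everywhere."""
    counts = {job: 0 for job in job_ids}
    for task_index in task_indexes:
        for job in job_ids:
            if job % num_tasks == task_index:
                counts[job] += 1
    return counts


NUM_TASKS = 4
JOB_IDS = list(range(1, 21))  # hypothetical job ids 1..20

# Zero-based task numbers 0..3: each job has exactly one owner.
zero_based = ownership_counts(JOB_IDS, range(NUM_TASKS), NUM_TASKS)

# One-based task numbers 1..4: no longer a clean one-to-one split
# (in this sketch, jobs with remainder 0 are left without an owner).
one_based = ownership_counts(JOB_IDS, range(1, NUM_TASKS + 1), NUM_TASKS)
unowned = [job for job, n in one_based.items() if n == 0]
```

Running a check like `ownership_counts` before starting the workers is a cheap way to catch overlaps or gaps in the split, instead of finding them later as duplicate rows or lock contention.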
Original comment by stevesou...@gmail.com
on 9 Jan 2013 at 8:20
Original issue reported on code.google.com by
stevesou...@gmail.com
on 19 Dec 2012 at 5:24