zhujiangang / wikixmlj

Automatically exported from code.google.com/p/wikixmlj
SAX parser code insanely slow #9

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Concatenating the strings in SAXPageCallbackHandler with concat is amazingly slow.

Changing it to simply use a StringBuffer:

    private StringBuffer currentWikitext;
    private StringBuffer currentTitle;

and changing the concat calls to append gives, I dunno, maybe a 40x increase in speed?
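
For anyone reading along, here is a minimal sketch of the pattern, not the actual wikixmlj code. The class name `PageTextHandler`, its getters, the `parse` helper, and the element names `page`, `title`, and `text` are assumptions made for illustration. It uses `StringBuilder`; `StringBuffer` offers the same `append` calls. A SAX parser may deliver one element's text across several `characters` calls, and `String.concat` copies the whole accumulated string each time, so appending to a buffer avoids that repeated copying.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration; not the real SAXPageCallbackHandler.
class PageTextHandler extends DefaultHandler {
    private StringBuilder currentTitle;     // text seen inside <title>
    private StringBuilder currentWikitext;  // text seen inside <text>
    private String currentTag;              // element currently open, if any
    private String lastTitle;
    private String lastWikitext;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        currentTag = qName;
        if ("page".equals(qName)) {
            // Fresh buffers for every page.
            currentTitle = new StringBuilder();
            currentWikitext = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (currentTitle == null) {
            return; // not inside a <page>
        }
        // Append the chunk in place instead of building a new String each call.
        if ("title".equals(currentTag)) {
            currentTitle.append(ch, start, length);
        } else if ("text".equals(currentTag)) {
            currentWikitext.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("page".equals(qName)) {
            // Materialize the Strings once per page.
            lastTitle = currentTitle.toString();
            lastWikitext = currentWikitext.toString();
            currentTitle = null;
            currentWikitext = null;
        }
        currentTag = null;
    }

    String getLastTitle() { return lastTitle; }
    String getLastWikitext() { return lastWikitext; }

    // Convenience helper (an assumption): parse an XML string with this handler.
    static PageTextHandler parse(String xml) throws Exception {
        PageTextHandler handler = new PageTextHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        PageTextHandler h = parse(
                "<mediawiki><page><title>Foo</title>"
                + "<revision><text>Some wikitext</text></revision></page></mediawiki>");
        System.out.println(h.getLastTitle() + ": " + h.getLastWikitext());
    }
}
```

The sketch only keeps the last page it saw; a real handler would hand each finished page to a callback in `endElement`.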

Original issue reported on code.google.com by ianupri...@gmail.com on 2 Apr 2010 at 12:18

GoogleCodeExporter commented 9 years ago
This is now fixed. Thanks to David Andrzejewski for the patch. I will do some 
profiling later and post how much speedup we gained, but it looks like a 
sensible thing to do.

Original comment by delip...@gmail.com on 12 Nov 2010 at 1:39

GoogleCodeExporter commented 9 years ago
Shouldn't you use a StringBuilder instead? Do you really need the 
synchronisation that StringBuffer provides?
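
To illustrate the point, a small sketch (the class `BuilderVsBuffer` and its method names are made up for this example): the two classes expose the same `append`/`toString` calls, so swapping one for the other is a type change only. `StringBuffer` synchronizes its methods, while `StringBuilder` does not, which is fine when a buffer is only ever touched by the one thread running the parse, as is typically the case for SAX callbacks.

```java
// Hypothetical comparison for illustration only.
class BuilderVsBuffer {
    // Synchronized on every append; safe to share between threads.
    static String viaBuffer(String[] chunks) {
        StringBuffer sb = new StringBuffer();
        for (String chunk : chunks) {
            sb.append(chunk);
        }
        return sb.toString();
    }

    // Same calls without locking; intended for single-threaded use.
    static String viaBuilder(String[] chunks) {
        StringBuilder sb = new StringBuilder();
        for (String chunk : chunks) {
            sb.append(chunk);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] chunks = {"ab", "cd", "ef"};
        System.out.println(viaBuffer(chunks).equals(viaBuilder(chunks)));
    }
}
```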

Original comment by christop...@gmail.com on 30 Jul 2011 at 11:11

GoogleCodeExporter commented 9 years ago
We are already using StringBuilder. See 
http://code.google.com/p/wikixmlj/source/browse/trunk/src/edu/jhu/nlp/wikipedia/SAXPageCallbackHandler.java

Original comment by delip...@gmail.com on 30 Jul 2011 at 9:58