Open GoogleCodeExporter opened 9 years ago
Revision 28/29
Faster hyphenateWord() by applying some tweaks. (It now takes around 80% of the previous time.)
Original comment by mathiasn...@gmail.com
on 4 Jan 2008 at 9:08
I (re)tried to implement a packed trie in JavaScript as proposed by Liang.
A trie could save a lot of lookups, BUT:
* It blows up the code: I had to implement a whole new Trie class, with nodes and a find function -> the script takes longer to load.
* It blows up the data structure: each node needs a field for the higher branch, the lower branch and the equal branch. This costs more than is saved by reusing nodes (e.g. the first two nodes of "_aal" and "_abas" are the same, so 2 chars could be saved, but it takes a lot more to implement this!) -> the patterns take longer to load.
Conclusion: it sucks! A lot of work done for nothing!
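For illustration, the node layout described above (one field each for the lower, equal and higher branch) matches a ternary search tree. The following is only a minimal sketch under that assumption, not the code that was actually tried; the names createNode, insert and find and the stored values are hypothetical, and keys are assumed to be non-empty.

```javascript
// Hypothetical ternary search tree node: one character plus three branches.
function createNode(ch) {
  return { ch: ch, lo: null, eq: null, hi: null, value: null };
}

// Insert key (non-empty string) with its value; returns the (possibly new) subtree root.
function insert(node, key, value, index) {
  const i = index || 0;
  const ch = key.charAt(i);
  const current = node || createNode(ch);
  if (ch < current.ch) {
    current.lo = insert(current.lo, key, value, i);
  } else if (ch > current.ch) {
    current.hi = insert(current.hi, key, value, i);
  } else if (i < key.length - 1) {
    current.eq = insert(current.eq, key, value, i + 1);
  } else {
    current.value = value; // last character of the key: store the value here
  }
  return current;
}

// Look up key; returns the stored value or null if the key is unknown.
function find(node, key) {
  let current = node;
  let i = 0;
  while (current !== null) {
    const ch = key.charAt(i);
    if (ch < current.ch) {
      current = current.lo;
    } else if (ch > current.ch) {
      current = current.hi;
    } else if (i < key.length - 1) {
      current = current.eq;
      i += 1;
    } else {
      return current.value;
    }
  }
  return null;
}

// "_aal" and "_abas" share their first two nodes ("_" and "a").
let root = null;
root = insert(root, '_aal', 'first pattern');
root = insert(root, '_abas', 'second pattern');
console.log(find(root, '_abas'));
```

Even in this small form, every node carries three extra references, which is the overhead complained about above.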
Original comment by mathiasn...@gmail.com
on 7 Feb 2008 at 8:59
I did a little speed comparison: /needle/.test('haystack') vs 'haystack'.indexOf('needle')!=-1
The regex way is up to three times slower than the indexOf way.
If the additional regex functionality isn't needed, use indexOf; it's uglier but faster.
(I'm speaking of nanoseconds here! I try everything... ;-)
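A comparison like this could be run roughly as follows. This is a sketch only: the helper timeIt, the sample haystack and the iteration count are assumptions, and the ratio will differ between engines and versions.

```javascript
// Hypothetical micro-benchmark helper: runs fn repeatedly and reports elapsed time.
function timeIt(label, fn, iterations) {
  const start = Date.now();
  let hits = 0;
  for (let i = 0; i < iterations; i += 1) {
    if (fn()) {
      hits += 1; // count matches so the call cannot be optimized away entirely
    }
  }
  return { label: label, ms: Date.now() - start, hits: hits };
}

const haystack = 'a haystack with a needle in it';
const iterations = 1000000;

const viaRegex = timeIt('regex', function () {
  return /needle/.test(haystack);
}, iterations);

const viaIndexOf = timeIt('indexOf', function () {
  return haystack.indexOf('needle') !== -1;
}, iterations);

console.log(viaRegex, viaIndexOf);
```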
Original comment by mathiasn...@gmail.com
on 7 Feb 2008 at 9:07
For earlier script execution and preventing redraws, see:
http://code.google.com/p/hyphenator/issues/detail?id=11
Original comment by mathiasn...@gmail.com
on 7 Feb 2008 at 9:13
Did some performance tuning in Revision 60:
saved:
FF 3b2: 27%
Safari 3: 16%
Webkit NB: 8%
Opera: 13%
IE7: 27%
IE6: 20%
Original comment by mathiasn...@gmail.com
on 11 Feb 2008 at 11:39
I changed the format of the pattern files from JSON to a string.
Pattern files are now much smaller to download. It takes about 200ms in IE on my Intel MacBook Pro 2.2GHz to convert them to an object. Other browsers are faster.
I had to change a lot.
(Revision 188)
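The conversion step might look roughly like this. It is a sketch under assumptions: the pattern string is taken to be a space-separated list of Liang-style patterns (letters with interleaved digits), and the function name convertPatterns and the key/value layout of the resulting object are hypothetical, not taken from Revision 188.

```javascript
// Assumption: patterns are separated by single spaces, e.g. "hy3ph he2n".
// The resulting object maps the letters of each pattern to the full pattern.
function convertPatterns(patternString) {
  const result = {};
  const patterns = patternString.split(' ');
  for (const pattern of patterns) {
    if (pattern === '') {
      continue; // tolerate leading, trailing or doubled spaces
    }
    const key = pattern.replace(/\d/g, ''); // strip the digits to get the lookup key
    result[key] = pattern;
  }
  return result;
}

const converted = convertPatterns('hy3ph he2n hena4');
console.log(converted);
```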
Original comment by mathiasn...@gmail.com
on 27 Oct 2008 at 4:42
I sped up the hyphenateWord function a little in rev 242 by removing an 'if' statement:
if (!!pat) {
Original comment by mathiasn...@gmail.com
on 5 Dec 2008 at 7:09
With rev 275 I added a cache for hyphenated words and a switch to turn caching off (caching is on by default).
This improves overall performance, especially for long texts, where the probability of repeated words is high.
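The idea can be sketched as a wrapper around the hyphenation function. This is an illustration only; createCachedHyphenator, the enableCache switch and the stand-in hyphenation function are hypothetical names, not the ones used in rev 275.

```javascript
// Hypothetical wrapper: remembers the result for every word seen so far.
function createCachedHyphenator(hyphenateWord, enableCache) {
  const cache = new Map();
  return function (word) {
    if (!enableCache) {
      return hyphenateWord(word); // switch is off: always recompute
    }
    if (cache.has(word)) {
      return cache.get(word); // repeated word: no hyphenation work needed
    }
    const result = hyphenateWord(word);
    cache.set(word, result);
    return result;
  };
}

// Stand-in for the real hyphenation routine, counting how often it runs.
let calls = 0;
function fakeHyphenateWord(word) {
  calls += 1;
  return word.split('').join('|');
}

const cached = createCachedHyphenator(fakeHyphenateWord, true);
cached('text');
cached('text');
console.log(calls);
```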
Original comment by mathiasn...@gmail.com
on 3 Jan 2009 at 5:43
I had an idea:
Pattern files could probably be minified:
Sort the patterns by their length.
Patterns of the same length are concatenated (no spaces necessary) and saved in an object ({'1':'patterns of length 1','2':'patterns of length 2',...}).
Later they are split at the indicated length and then processed as usual.
en.js has 4446 spaces in the pattern string; that is about 4KB (~14%).
de.js has 12585 spaces (12.5KB <=> ~15% of the file size).
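Splitting such a length-keyed object back into single patterns could look like the sketch below. The function name expandPatterns and the sample patterns are made up for illustration.

```javascript
// grouped maps a pattern length (as a string key) to all patterns of that
// length, concatenated without separators.
function expandPatterns(grouped) {
  const patterns = [];
  for (const key of Object.keys(grouped)) {
    const length = parseInt(key, 10);
    if (!(length > 0)) {
      continue; // ignore keys that are not a positive length
    }
    const chunk = grouped[key];
    for (let i = 0; i < chunk.length; i += length) {
      patterns.push(chunk.slice(i, i + length));
    }
  }
  return patterns;
}

const expanded = expandPatterns({ '3': 'a1b1bc', '4': 'hy3p' });
console.log(expanded);
```

The separators become unnecessary because the key already says where each pattern ends.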
Original comment by mathiasn...@gmail.com
on 21 Feb 2009 at 5:12
Forgot to mention: the new pattern format has been working great since r377 in the trunk.
Original comment by mathiasn...@gmail.com
on 9 Mar 2009 at 1:43
Paul Arzul wrote (per e-mail):
--begin cite--
[…]
i find the savings of http compression to be sufficient optimization
for most use cases, while not (albeit unintentionally) obfuscating the
source. debugging compressed javascript is also difficult and
javascript decompression and evaluation plays a significant part in
delaying render:
http://batiste.dosimple.ch/blog/2007-07-02-1/
maybe encourage installation to merge javascript files if the
languages are well predefined as that should help:
$ cat Hyphenator_debug.js patterns/en.js > merge.js
--- end cite ---
Some investigation into best practices for minifying/deflating and merging could be interesting, though.
How about a script/service that builds a Hyphenator.js with the patterns included?
Original comment by mathiasn...@gmail.com
on 8 Apr 2009 at 1:56
Merging code is now on issue57
Original comment by mathiasn...@gmail.com
on 4 May 2009 at 8:00
Added some //todo comments in r534: improvement possible
Original comment by mathiasn...@gmail.com
on 4 May 2009 at 8:01
Unified the different RegExes into one: it's simpler now
Original comment by mathiasn...@gmail.com
on 6 May 2009 at 6:30
For now, Safari 4 and Firefox 3.5 support Web Workers.
http://ejohn.org/blog/web-workers/
http://www.whatwg.org/specs/web-workers/current-work/
I'll have to give it a try and check if there's a gain in performance...
Original comment by mathiasn...@gmail.com
on 27 Jul 2009 at 7:30
Web workers sound like a great idea to explore for this application.
Original comment by aphahn
on 27 Jul 2009 at 9:23
Actually I just began my first trials with WW (have a look at http://hyphenator.googlecode.com/svn/trunk/testsuite/test57.html).
It works (some features are still missing) but it's still slower.
Notes:
1. The script currently creates a WW for each element that has been marked for hyphenation; then each text node of that element is sent to it. -> Elements are hyphenated asynchronously at the same time.
First I tried to create one WW and send words to it. But it's async, so I don't know where the hyphenated words belong...
It would be nice to have one single WW -> load the script only once (SharedWorkers are not yet supported by FF or Wk).
2. It would be nice to have a routine that waits for all text nodes to be returned and then replaces them at once.
3. I have to find a way to get the settings into the Worker (hyphenchar etc.)
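One possible way around the "where do the results belong" problem from note 1 is to tag every message with an id and keep the matching callback until the answer arrives. This is only a sketch: createDispatcher and the message shape { id, text } are assumptions, and the worker script would have to echo the id back with its result.

```javascript
// Hypothetical helper: wraps any worker-like object (postMessage + onmessage)
// so that each answer is routed to the callback of the request it belongs to.
function createDispatcher(worker) {
  const pending = new Map();
  let nextId = 0;

  worker.onmessage = function (event) {
    const callback = pending.get(event.data.id);
    if (callback) {
      pending.delete(event.data.id);
      callback(event.data.text);
    }
  };

  return function send(text, callback) {
    const id = nextId;
    nextId += 1;
    pending.set(id, callback);
    worker.postMessage({ id: id, text: text }); // the worker must return the same id
  };
}
```

With this, a single worker can serve all elements, and answers may arrive in any order.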
Original comment by mathiasn...@gmail.com
on 4 Aug 2009 at 10:35
Performance highly improved by using DOM-Storage to cache pattern-files.
(since version 3.0.0)
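Caching pattern files in DOM Storage could be sketched like this. The function loadPatterns, the key prefix and the fetchPatterns callback are hypothetical; in a browser, window.sessionStorage or window.localStorage would be passed as the storage argument.

```javascript
// storage is any object with getItem/setItem (e.g. window.sessionStorage).
// fetchPatterns is a stand-in for whatever loads and converts a pattern file.
function loadPatterns(lang, storage, fetchPatterns) {
  const key = 'patterns_' + lang; // hypothetical key naming
  const cachedJson = storage.getItem(key);
  if (cachedJson !== null) {
    return JSON.parse(cachedJson); // cache hit: no download, no conversion
  }
  const patterns = fetchPatterns(lang);
  try {
    storage.setItem(key, JSON.stringify(patterns));
  } catch (e) {
    // Storage may be full or disabled; carry on without caching.
  }
  return patterns;
}
```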
Original comment by mathiasn...@gmail.com
on 9 Jul 2010 at 9:40
See issue104 about ongoing work on Web Workers
Original comment by mathiasn...@gmail.com
on 2 Aug 2010 at 9:25
Performance improved by using an object to store information about elements instead of Expando (test15: before ~300ms, after ~200ms)
Original comment by mathiasn...@gmail.com
on 8 Jun 2011 at 10:58
Performance improved by using Bram Stein's Trie-based algorithm for hyphenation:
https://github.com/bramstein/Hypher
Original comment by mathiasn...@gmail.com
on 13 Jun 2011 at 9:43
Perf improved by making pattern-availability check async
Original comment by mathiasn...@gmail.com
on 30 Aug 2011 at 6:24
see http://code.google.com/p/hyphenator/issues/detail?id=170
Original comment by mathiasn...@gmail.com
on 15 Nov 2012 at 4:12
Fixed some deoptimizations of the functions hyphenateWord() and prepareLanguage() in Chrome (r1235, r1236, r1237).
Original comment by mathiasn...@gmail.com
on 7 Oct 2014 at 8:21
r1319
Using setTimeout instead of setInterval leads to faster execution:
- pause time can be lowered to 10ms (instead of 100ms)
- convert-function is not interrupted by interval-calls
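The pattern can be sketched as a chained setTimeout, where the next check is only scheduled after the current one has finished. This is an illustration, not the code from r1319; pollUntil and its parameters are hypothetical, and the optional schedule argument exists only so the sketch can be exercised without real timers.

```javascript
// Calls isReady repeatedly, waiting delayMs between checks, and runs onReady
// once it returns true. Unlike setInterval, no further check is queued while
// one is still pending.
function pollUntil(isReady, onReady, delayMs, schedule) {
  const later = schedule || setTimeout;
  function tick() {
    if (isReady()) {
      onReady();
    } else {
      later(tick, delayMs); // schedule exactly one follow-up check
    }
  }
  tick();
}

// Usage: wait with a 10ms pause until some condition holds.
let attempts = 0;
pollUntil(function () {
  attempts += 1;
  return attempts >= 3;
}, function () {
  console.log('ready after ' + attempts + ' checks');
}, 10);
```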
Original comment by mathiasn...@gmail.com
on 21 Jan 2015 at 3:00
Original issue reported on code.google.com by
mathiasn...@gmail.com
on 3 Jan 2008 at 2:50