Closed mayli closed 13 years ago
I will look into this. One thing I see, though, is that you have a lot of timeouts. What kind of machine are you running this on, and how many cores does it have? Can you try running with --timeout=60? You may also want to try --nomp, which disables multiprocessing and may give better error messages.
Works for me. I still had a few timeouts with --timeout=60, but none with --timeout=120:
$ aardc wiki zhwiki-20110521-1.cdb --siteinfo zh.json --timeout 120 --lang-links de,el,en,es,fr,it,la,nl,pl,pt,ru
...
100.00% t: 13:37:29 avg: 13.6/s a: 349606 r: 315493 s: 0 e: 0 to: 0 f: 1
Looks like one article failed due to too many recursive template invocations. Compiled dictionary is here: http://aard.tk/zhwiki/
I'm using Ubuntu 10.04 and followed the instructions at http://aarddict.org/aardtools/doc/aardtools.html. Dumping the wiki to cdb works fine, but after 6 hours of compiling it just stops at 24.81%.
Here is the compiler output.
(env-aard)mayli@matrix:~$ aardc wiki zhwiki-20110521-pages-articles.cdb --siteinfo zh.json
Session dir ./aardc-1306893341-52
Writing log to ./aardc-1306893341-52/log
Converting zhwiki-20110521-pages-articles.cdb
total: 665100
24.81% t: 6:11:46 avg: 7.4/s a: 73101 r: 90087 s: 0 e: 0 to: 1798 f: 0
Compiling .aar files
Creating volume 1
Wrote volume 1
zhwiki-20110521-pages-articles.aar.1 sha1: 338e89ebfdf77043ec156e258ecd397dbda13cd1
Created zhwiki-20110521-pages-articles.aar
Compilation took 6:12:01
Here is the log file (tail):
.....
16:06:48 WARNING [wiki] Worker pool timed out
16:06:48 INFO [wiki] Terminating current worker pool
16:06:48 INFO [wiki] Creating new worker pool with wiki cdb at zhwiki-20110521-pages-articles.cdb
16:07:00 WARNING [aardtools.mwaardhtmlwriter] Could not render math in u'\u51af\u8bfa\u4f9d\u66fc\u7a33\u5b9a\u6027\u5206\u6790' with 'latex': Couldn't convert equation '\n\begin{align}\n \epsilon_j^n & = e^{at} e^{ik_m x} \\n \epsilon_j^{n+1} & = e^{a(t+\Delta t)} e^{ikm x} \\n \epsilon{j+1}^n & = e^{at} e^{ikm (x+\Delta x)} \\n \epsilon{j-1}^n & = e^{at} e^{ik_m (x-\Delta x)},\n\end{align}\n' (failed cmd: 'latex -halt-on-error -output-directory /tmp/math-2BaLkB /tmp/math-2BaLkB/eq.tex', error: '')
16:07:00 WARNING [aardtools.mwaardhtmlwriter] Could not render math in u'\u51af\u8bfa\u4f9d\u66fc\u7a33\u5b9a\u6027\u5206\u6790' with 'blahtex': Couldn't convert equation '\n\begin{align}\n \epsilon_j^n & = e^{at} e^{ik_m x} \\n \epsilon_j^{n+1} & = e^{a(t+\Delta t)} e^{ikm x} \\n \epsilon{j+1}^n & = e^{at} e^{ikm (x+\Delta x)} \\n \epsilon{j-1}^n & = e^{at} e^{ik_m (x-\Delta x)},\n\end{align}\n' (failed cmd: 'blahtexml --texvc-compatible-commands --png --temp-directory /tmp/math-d2hSAh --png-directory /tmp/math-d2hSAh', error: 'Unrecognised command "\begin{align}"')
16:07:28 WARNING [wiki] Worker pool timed out
16:07:28 INFO [wiki] Terminating current worker pool
16:07:28 INFO [wiki] Creating new worker pool with wiki cdb at zhwiki-20110521-pages-articles.cdb
16:07:28 INFO [wiki] Creating new worker pool with wiki cdb at zhwiki-20110521-pages-articles.cdb
16:07:28 INFO [wiki] Creating new worker pool with wiki cdb at zhwiki-20110521-pages-articles.cdb
16:07:28 INFO [compiler] Compiling zhwiki-20110521-pages-articles.aar
16:07:29 INFO [compiler] Creating temporary index 1 file /home/mayli/aardc-1306893341-52/index1dExDHK
16:07:29 INFO [compiler] Creating temporary index 2 file /home/mayli/aardc-1306893341-52/index2fQ_ZGn
16:07:29 INFO [compiler] Creating temporary articles file /home/mayli/aardc-1306893341-52/articlesTKLswW
16:07:39 INFO [compiler] Creating volume 1
16:07:42 INFO [compiler] Done with zhwiki-20110521-pages-articles.aar.1
16:07:42 INFO [compiler] Wrote volume 1
16:07:42 INFO [compiler] Writing volume count 1 to all volumes as >H
16:07:42 INFO [compiler] Calculating checksum for zhwiki-20110521-pages-articles.aar.1
16:07:43 INFO [compiler] zhwiki-20110521-pages-articles.aar.1 sha1: 338e89ebfdf77043ec156e258ecd397dbda13cd1
16:07:43 INFO [compiler] Renaming zhwiki-20110521-pages-articles.aar.1 ==> zhwiki-20110521-pages-articles.aar
16:07:43 INFO [compiler] total: 665100, skipped: 0, failed: 0, empty: 0, timed out: 1798, articles: 73101, redirects: 90087, average: 7.39/s elapsed: 6:12:01
16:07:43 INFO [compiler] Compression: _zlib - 63331, none - 90009, _bz2 - 9850
16:07:43 INFO [compiler] Compilation took 6:12:01.425855
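For anyone comparing the two runs, here is a rough sketch for cross-checking the progress line against the counters. It is not part of aardtools: `parse_status`, `percent_done` and the `FIELDS` mapping are made-up names, the expansion of the short counters (a, r, s, e, to, f) is inferred by lining them up with the summary line in the log, and treating the percentage as the sum of all counters divided by the total is an assumption that happens to reproduce both figures posted in this thread.

```python
import re

# Assumed meaning of the short counter names, inferred by matching them
# against the summary line at the end of the log (not taken from aardtools).
FIELDS = {
    "a": "articles",
    "r": "redirects",
    "s": "skipped",
    "e": "empty",
    "to": "timed out",
    "f": "failed",
}


def parse_status(line):
    """Return the counters from an aardc progress line as a dict of ints."""
    pairs = re.findall(r"\b(a|r|s|e|to|f): (\d+)", line)
    return {FIELDS[key]: int(value) for key, value in pairs}


def percent_done(counters, total):
    """Share of entries accounted for, assuming every counter counts as processed."""
    return 100.0 * sum(counters.values()) / total


TOTAL = 665100  # "total" as reported in the compiler output above

failed_run = "24.81% t: 6:11:46 avg: 7.4/s a: 73101 r: 90087 s: 0 e: 0 to: 1798 f: 0"
full_run = "100.00% t: 13:37:29 avg: 13.6/s a: 349606 r: 315493 s: 0 e: 0 to: 0 f: 1"

for line in (failed_run, full_run):
    counters = parse_status(line)
    print(counters, "%.2f%%" % percent_done(counters, TOTAL))
```

With the numbers from this thread, the first line works out to 24.81% and the second to 100.00%, so the counters and the reported percentages are at least consistent with each other.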