Moonbase59 closed this issue 2 years ago
It seems the previous step failed. The downloaded XML is incomplete/truncated?
Hm. Should I re-download? Shows 4.6 GB here.
Perfect, you’re correct. I wonder why it was truncated; it now shows 5 GB! Thanks!
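For anyone hitting the same thing, a quick way to spot a truncated dump before parsing is to check whether the file still ends with the closing root tag. This is only a minimal sketch: `looks_complete` is a hypothetical helper, not part of wikidict, and it assumes the uncompressed dump is a MediaWiki XML export whose root element closes with `</mediawiki>`.

```python
from pathlib import Path


def looks_complete(path, closing_tag=b"</mediawiki>", tail_bytes=1024):
    """Return True if the last bytes of the file contain the closing root tag.

    Hypothetical helper: it only inspects the tail of the file, so it detects
    truncation but not corruption elsewhere in the dump.
    """
    path = Path(path)
    size = path.stat().st_size
    with path.open("rb") as fh:
        # Jump close to the end instead of reading several GB.
        fh.seek(max(0, size - tail_bytes))
        tail = fh.read()
    return closing_tag in tail
```

Calling `looks_complete("data/fr/pages-20220120.xml")` before `--parse` would then flag an incomplete download cheaply, without reading the whole file.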
Hm. Now it still won't finish; trying again tomorrow…
matthias@e6510:~/Projekte/ebook-reader-dict$ python3 -m wikidict fr --parse
>>> Processing data/fr/pages-20220120.xml ...
>>> Saved 1,882,269 words into data/fr/data_wikicode-20220120.json
>>> Parse done!
matthias@e6510:~/Projekte/ebook-reader-dict$ python3 -m wikidict fr --render
>>> Loading data/fr/data_wikicode-20220120.json ...
>>> Loaded 1,882,269 words from data/fr/data_wikicode-20220120.json
!! Missing 'ar-cf' template support for word 'Djamel'
!! Missing 'ar-cf' template support for word 'azulejo'
!! Missing 'ar-cf' template support for word 'Ali'
!! Missing 'ar-cf' template support for word 'alcade'
!! Missing 'ar-cf' template support for word 'Mourad'
!! Missing 'ar-cf' template support for word 'cadi'
!! Missing 'ar-cf' template support for word 'Zahra'
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/home/matthias/Projekte/ebook-reader-dict/wikidict/render.py", line 397, in render_word
words[word] = details
File "<string>", line 2, in __setitem__
File "/usr/lib/python3.8/multiprocessing/managers.py", line 835, in _callmethod
kind, result = conn.recv()
File "/usr/lib/python3.8/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/usr/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
buf = self._recv(4)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/matthias/Projekte/ebook-reader-dict/wikidict/__main__.py", line 122, in <module>
sys.exit(main())
File "/home/matthias/Projekte/ebook-reader-dict/wikidict/__main__.py", line 65, in main
return render.main(args["LOCALE"])
File "/home/matthias/Projekte/ebook-reader-dict/wikidict/render.py", line 444, in main
words = render(in_words, locale)
File "/home/matthias/Projekte/ebook-reader-dict/wikidict/render.py", line 414, in render
pool.map(partial(render_word, words=results, locale=locale), in_words.items())
File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
raise self._value
EOFError
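Reading the traceback: the worker fails inside `words[word] = details`, which goes through a `multiprocessing.managers` proxy (`_callmethod`, then `conn.recv()`). An `EOFError` there means the other end of the connection was closed, i.e. the manager process holding the shared dict was no longer answering. The log does not say why it went away. One way to take the manager out of the picture would be to let each worker return its result and build the dict in the parent. The following is only a sketch under that assumption: `render_word` here is a hypothetical stand-in, not the project's function, and the `fork` start method is assumed because the log shows `ForkPoolWorker` processes on Linux.

```python
import multiprocessing as mp
from functools import partial


def render_word(item, locale):
    """Hypothetical stand-in for the real renderer: returns (word, details)."""
    word, wikicode = item
    return word, f"[{locale}] {wikicode.strip()}"


def render(in_words, locale, workers=2):
    """Collect results in the parent instead of writing to a Manager dict."""
    ctx = mp.get_context("fork")  # assumption: Linux, as in the log above
    with ctx.Pool(processes=workers) as pool:
        pairs = pool.imap_unordered(
            partial(render_word, locale=locale),
            in_words.items(),
            chunksize=1000,
        )
        # dict() drains the iterator while the pool is still alive.
        return dict(pairs)
```

This does not reduce the memory needed to hold the input and output dictionaries, so it would not by itself help if the machine is simply running out of RAM.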
First time I've seen that error :thinking:
If you run the command again, does it still fail?
Yup, it did yesterday. Now trying with fresh pulls of the project and pyglossary, and workers=4.
No success. Gets me some broken pipes:
matthias@e6510:~/Projekte/ebook-reader-dict$ python3 -m wikidict fr --render --workers=4
>>> Loading data/fr/data_wikicode-20220120.json ...
>>> Loaded 1,882,269 words from data/fr/data_wikicode-20220120.json
!! Missing 'ar-cf' template support for word 'azulejo'
!! Missing 'ar-cf' template support for word 'Ali'
!! Missing 'ar-cf' template support for word 'alcade'
!! Missing 'ar-cf' template support for word 'Mourad'
Process ForkPoolWorker-4:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/pool.py", line 131, in worker
put((job, i, result))
File "/usr/lib/python3.8/multiprocessing/queues.py", line 368, in put
self._writer.send_bytes(obj)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/usr/lib/python3.8/multiprocessing/connection.py", line 405, in _send_bytes
self._send(buf)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.8/multiprocessing/pool.py", line 136, in worker
put((job, i, (False, wrapped)))
File "/usr/lib/python3.8/multiprocessing/queues.py", line 368, in put
self._writer.send_bytes(obj)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/usr/lib/python3.8/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Getötet
matthias@e6510:~/Projekte/ebook-reader-dict$ Process ForkPoolWorker-3:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/pool.py", line 131, in worker
put((job, i, result))
File "/usr/lib/python3.8/multiprocessing/queues.py", line 368, in put
self._writer.send_bytes(obj)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/usr/lib/python3.8/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.8/multiprocessing/pool.py", line 136, in worker
put((job, i, (False, wrapped)))
File "/usr/lib/python3.8/multiprocessing/queues.py", line 368, in put
self._writer.send_bytes(obj)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/usr/lib/python3.8/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process ForkPoolWorker-5:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/pool.py", line 131, in worker
put((job, i, result))
File "/usr/lib/python3.8/multiprocessing/queues.py", line 368, in put
self._writer.send_bytes(obj)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/usr/lib/python3.8/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.8/multiprocessing/pool.py", line 136, in worker
put((job, i, (False, wrapped)))
File "/usr/lib/python3.8/multiprocessing/queues.py", line 368, in put
self._writer.send_bytes(obj)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/usr/lib/python3.8/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process ForkPoolWorker-2:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/pool.py", line 131, in worker
put((job, i, result))
File "/usr/lib/python3.8/multiprocessing/queues.py", line 368, in put
self._writer.send_bytes(obj)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/usr/lib/python3.8/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.8/multiprocessing/pool.py", line 136, in worker
put((job, i, (False, wrapped)))
File "/usr/lib/python3.8/multiprocessing/queues.py", line 368, in put
self._writer.send_bytes(obj)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/usr/lib/python3.8/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
matthias@e6510:~/Projekte/ebook-reader-dict$
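A note on the log above: "Getötet" is the shell's German-locale message for "Killed", meaning the main process received SIGKILL. The workers' `BrokenPipeError`s then follow naturally, because they try to send results to a parent that no longer exists. On Linux a SIGKILL that nobody sent by hand often comes from the kernel's out-of-memory killer, but the log itself does not confirm that. When running the step from a wrapper script, the signal can be read off the return code. A minimal sketch, where `describe_exit` and `run_and_describe` are hypothetical helper names and a POSIX system is assumed:

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a short human-readable note."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        sig = signal.Signals(-returncode)
        return f"terminated by {sig.name}"
    return f"exited with status {returncode}"


def run_and_describe(cmd):
    """Run a command and describe how it ended."""
    proc = subprocess.run(cmd)
    return describe_exit(proc.returncode)
```

For example, `run_and_describe([sys.executable, "-m", "wikidict", "fr", "--render", "--workers=4"])` would report the signal name instead of relying on the shell's localized message.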
Phew, success. I needed to close the browser and all other apps (GoldenDict, MathPix, Shutter), and it succeeded just barely with workers=2.
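Since freeing RAM and lowering `--workers` is what made the difference here, the worker count could be estimated from available memory instead of by trial and error. This is a rough sketch, not something wikidict does: the helper names are hypothetical, it relies on the Linux-only `MemAvailable` field of `/proc/meminfo`, and the per-worker memory figure is a guess you would have to measure yourself.

```python
def available_memory_bytes(meminfo_text):
    """Parse the MemAvailable line (reported in kB) from /proc/meminfo content."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024
    raise ValueError("MemAvailable not found")


def pick_workers(available_bytes, per_worker_bytes, max_workers):
    """Largest worker count that fits in memory, but never fewer than 1."""
    fit = available_bytes // per_worker_bytes
    return max(1, min(max_workers, fit))


# Assumption for illustration only: each worker needs about 1.5 GiB.
PER_WORKER_GUESS = 3 * 1024**3 // 2
```

On Linux, reading `/proc/meminfo`, passing its text to `available_memory_bytes`, and feeding the result to `pick_workers` together with `PER_WORKER_GUESS` gives a value to hand to `--workers`.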
matthias@e6510:~/Projekte/ebook-reader-dict$ python3 -m wikidict fr --render --workers=2
>>> Loading data/fr/data_wikicode-20220120.json ...
>>> Loaded 1,882,269 words from data/fr/data_wikicode-20220120.json
!! Missing 'ar-cf' template support for word 'Ali'
!! Missing 'ar-cf' template support for word 'alcade'
>>> Saved 1,794,376 words into data/fr/data-20220120.json
>>> Render done!
matthias@e6510:~/Projekte/ebook-reader-dict$ python3 -m wikidict fr --convert
>>> Loading data/fr/data-20220120.json ...
>>> Loaded 1,794,376 words from data/fr/data-20220120.json
>>> Generated dict-fr-fr.df (122,195,089 bytes)
>>> Generated dicthtml-fr-fr.zip (35,634,409 bytes)
>>> Generated dict-fr-fr.df.bz2 (21,047,994 bytes)
>>> Generated dict-fr-fr.zip (31,322,906 bytes)
matthias@e6510:~/Projekte/ebook-reader-dict$
I just downloaded the FR dump (as of 2022-01-20) and am trying to parse it:
Output: