Open desb42 opened 5 years ago
Yeah, this looks like a threading mechanism. The strange thing is that I don't think I've ever seen this error in my builds, and I've built enwiki at least a few dozen times.
Just like #526, let me try to reproduce this on my side in the next few weeks. Will report here. Thanks!
for my mass_parse.init I use this
add ('en.wikipedia.org' , 'wiki.mass_parse.init') {cfg {ns_ids = '0|4|8|12|14|100';}}
Perhaps you choose a different set of namespaces
I thought I would take a closer look at this error. As it only occurs on an html build, it took a while to reproduce.
For experimentation I used enwikibooks because it's quite small. The only namespace I used was Category: (14)
add ('en.wikibooks.org' , 'wiki.mass_parse.init') {cfg {ns_ids = '14';}}
The first change I made was to 400_xowa\src\gplx\xowa\addons\wikis\ctgs\htmls\catpages\Xoctg_catpage_mgr.java: in Get_by_cache_or_null, I changed the synchronized to a static ReentrantReadWriteLock. I believe that even with many instances of Xoctg_catpage_mgr, this lock ensures that access to the database calls is strictly one at a time.
This did not in fact make any difference
I then displayed all the database calls, by all the threads, and this led to the finding that the error occurs if a page in one thread is accessing a dynamicpagelist while a page in another thread is using #PAGESINCATEGORY (and only when the wind is right).
In 400_xowa\src\gplx\xowa\wikis\data\tbls\Xowd_cat_core_tbl.java, in Select, I replaced the synchronized with a call to the same static ReentrantReadWriteLock as in Xoctg_catpage_mgr.java (probably not the right place). This seems to have cured the problem for enwikibooks - I have yet to try it on anything larger.
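To make the change above concrete, here is a minimal sketch of two unrelated classes funnelling their database work through one static ReentrantReadWriteLock. It is not the xowa or myxowa code: CatDbLock, CatpageMgrSketch and CatCoreTblSketch are hypothetical names, and the lambdas stand in for the real database calls. The point it illustrates is that a synchronized method only guards the monitor of its own object, whereas a single static lock is shared by every instance of every class that uses it.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical holder for one process-wide lock; not part of xowa.
final class CatDbLock {
    private static final ReentrantReadWriteLock LOCK = new ReentrantReadWriteLock();

    private CatDbLock() {}

    // Runs the database work while holding the write lock, so only one
    // thread at a time can be inside any guarded section.
    static <T> T exclusive(Supplier<T> dbWork) {
        LOCK.writeLock().lock();
        try {
            return dbWork.get();
        } finally {
            LOCK.writeLock().unlock();
        }
    }
}

// Stand-in for the catpage manager path (used by dynamicpagelist).
final class CatpageMgrSketch {
    String getByCacheOrNull(String title) {
        // In real code the attach, query and detach would happen here.
        return CatDbLock.exclusive(() -> "catpage:" + title);
    }
}

// Stand-in for the cat_core table path (used by #PAGESINCATEGORY).
final class CatCoreTblSketch {
    int select(int catId) {
        // In real code the row reading would happen here.
        return CatDbLock.exclusive(() -> catId * 2);
    }
}
```

Only the write lock is taken in this sketch, which makes it behave like a plain mutex. Whether the read lock could safely be used for some of the calls is a separate question that the sketch does not try to answer.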
The dynamicpagelist uses the category-core db with two other attached dbs and #PAGESINCATEGORY just uses the category-core db
SQLite locks the attached table while category-core tables are being accessed.
If the thread running #PAGESINCATEGORY is interrupted during the row reading, and the dynamicpagelist thread continues on to a DETACH, a lock error occurs.
Hence the need to synchronize the threads (but not by using synchronized)
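The interleaving described above can be replayed with a toy model. This is only a sketch of the reasoning and does not use the SQLite API: ToyConnection, InterleavingDemo and the database name page_db are hypothetical, and the rule that a detach is refused while a read is still open is the assumption being modelled, based on the behaviour observed in this thread.

```java
import java.util.HashSet;
import java.util.Set;

// Toy stand-in for a single shared connection; this is NOT the SQLite API.
final class ToyConnection {
    private final Set<String> attached = new HashSet<>();
    private int openReads = 0; // reads that have started but not finished

    void attach(String name) { attached.add(name); }

    void beginRead() { openReads++; }

    void endRead() { openReads--; }

    // Modelled assumption: detach is refused while any read is in progress.
    void detach(String name) {
        if (openReads > 0) {
            throw new IllegalStateException("database " + name + " is locked");
        }
        attached.remove(name);
    }

    boolean isAttached(String name) { return attached.contains(name); }
}

final class InterleavingDemo {
    // Replays the problematic order on one thread so it is deterministic.
    static String replay() {
        ToyConnection conn = new ToyConnection();
        conn.attach("page_db");      // dynamicpagelist: attach
        conn.beginRead();            // #PAGESINCATEGORY: starts reading rows
        try {
            conn.detach("page_db");  // dynamicpagelist: detach in mid-read
            return "detached";
        } catch (IllegalStateException e) {
            return e.getMessage();
        } finally {
            conn.endRead();          // the read finishes too late
        }
    }
}
```

If the attach-to-detach sequence and the row reading each run under one shared lock, the detach can never land in the middle of a read, which is the effect the lock change is meant to have.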
(The actual code I use is in myxowa project)
I have just had another thought about this.
SQLite's ATTACH DATABASE is on a per-connection basis (according to my reading of the SQLite documentation).
If, as I believe, xowa only uses one database connection, this could lead to problems in multi-threaded mode, as attaching and detaching are performed a lot and could interfere with each other, causing this issue.
I have just been rebuilding enwiki html (2019-06-01) and the build phase has finished - still waiting for the whole process to terminate (see #526)
In the logfile there are many SQLITE_ERROR entries. There seem to be two styles of error: one is 'db already in use' and the other 'db locked'.
These errors cannot be reproduced if I use xowa-gui to look at them individually
I suspect a multi thread conflict
I attach the full log session.zip
This may help with #526
One more thing, I am still using the attach category database mechanism