brebs-gh closed this issue 3 years ago
This issue has been mentioned on SWI-Prolog. There might be relevant details there:
https://swi-prolog.discourse.group/t/how-to-build-a-docker-image-with-swipl-and-jpl/4255/9
Running with:
USE_PUBLIC_NETWORK_TESTS=false ctest -j 16 --output-on-failure || true
shows:
` File "/home/brebs/apk/swipl/src/swipl-8.3.28/packages/language_server/python/test_prologserver.py", line 650, in test_server_options_and_shutdown
    self.assertEqual(afterShutdownThreads, initialThreads)
AssertionError: Lists differ: ['mai[107 chars]er1_conn1_comm:running', 'language_server3_conn3_goal:running'] != ['mai[107 chars]er1_conn1_comm:running']
First list contains 1 additional elements.
First extra element 5:
'language_server3_conn3_goal:running'
  ['main:running',
  'gc:running',
  'language_server1:running',
  'language_server1_conn1_goal:running',
  'language_server1_conn1_comm:running',
?                                      ^
  'language_server1_conn1_comm:running']
?                                      ^
  'language_server3_conn3_goal:running']`
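The truncated diff above makes it hard to see at a glance which thread is left over. A minimal sketch of how the two thread lists could be compared directly; `leaked_threads` is a hypothetical helper, not part of test_prologserver.py, and the list contents are taken from the failure output above:

```python
from collections import Counter


def leaked_threads(before, after):
    """Return entries that occur in `after` more often than in `before`."""
    extra = Counter(after) - Counter(before)
    return sorted(extra.elements())


# Thread lists as reported in the assertion failure above.
initial = [
    "main:running",
    "gc:running",
    "language_server1:running",
    "language_server1_conn1_goal:running",
    "language_server1_conn1_comm:running",
]
after_shutdown = initial + ["language_server3_conn3_goal:running"]

print(leaked_threads(initial, after_shutdown))
# prints ['language_server3_conn3_goal:running']
```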
Compiling without -DCMAKE_BUILD_TYPE=PGO stops the test phase from hanging. The usual 3 test failures remain:
54 - semweb:con (SEGFAULT)
59 - semweb:rdf_db (SEGFAULT)
60 - semweb:subprop (SEGFAULT)
The hang probably relates to https://swi-prolog.discourse.group/t/difficult-to-reproduce-problems-while-running-tests/4266/20?u=jan. We expect a proper patch for that soon. The others are unclear.
With swi-prolog 8.3.29, PGO compilation combined with testing runs without hanging. There are 4 test failures: the 3 semweb failures as above, plus mqi:
46 - mqi:mqi (Failed)
54 - semweb:con (SEGFAULT)
59 - semweb:rdf_db (SEGFAULT)
60 - semweb:subprop (SEGFAULT)
The mqi test error is:
AssertionError: Lists differ: ['mai[59 chars]running', 'mqi1_conn1_comm:running', 'mqi3_conn3_goal:running'] != ['mai[59 chars]running', 'mqi1_conn1_comm:running']
I'm trying to have a look using Docker. I think this is running Alpine 3.14 (how can I verify that?). Your deps refer to openjdk15, while openjdk11 seems to be the latest here. The package ossp-uuid-dev is missing as well (and is needed by mqi). Am I missing something?
I'm investigating the language_server failure above. It happens on the last line of the code below and means that the goal thread for the connection created using Unix domain sockets did not go away.
The test calls stop_language_server/1; the predicate eventually calls thread_signal(Thread_ID, abort) on the goal thread. The test then uses thread_list to get a list of active threads via Prolog's thread_property and compares it to the list from before the test, to see if any threads that should have been aborted are still hanging around.
I believe there could be a race in this test where either the goal thread hasn't reacted to the abort yet, or thread_property hasn't been updated yet. I'm not entirely sure how the Prolog system works, so I can't say for sure.
I added a couple of comments below that could test this suspicion. If you put them into test_prologserver.py and uncomment the sleep(5) line, you can see whether the delay makes the test pass, which would support the race condition theory.
# unixDomainSocket() should be used if supplied (non-windows).
socketPath = mkdtemp()
unixDomainSocket = PrologServer.unix_domain_socket_file(socketPath)
result = monitorThread.query("language_server([unix_domain_socket('{}'), password(testpassword), server_thread(ServerThreadID)])".format(unixDomainSocket))
serverThreadID = result[0]["ServerThreadID"]
with PrologServer(launch_server=False, unix_domain_socket=unixDomainSocket, password="testpassword", prolog_path=self.prologPath) as newServer:
    with newServer.create_thread() as prologThread:
        result = prologThread.query("true")
        self.assertEqual(result, True)
result = monitorThread.query("stop_language_server({})".format(serverThreadID))
self.assertEqual(result, True)
# Uncomment to see if it fixes it
# sleep(5)
afterShutdownThreads = self.thread_list(monitorThread)
self.assertEqual(afterShutdownThreads, initialThreads)
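If the fixed sleep does make the test pass, a bounded poll would be a less timing-sensitive way to check the same thing. The following is only a sketch under that assumption: `wait_until` is a hypothetical helper, not something in test_prologserver.py, and the demo condition is a stand-in for comparing `self.thread_list(monitorThread)` with `initialThreads`.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition held in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for the real check: the extra goal thread "disappears"
# on the third poll, as it might if the abort is handled late.
remaining = ["language_server3_conn3_goal:running"]


def threads_settled():
    threads_settled.polls += 1
    if threads_settled.polls >= 3:
        remaining.clear()
    return not remaining


threads_settled.polls = 0
print(wait_until(threads_settled, timeout=2.0, interval=0.01))
```

In the test itself, the condition could be something like `lambda: self.thread_list(monitorThread) == initialThreads`. A pass with polling and a failure without it would point to a race, but would not explain why the goal thread is slow to go away.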
Running it verbosely in the Alpine Docker image indicates that uuid(UUID, [format(integer)]) fails because ossp-uuid is missing. I extended the pure Prolog emulation of the UUID library, and now the mqi test passes.
To clarify - I run Alpine "Edge" (basically the bleeding-edge), which becomes the next version of Alpine. Kinda like e.g. Debian Unstable.
To show the OS version, run:
cat /etc/os-release
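When checking this from a script rather than by eye, the file can be parsed as simple KEY=value lines. A minimal sketch; `parse_os_release` is a hypothetical helper, and the sample text is illustrative rather than copied from a real image:

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


# Illustrative sample; on a real system, read /etc/os-release instead.
sample = 'NAME="Alpine Linux"\nID=alpine\nVERSION_ID=3.14.0\n'
info = parse_os_release(sample)
print(info["ID"], info["VERSION_ID"])
```

On a real system the input would come from `open("/etc/os-release").read()`.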
ossp-uuid is only in Edge, in the testing repo (which is not enabled by default), so if you have removed its dependency then that's ideal :-)
For the openjdk 15, please change it to 11, or basically the latest available. I don't think Alpine's package manager has the concept of the latest version. In Edge I've got openjdk 9 to 16 available, visible with e.g.:
apk search openjdk | grep src | sort
Pushed SWI-Prolog/swipl-devel@836862aebcef781f82f41f078851405cade7bd5c, which seems to work around the crashes. Before that, I pushed an update to library(uuid) that provides the UUID services mqi requires in pure Prolog, so it works if ossp-uuid is not around.
Now trying the whole build on our CI environment. It seems to build well, but there is a small issue with the reporting, so the result pages do not update properly. Will look at that tomorrow.
Enjoy 100% success at https://dev.swi-prolog.org/ci/home. You can find the dependencies and config at https://github.com/SWI-Prolog/docker-swipl-linux-ci/tree/master/alpine/3.14
This issue has been mentioned on SWI-Prolog. There might be relevant details there:
https://swi-prolog.discourse.group/t/how-to-build-a-docker-image-with-swipl-and-jpl/4255/10
Below is an APKBUILD file, to package swi-prolog for Alpine Linux (which uses Musl instead of glibc). It has ".txt" appended to its filename, to upload it as a file here.
APKBUILD.txt
With swi-prolog 8.3.27 (and many previous versions), these tests fail:
53 - semweb:con (SEGFAULT)
58 - semweb:rdf_db (SEGFAULT)
59 - semweb:subprop (SEGFAULT)
With swi-prolog 8.3.28, the "26 - swipl:thread (SEGFAULT)" test also fails, and the test phase never finishes.