SWI-Prolog / issues

Dummy repository for issue tracking

Compilation issues on Alpine Linux #102

Status: Closed (brebs-gh closed this issue 3 years ago)

brebs-gh commented 3 years ago

Below is an APKBUILD file to package swi-prolog for Alpine Linux (which uses musl instead of glibc). I appended ".txt" to its filename so that it can be uploaded here.

APKBUILD.txt

With swi-prolog 8.3.27 (and many previous versions), these tests fail:

 53 - semweb:con (SEGFAULT)
 58 - semweb:rdf_db (SEGFAULT)
 59 - semweb:subprop (SEGFAULT)

With swi-prolog 8.3.28, the "26 - swipl:thread (SEGFAULT)" test also fails, and the test phase never finishes.

JanWielemaker commented 3 years ago

This issue has been mentioned on SWI-Prolog. There might be relevant details there:

https://swi-prolog.discourse.group/t/how-to-build-a-docker-image-with-swipl-and-jpl/4255/9

brebs-gh commented 3 years ago

Running with: USE_PUBLIC_NETWORK_TESTS=false ctest -j 16 --output-on-failure || true

shows:

      File "/home/brebs/apk/swipl/src/swipl-8.3.28/packages/language_server/python/test_prologserver.py", line 650, in test_server_options_and_shutdown
        self.assertEqual(afterShutdownThreads, initialThreads)
    AssertionError: Lists differ: ['mai[107 chars]er1_conn1_comm:running', 'language_server3_conn3_goal:running'] != ['mai[107 chars]er1_conn1_comm:running']

    First list contains 1 additional elements.
    First extra element 5:
    'language_server3_conn3_goal:running'

    ['main:running', 'gc:running', 'language_server1:running', 'language_server1_conn1_goal:running',

brebs-gh commented 3 years ago

Compiling without -DCMAKE_BUILD_TYPE=PGO stops the test phase from hanging. Still has the usual 3 test failures:

 54 - semweb:con (SEGFAULT)
 59 - semweb:rdf_db (SEGFAULT)
 60 - semweb:subprop (SEGFAULT)

JanWielemaker commented 3 years ago

The hang probably relates to https://swi-prolog.discourse.group/t/difficult-to-reproduce-problems-while-running-tests/4266/20?u=jan. We expect a proper patch for that soon. The others are unclear.

brebs-gh commented 3 years ago

With swi-prolog 8.3.29, PGO compilation combined with testing runs without hanging, with these 4 test failures:

 46 - mqi:mqi (Failed)
 54 - semweb:con (SEGFAULT)
 59 - semweb:rdf_db (SEGFAULT)
 60 - semweb:subprop (SEGFAULT)

The mqi test error is: AssertionError: Lists differ: ['mai[59 chars]running', 'mqi1_conn1_comm:running', 'mqi3_conn3_goal:running'] != ['mai[59 chars]running', 'mqi1_conn1_comm:running']
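
As a reading aid for the assertion above, here is a minimal sketch of how to find which thread is left over after shutdown. The helper `extra_threads` is hypothetical (it is not part of the test suite), and the two lists are shortened, illustrative stand-ins for the truncated lists in the error message.

```python
def extra_threads(after, before):
    """Return entries of `after` that have no counterpart in `before`, in order."""
    remaining = list(before)
    extras = []
    for name in after:
        if name in remaining:
            # Matched an expected thread; consume it so duplicates are counted.
            remaining.remove(name)
        else:
            extras.append(name)
    return extras


# Illustrative lists only; the real ones are truncated in the report.
before = ["main:running", "mqi1_conn1_comm:running"]
after = ["main:running", "mqi1_conn1_comm:running", "mqi3_conn3_goal:running"]

print(extra_threads(after, before))
```

In the failure reported here, the leftover entry is the goal thread of the third connection.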

JanWielemaker commented 3 years ago

I'm trying to have a look using Docker. I think this is running Alpine 3.14 (how can I verify that?). Your deps refer to openjdk15, while openjdk11 seems to be the latest here. The package ossp-uuid-dev is missing as well (and is needed by mqi). Am I missing something?

EricZinda commented 3 years ago

I'm investigating the language_server failure above. It is happening on the last line of code below and means that the goal thread for the connection created using Unix Domain sockets did not go away.

  1. When the test calls stop_language_server/1 the predicate eventually calls thread_signal(Thread_ID, abort) on the goal thread.
  2. The test then calls the Python thread_list to get a list of active threads using Prolog thread_property and compares them to the list before the test to see if any are hanging around that should have been aborted.

I believe there could be a race in this test where the goal thread either hasn't reacted to the abort yet, or thread_property hasn't been updated yet. Not entirely sure how the Prolog system works so I can't say for sure.

I added a couple of comments below that could test this hypothesis. If you put them into test_prologserver.py and uncomment the sleep(5) line, you can see whether that fixes it, which would confirm the suspected race condition.

                    # unixDomainSocket() should be used if supplied (non-windows).
                    socketPath = mkdtemp()
                    unixDomainSocket = PrologServer.unix_domain_socket_file(socketPath)
                    result = monitorThread.query("language_server([unix_domain_socket('{}'), password(testpassword), server_thread(ServerThreadID)])".format(unixDomainSocket))
                    serverThreadID = result[0]["ServerThreadID"]
                    with PrologServer(launch_server=False, unix_domain_socket=unixDomainSocket, password="testpassword", prolog_path=self.prologPath) as newServer:
                        with newServer.create_thread() as prologThread:
                            result = prologThread.query("true")
                            self.assertEqual(result, True)
                    result = monitorThread.query("stop_language_server({})".format(serverThreadID))
                    self.assertEqual(result, True)
                    # Uncomment to see if it fixes it
                    # sleep(5)
                    afterShutdownThreads = self.thread_list(monitorThread)
                    self.assertEqual(afterShutdownThreads, initialThreads)
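
If the sleep does make the failure go away, a polling wait is one possible alternative to a fixed delay. The following is only a sketch under that assumption: `wait_until` is a hypothetical helper, not something in test_prologserver.py, and the thread list is simulated here so that the example runs on its own.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated stand-in for self.thread_list(monitorThread): the extra goal
# thread is still listed on the first two polls and gone on the third.
initial = ["main:running", "gc:running"]
state = {"calls": 0}


def fake_thread_list():
    state["calls"] += 1
    if state["calls"] < 3:
        return initial + ["goal:running"]
    return list(initial)


ok = wait_until(lambda: fake_thread_list() == initial, timeout=2.0, interval=0.01)
print(ok, state["calls"])
```

In the test itself, the predicate would presumably be something like `lambda: self.thread_list(monitorThread) == initialThreads`, followed by the existing assertEqual so that a timeout still reports the list difference.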

JanWielemaker commented 3 years ago

Running it verbosely in the Alpine Docker image indicates that, because ossp-uuid is missing, uuid(UUID, [format(integer)]) fails. I extended the pure Prolog emulation of the UUID library and now the mqi test passes.

brebs-gh commented 3 years ago

To clarify - I run Alpine "Edge" (basically the bleeding-edge), which becomes the next version of Alpine. Kinda like e.g. Debian Unstable.

To show the OS version, run: cat /etc/os-release
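
For checking the version from a script (for example inside a Docker-based test run), a minimal Python sketch of parsing that file is shown here. The function `parse_os_release` is a hypothetical helper, and the sample text is illustrative, not copied from a real system; on a real machine you would read /etc/os-release instead.

```python
def parse_os_release(text):
    """Parse os-release style KEY=VALUE lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        info[key] = value
    return info


# Illustrative content only (assumed, not taken from this thread).
sample = '''NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.14.0
PRETTY_NAME="Alpine Linux v3.14"
'''

info = parse_os_release(sample)
print(info["ID"], info["VERSION_ID"])
```

On Python 3.10 or later, the standard library's platform.freedesktop_os_release() returns the same kind of dictionary without a hand-written parser.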

ossp-uuid is only in Edge, in the testing repo (which is not enabled by default), so if you have removed its dependency then that's ideal :-)

brebs-gh commented 3 years ago

For the openjdk 15, please change it to 11, or basically the latest available. I don't think Alpine's package manager has the concept of the latest version. In Edge I've got openjdk 9 to 16 available, visible with e.g.:

apk search openjdk | grep src | sort

JanWielemaker commented 3 years ago

Pushed SWI-Prolog/swipl-devel@836862aebcef781f82f41f078851405cade7bd5c, which seems to work around the crashes. Before that, I pushed an update to library(uuid) to provide the UUID services mqi requires in pure Prolog, so it works if ossp-uuid is not around.

Now trying the whole build on our CI environment. It seems to build well, but there is a small issue with the reporting, so the result pages do not update properly. Will look at that tomorrow.

JanWielemaker commented 3 years ago

Enjoy 100% success at https://dev.swi-prolog.org/ci/home. You can find the dependencies and config at https://github.com/SWI-Prolog/docker-swipl-linux-ci/tree/master/alpine/3.14

JanWielemaker commented 3 years ago

This issue has been mentioned on SWI-Prolog. There might be relevant details there:

https://swi-prolog.discourse.group/t/how-to-build-a-docker-image-with-swipl-and-jpl/4255/10