mldbai / mldb

MLDB is the Machine Learning Database
http://mldb.ai
Apache License 2.0

M1 mac support #940

Closed jeremybarnes closed 3 years ago

jeremybarnes commented 3 years ago

M1 Mac support.

This integrates the osx, asan, tsan and arm64 work to allow MLDB to run (and its tests to pass) on an M1 Mac under Darwin. The sanitizers (address and thread in particular) have proven invaluable for identifying the source of behavioral differences on the M1, which is much more speculative and exploits the weaker arm64 memory model much more aggressively than the Neoverse chips in the Graviton instances (which are in turn more aggressive than Intel-based chips).
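
To illustrate the kind of bug a weaker memory model tends to expose, here is a minimal sketch. It is not taken from the MLDB code base, and `Mailbox`, `publish`, `consume` and `roundTrip` are hypothetical names. One thread writes plain data and then sets a flag; the other waits for the flag and then reads the data. The release/acquire pairing on the flag is what makes this well-defined; with relaxed ordering on the flag the same code is a data race, which may appear to work on x86 and fail intermittently on hardware that reorders more aggressively.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example, not MLDB code: hand one value from a writer
// thread to a reader thread.
struct Mailbox {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // flag that publishes the payload
};

// Writer: fill in the payload, then publish it with a release store so
// the payload write cannot be reordered after the flag write.
inline void publish(Mailbox & box, int value)
{
    box.payload = value;
    box.ready.store(true, std::memory_order_release);
}

// Reader: spin until the flag is seen with an acquire load, then read
// the payload, which is now guaranteed to be visible.
inline int consume(const Mailbox & box)
{
    while (!box.ready.load(std::memory_order_acquire))
        std::this_thread::yield();
    return box.payload;
}

// Run one reader and one writer thread; return what the reader saw.
inline int roundTrip(int value)
{
    Mailbox box;
    int seen = -1;
    std::thread reader([&] { seen = consume(box); });
    std::thread writer([&] { publish(box, value); });
    writer.join();
    reader.join();
    return seen;
}
```

Thread sanitizer is the tool that flags the broken variant of this pattern (a plain `bool` flag, or relaxed ordering), which fits the experience described above.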

In particular, this branch allows:

Changes include:

There is still some ongoing work tracking down numerical variability and ensuring that HTTP connections are correctly cleaned up; this branch can be considered an RFC for now.

FinchPowers commented 3 years ago

Would you like me to check out the branch, build it and run the tests?

jeremybarnes commented 3 years ago

Yes, please. Having osx as a development platform will make it easier to develop, but we need to be sure that it actually works and not just on my mac :)

FinchPowers commented 3 years ago

Same error as in the osx branch.

mldb/jml-build/clang.mk:13: building with clang++ version 12.0.0
mldb/jml-build/python.mk:7: PYTHON_VERSION_DETECTED=3.9
mldb/testing//testing.mk:40: mldb/testing/mldb_sample_plugin//mldb_sample_plugin.mk: No such file or directory
make: *** No rule to make target 'mldb/testing/mldb_sample_plugin//mldb_sample_plugin.mk'.  Stop.

jeremybarnes commented 3 years ago

Ah okay. I think that's a problem with the build instructions and submodules (you need to run git submodule update --init --recursive). I think I need to give the build instructions a once-over.

FinchPowers commented 3 years ago

HAHA trivial mistake. 🤦 I forgot the old tricks.

FinchPowers commented 3 years ago
mldb/jml-build/clang.mk:13: building with clang++ version 12.0.0
mldb/jml-build/python.mk:7: PYTHON_VERSION_DETECTED=3.9
           [C++  27kl  1,2M]            mldb/arch/wakeup_fd.cc
./mldb/arch/wakeup_fd.cc:114:35: error: use of undeclared identifier 'errno'
            throw MLDB::Exception(errno, "pipe() for wakeupfd");
                                  ^
./mldb/arch/wakeup_fd.cc:126:35: error: use of undeclared identifier 'errno'
            throw MLDB::Exception(errno, "fcntl() for wakeupfd read end");
                                  ^
./mldb/arch/wakeup_fd.cc:129:35: error: use of undeclared identifier 'errno'
            throw MLDB::Exception(errno, "fcntl() for wakeupfd write end");
                                  ^
./mldb/arch/wakeup_fd.cc:156:35: error: use of undeclared identifier 'errno'
            throw MLDB::Exception(errno, "wakeupfd signal write");
                                  ^
./mldb/arch/wakeup_fd.cc:171:26: error: use of undeclared identifier 'errno'
        if (res == -1 && errno == EWOULDBLOCK)
                         ^
./mldb/arch/wakeup_fd.cc:171:35: error: use of undeclared identifier 'EWOULDBLOCK'
        if (res == -1 && errno == EWOULDBLOCK)
                                  ^
./mldb/arch/wakeup_fd.cc:174:35: error: use of undeclared identifier 'errno'
            throw MLDB::Exception(errno, "wakeupfd signal write");
                                  ^
./mldb/arch/wakeup_fd.cc:187:21: error: use of undeclared identifier 'errno'
                if (errno == EINTR)
                    ^
./mldb/arch/wakeup_fd.cc:187:30: error: use of undeclared identifier 'EINTR'
                if (errno == EINTR)
                             ^
./mldb/arch/wakeup_fd.cc:189:21: error: use of undeclared identifier 'errno'
                if (errno == EAGAIN)
                    ^
./mldb/arch/wakeup_fd.cc:189:30: error: use of undeclared identifier 'EAGAIN'
                if (errno == EAGAIN)
                             ^
./mldb/arch/wakeup_fd.cc:191:39: error: use of undeclared identifier 'errno'
                throw MLDB::Exception(errno, "wakeupfd signal read");
                                      ^
./mldb/arch/wakeup_fd.cc:211:21: error: use of undeclared identifier 'errno'
                if (errno == EINTR)
                    ^
./mldb/arch/wakeup_fd.cc:211:30: error: use of undeclared identifier 'EINTR'
                if (errno == EINTR)
                             ^
./mldb/arch/wakeup_fd.cc:213:21: error: use of undeclared identifier 'errno'
                if (errno == EAGAIN)
                    ^
./mldb/arch/wakeup_fd.cc:213:30: error: use of undeclared identifier 'EAGAIN'
                if (errno == EAGAIN)
                             ^
./mldb/arch/wakeup_fd.cc:215:39: error: use of undeclared identifier 'errno'
                throw MLDB::Exception(errno, "wakeupfd signal read");
                                      ^
17 errors generated.
make: *** [mldb/arch//arch.mk:43: build/x86_64/obj/mldb/arch/wakeup_fd.cc.e48aabbc45379ccdce687d4fdad37fda.lo] Error 1
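
Errors of this shape usually mean that `errno` and the `E*` constants were only reaching the file transitively through other headers on Linux, and that the macOS headers do not pull them in. That is an assumption here, as the thread does not show the eventual fix. A minimal sketch of the same pipe-based wakeup pattern with the header included explicitly follows; `WakeupPipe` and its members are hypothetical names, not the MLDB implementation.

```cpp
#include <cassert>
#include <cerrno>     // errno, EINTR, EAGAIN, EWOULDBLOCK
#include <fcntl.h>    // fcntl, F_GETFL, F_SETFL, O_NONBLOCK
#include <unistd.h>   // pipe, read, write, close

// Hypothetical helper, not the MLDB implementation: a non-blocking pipe
// whose read end becomes readable once signal() has been called.
struct WakeupPipe {
    int fds[2] = {-1, -1};

    // Returns 0 on success, or the errno value of the failing call.
    int open()
    {
        if (::pipe(fds) == -1)
            return errno;
        for (int fd : fds) {
            int flags = ::fcntl(fd, F_GETFL, 0);
            if (flags == -1 || ::fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
                return errno;
        }
        return 0;
    }

    // Write one byte to wake the reader. A full pipe already means
    // "signalled", so EAGAIN/EWOULDBLOCK counts as success.
    bool signal()
    {
        assert(fds[1] != -1);   // open() must have succeeded first
        char one = 1;
        for (;;) {
            ssize_t res = ::write(fds[1], &one, 1);
            if (res == 1)
                return true;
            if (res == -1 && errno == EINTR)
                continue;       // interrupted by a signal: retry
            if (res == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
                return true;
            return false;
        }
    }

    // Consume one pending byte; returns false when nothing is pending.
    bool tryRead()
    {
        char c;
        for (;;) {
            ssize_t res = ::read(fds[0], &c, 1);
            if (res == 1)
                return true;
            if (res == -1 && errno == EINTR)
                continue;       // interrupted by a signal: retry
            return false;       // EAGAIN/EWOULDBLOCK or a real error
        }
    }

    ~WakeupPipe()
    {
        if (fds[0] != -1) ::close(fds[0]);
        if (fds[1] != -1) ::close(fds[1]);
    }
};
```

Including `<cerrno>` (or `<errno.h>`) directly in any file that uses `errno` keeps the code independent of which other headers happen to include it on a given platform.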

jeremybarnes commented 3 years ago

@FinchPowers what is the output of uname -a on your machine?

FinchPowers commented 3 years ago

Darwin MYULD13C32F9 19.6.0 Darwin Kernel Version 19.6.0: Tue Jun 22 19:49:55 PDT 2021; root:xnu-6153.141.35~1/RELEASE_X86_64 x86_64

FinchPowers commented 3 years ago

./mldb/ext/libarchive/libarchive/archive_read_support_format_rar5.c:51:10: fatal error: 'blake2.h' file not found

jeremybarnes commented 3 years ago

I have a machine with an older version of OSX lying around. I'll try it there. My main concern is around being able to test it with CI so it doesn't become stale.

jeremybarnes commented 3 years ago

@FinchPowers Updated the build instructions (and tested them); the brew install was missing libb2. Can you try again, referring to the new instructions?

FinchPowers commented 3 years ago
./mldb/ext/s2geometry/src/s2/s2region_coverer.cc:364:23: error: loop variable 'id' of type 'const S2CellId' creates a copy from
      type 'const S2CellId' [-Werror,-Wrange-loop-analysis]
  for (const S2CellId id : covering) {
                      ^
./mldb/ext/s2geometry/src/s2/s2region_coverer.cc:364:8: note: use reference type 'const S2CellId &' to prevent copying
  for (const S2CellId id : covering) {
       ^~~~~~~~~~~~~~~~~~~
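
The change the compiler note suggests can be sketched as follows. `CellId` and `sumIds` are hypothetical stand-ins for illustration; this is not the s2geometry code, and the thread does not say how the warning was eventually dealt with (patching the loop or relaxing `-Werror` for that dependency are both plausible).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for S2CellId. The user-provided copy constructor
// makes copies non-trivial, which is the kind of type the range-loop
// warning is concerned with.
struct CellId {
    std::uint64_t id = 0;
    CellId() = default;
    explicit CellId(std::uint64_t i) : id(i) {}
    CellId(const CellId & other) : id(other.id) {}
    CellId & operator=(const CellId &) = default;
};

inline std::uint64_t sumIds(const std::vector<CellId> & covering)
{
    std::uint64_t total = 0;
    // Binding by const reference avoids the per-iteration copy that
    // `for (const CellId id : covering)` would make.
    for (const CellId & id : covering)
        total += id.id;
    return total;
}
```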

jeremybarnes commented 3 years ago

What is your clang++ --version?

FinchPowers commented 3 years ago

building with clang++ version 12.0.0

jeremybarnes commented 3 years ago

Try again? Maybe with -k so we're not playing whack-a-mole here :)

FinchPowers commented 3 years ago

Build successful! Test-wise, a few core dumps. A few failures are also related to network security blocking connections to localhost.

mldb/jml-build/python.mk:7: PYTHON_VERSION_DETECTED=3.9
      [TESTCASE]                        info_test
      [TESTCASE]                        http_client_test
                 info_test FAILED
Boost.Test WARNING: token "build/x86_64/tests/info_test" does not correspond to the Boost.Test argument
                    and should be placed after all Boost.Test arguments and the -- separator.
                    For example: info_test --random -- build/x86_64/tests/info_test
Running 4 test cases...
num_cpus = 1
unknown location:0: fatal error: in "test_fqdn_hostname": MLDB::Exception: nodename nor servname provided, or not known: getaddrinfo: %s: Undefined error: 0
./mldb/arch/testing/info_test.cc:71: last checkpoint: "test_fqdn_hostname" test entry

*** 1 failure is detected in the test module "Master Test Suite"
                       info_test FAILED
make: *** [mldb/arch/testing//arch_testing.mk:11: build/x86_64/tests/info_test.passed] Error 1
      [MLDBTEST]                        MLDBFB-530_continuous_window_fails_on_restart.py
      [TESTCASE]                        python_converters_test
      [TESTCASE]                        python_cell_converter_test
      [MLDBTEST]                        MLDB-1359_procedure_latest_run.py
      [MLDBTEST]                        MLDB-1491-get-all-not-implemented-for-datasets.js
      [MLDBTEST]                        MLDB-1869_json_payload_test.py
                 MLDBFB-530_continuous_window_fails_on_restart.py FAILED
. virtualenv/bin/activate; PYTHONPATH=build/x86_64/bin build/x86_64/bin/mldb_runner -h localhost -p '11700-12700'  --run-script mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py --mute-final-output --config-path mldb/container_files/mldb.conf --watchdog-timeout=120
reading configuration from file: 'mldb/container_files/mldb.conf'

MLDB ready

creating SYMLINK /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/Aw1ZQb/main.py -> /Users/francois-mi.lheureux/workspace/mldb/mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py
loading from: /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/Aw1ZQb/main.py
2021-08-24 15:53:59.251 loader
{
        "context" : [ "Running python script" ],
        "lineNumber" : 6,
        "message" : "No module named 'dateutil'",
        "scriptUri" : "file://mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py",
        "stack" :
        [

                {
                        "functionName" : "<module>",
                        "lineNumber" : 6,
                        "scriptUri" : "file://mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py",
                        "where" : "File \"file://mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py\", line 6, in <module>"
                }
        ],
        "type" : "ModuleNotFoundError",
        "where" : "File \"file://mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py\", line 6, in <module>"
}

exception in accept: Operation canceled
exception in accept: Operation canceled
ServicePeer [2021-08-24T15:53:59.252-4:00] warning WARNING: peer mldb lost its own entry in discovery.  Letting it come back
peer mldb connection to mldb changed state to 3
peer mldb connection to mldb changed state to 3
                       MLDBFB-530_continuous_window_fails_on_restart.py FAILED
make: *** [mldb/testing//testing.mk:94: build/x86_64/tests/MLDBFB-530_continuous_window_fails_on_restart.py.passed] Error 1
                 MLDB-1359_procedure_latest_run.py FAILED
. virtualenv/bin/activate; PYTHONPATH=build/x86_64/bin build/x86_64/bin/mldb_runner -h localhost -p '11700-12700'  --run-script mldb/testing/MLDB-1359_procedure_latest_run.py --mute-final-output --config-path mldb/container_files/mldb.conf --watchdog-timeout=120
reading configuration from file: 'mldb/container_files/mldb.conf'

MLDB ready

creating SYMLINK /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/teyvCi/main.py -> /Users/francois-mi.lheureux/workspace/mldb/mldb/testing/MLDB-1359_procedure_latest_run.py
loading from: /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/teyvCi/main.py
2021-08-24 15:53:59.380 loader
{
        "context" : [ "Running python script" ],
        "lineNumber" : 7,
        "message" : "No module named 'dateutil'",
        "scriptUri" : "file://mldb/testing/MLDB-1359_procedure_latest_run.py",
        "stack" :
        [

                {
                        "functionName" : "<module>",
                        "lineNumber" : 7,
                        "scriptUri" : "file://mldb/testing/MLDB-1359_procedure_latest_run.py",
                        "where" : "File \"file://mldb/testing/MLDB-1359_procedure_latest_run.py\", line 7, in <module>"
                }
        ],
        "type" : "ModuleNotFoundError",
        "where" : "File \"file://mldb/testing/MLDB-1359_procedure_latest_run.py\", line 7, in <module>"
}

exception in accept: Operation canceled
ServicePeer [2021-08-24T15:53:59.381-4:00] warning WARNING: peer mldb lost its own entry in discovery.  Letting it come back
peer mldb connection to mldb changed state to 3
peer mldb connection to mldb changed state to 3
                       MLDB-1359_procedure_latest_run.py FAILED
make: *** [mldb/testing//testing.mk:387: build/x86_64/tests/MLDB-1359_procedure_latest_run.py.passed] Error 1
                 MLDB-1869_json_payload_test.py FAILED
. virtualenv/bin/activate; PYTHONPATH=build/x86_64/bin build/x86_64/bin/mldb_runner -h localhost -p '11700-12700'  --run-script mldb/testing/MLDB-1869_json_payload_test.py --mute-final-output --config-path mldb/container_files/mldb.conf --watchdog-timeout=120
reading configuration from file: 'mldb/container_files/mldb.conf'

MLDB ready

creating SYMLINK /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/55lRGd/main.py -> /Users/francois-mi.lheureux/workspace/mldb/mldb/testing/MLDB-1869_json_payload_test.py
loading from: /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/55lRGd/main.py
2021-08-24 15:53:59.646 loader
{
        "context" : [ "Running python script" ],
        "lineNumber" : 6,
        "message" : "No module named 'requests'",
        "scriptUri" : "file://mldb/testing/MLDB-1869_json_payload_test.py",
        "stack" :
        [

                {
                        "functionName" : "<module>",
                        "lineNumber" : 6,
                        "scriptUri" : "file://mldb/testing/MLDB-1869_json_payload_test.py",
                        "where" : "File \"file://mldb/testing/MLDB-1869_json_payload_test.py\", line 6, in <module>"
                }
        ],
        "type" : "ModuleNotFoundError",
        "where" : "File \"file://mldb/testing/MLDB-1869_json_payload_test.py\", line 6, in <module>"
}

exception in accept: Operation canceled
exception in accept: Operation canceled
ServicePeer [2021-08-24T15:53:59.646-4:00] warning WARNING: peer mldb lost its own entry in discovery.  Letting it come back
peer mldb connection to mldb changed state to 3
peer mldb connection to mldb changed state to 3
                       MLDB-1869_json_payload_test.py FAILED
make: *** [mldb/testing//testing.mk:491: build/x86_64/tests/MLDB-1869_json_payload_test.py.passed] Error 1
/bin/bash: line 1: 28374 Segmentation fault: 11  PYTHONPATH=build/x86_64/bin:build/x86_64/bin:build/x86_64/bin virtualenv/bin/python mldb/testing/python_cell_converter_test.py > build/x86_64/tests/python_cell_converter_test.running 2>&1
                 python_cell_converter_test FAILED
make: *** [mldb/testing//testing.mk:356: build/x86_64/tests/python_cell_converter_test.passed] Error 1
/bin/bash: line 1: 28377 Segmentation fault: 11  PYTHONPATH=build/x86_64/bin:build/x86_64/bin:build/x86_64/bin virtualenv/bin/python mldb/testing/python_converters_test.py > build/x86_64/tests/python_converters_test.running 2>&1
                 python_converters_test FAILED
make: *** [mldb/testing//testing.mk:352: build/x86_64/tests/python_converters_test.passed] Error 1
                 MLDB-1491-get-all-not-implemented-for-datasets.js FAILED
 build/x86_64/bin/mldb_runner -h localhost -p '11700-12700'  --run-script mldb/testing/MLDB-1491-get-all-not-implemented-for-datasets.js --mute-final-output --config-path mldb/container_files/mldb.conf --watchdog-timeout=120
reading configuration from file: 'mldb/container_files/mldb.conf'

MLDB ready

context.resources = ["/v1","/types","/plugins","/javascript","javascript","/routes/run","run"]
creating SYMLINK /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/Xj4wHN/main.js -> /Users/francois-mi.lheureux/workspace/mldb/mldb/testing/MLDB-1491-get-all-not-implemented-for-datasets.js
loading from: /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/Xj4wHN/main.js
GitImporter [2021-08-24T15:53:59.571-4:00] info processing 3476 commits
GitImporter [2021-08-24T15:53:59.702-4:00] info imported commit 100 of 3476
GitImporter [2021-08-24T15:53:59.799-4:00] info imported commit 200 of 3476
GitImporter [2021-08-24T15:54:00.051-4:00] info imported commit 300 of 3476
GitImporter [2021-08-24T15:54:00.215-4:00] info imported commit 400 of 3476
GitImporter [2021-08-24T15:54:00.329-4:00] info imported commit 500 of 3476
GitImporter [2021-08-24T15:54:00.410-4:00] info imported commit 600 of 3476
GitImporter [2021-08-24T15:54:00.475-4:00] info imported commit 700 of 3476
GitImporter [2021-08-24T15:54:00.524-4:00] info imported commit 800 of 3476
GitImporter [2021-08-24T15:54:00.583-4:00] info imported commit 900 of 3476
GitImporter [2021-08-24T15:54:00.661-4:00] info imported commit 1000 of 3476
GitImporter [2021-08-24T15:54:00.739-4:00] info imported commit 1100 of 3476
GitImporter [2021-08-24T15:54:00.821-4:00] info imported commit 1200 of 3476
GitImporter [2021-08-24T15:54:00.920-4:00] info imported commit 1300 of 3476
GitImporter [2021-08-24T15:54:00.992-4:00] info imported commit 1400 of 3476
GitImporter [2021-08-24T15:54:01.086-4:00] info imported commit 1500 of 3476
GitImporter [2021-08-24T15:54:01.164-4:00] info imported commit 1600 of 3476
GitImporter [2021-08-24T15:54:01.251-4:00] info imported commit 1700 of 3476
GitImporter [2021-08-24T15:54:01.304-4:00] info imported commit 1800 of 3476
GitImporter [2021-08-24T15:54:01.351-4:00] info imported commit 1900 of 3476
GitImporter [2021-08-24T15:54:01.416-4:00] info imported commit 2000 of 3476
GitImporter [2021-08-24T15:54:01.479-4:00] info imported commit 2100 of 3476
GitImporter [2021-08-24T15:54:01.535-4:00] info imported commit 2200 of 3476
GitImporter [2021-08-24T15:54:01.586-4:00] info imported commit 2300 of 3476
GitImporter [2021-08-24T15:54:01.637-4:00] info imported commit 2400 of 3476
GitImporter [2021-08-24T15:54:01.692-4:00] info imported commit 2500 of 3476
GitImporter [2021-08-24T15:54:01.756-4:00] info imported commit 2600 of 3476
GitImporter [2021-08-24T15:54:01.811-4:00] info imported commit 2700 of 3476
GitImporter [2021-08-24T15:54:01.862-4:00] info imported commit 2800 of 3476
GitImporter [2021-08-24T15:54:01.909-4:00] info imported commit 2900 of 3476
GitImporter [2021-08-24T15:54:01.956-4:00] info imported commit 3000 of 3476
GitImporter [2021-08-24T15:54:01.998-4:00] info imported commit 3100 of 3476
GitImporter [2021-08-24T15:54:02.048-4:00] info imported commit 3200 of 3476
GitImporter [2021-08-24T15:54:02.090-4:00] info imported commit 3300 of 3476
GitImporter [2021-08-24T15:54:02.135-4:00] info imported commit 3400 of 3476
                       MLDB-1491-get-all-not-implemented-for-datasets.js FAILED
make: *** [mldb/testing//testing.mk:396: build/x86_64/tests/MLDB-1491-get-all-not-implemented-for-datasets.js.passed] Error 1
                 [   4,0s  0,0G  0,2c ] http_client_test passed
make: Target 'default' not remade because of errors.

jeremybarnes commented 3 years ago

@FinchPowers For the module import errors, there are instructions to pip install a few things (requests, python-dateutil, bottle) which caused issues when installed in the virtualenv but not when installed system-wide.

I've removed the function which fails the test, which is not actually used anywhere :(

As for the others, they are a little bit of a worry. I recall something similar happening with them at some stage, but I don't remember what it was. Can you re-run those tests (make -j6 -k failed_tests) and see if there is still a problem?
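
On the `info_test` message in the log above: the `Undefined error: 0` suffix suggests that `errno` was formatted after a `getaddrinfo` failure. This is a guess from the output, not something confirmed in the thread. `getaddrinfo` reports failures through its return value, and `errno` is only meaningful when that value is `EAI_SYSTEM`. A minimal sketch of reporting such an error through `gai_strerror` follows; `resolveError` is a hypothetical helper, not an MLDB function.

```cpp
#include <netdb.h>        // getaddrinfo, freeaddrinfo, gai_strerror
#include <sys/socket.h>   // AF_UNSPEC, SOCK_STREAM
#include <sys/types.h>
#include <string>

// Hypothetical helper: returns an empty string if `host` resolves, or a
// human-readable message built from the EAI_* return code otherwise.
inline std::string resolveError(const std::string & host, int flags = 0)
{
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;

    addrinfo * result = nullptr;
    int rc = ::getaddrinfo(host.c_str(), nullptr, &hints, &result);
    if (rc != 0) {
        // rc is an EAI_* code; errno only matters when rc == EAI_SYSTEM.
        return "getaddrinfo(" + host + "): " + ::gai_strerror(rc);
    }
    ::freeaddrinfo(result);
    return std::string();
}
```

Passing `AI_NUMERICHOST` as `flags` restricts the lookup to literal addresses, which is a convenient way to exercise both the success and the failure path without depending on DNS.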

FinchPowers commented 3 years ago

Python: I do not have a virtualenv. I do have pyenv though to manage many versions. I installed the dependencies you mentioned.

mldb/jml-build/clang.mk:13: building with clang++ version 12.0.0
mldb/jml-build/python.mk:7: PYTHON_VERSION_DETECTED=3.9
      [MLDBTEST]                        MLDB-1359_procedure_latest_run.py
      [MLDBTEST]                        MLDB-1869_json_payload_test.py
      [MLDBTEST]                        MLDBFB-530_continuous_window_fails_on_restart.py
      [TESTCASE]                        python_cell_converter_test
      [TESTCASE]                        python_converters_test
                 MLDB-1359_procedure_latest_run.py FAILED
                 MLDB-1869_json_payload_test.py FAILED
. virtualenv/bin/activate; PYTHONPATH=build/x86_64/bin build/x86_64/bin/mldb_runner -h localhost -p '11700-12700'  --run-script mldb/testing/MLDB-1359_procedure_latest_run.py --mute-final-output --config-path mldb/container_files/mldb.conf --watchdog-timeout=120
reading configuration from file: 'mldb/container_files/mldb.conf'

MLDB ready

creating SYMLINK /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/aZDJZ3/main.py -> /Users/francois-mi.lheureux/workspace/mldb/mldb/testing/MLDB-1359_procedure_latest_run.py
loading from: /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/aZDJZ3/main.py
2021-08-24 16:33:12.373 loader
{
        "context" : [ "Running python script" ],
        "lineNumber" : 7,
        "message" : "No module named 'dateutil'",
        "scriptUri" : "file://mldb/testing/MLDB-1359_procedure_latest_run.py",
        "stack" :
        [

                {
                        "functionName" : "<module>",
                        "lineNumber" : 7,
                        "scriptUri" : "file://mldb/testing/MLDB-1359_procedure_latest_run.py",
                        "where" : "File \"file://mldb/testing/MLDB-1359_procedure_latest_run.py\", line 7, in <module>"
                }
        ],
        "type" : "ModuleNotFoundError",
        "where" : "File \"file://mldb/testing/MLDB-1359_procedure_latest_run.py\", line 7, in <module>"
}

exception in accept: Operation canceled
ServicePeer [2021-08-24T16:33:12.373-4:00] warning WARNING: peer mldb lost its own entry in discovery.  Letting it come back
peer mldb connection to mldb changed state to 3
peer mldb connection to mldb changed state to 3
                       MLDB-1359_procedure_latest_run.py FAILED
. virtualenv/bin/activate; PYTHONPATH=build/x86_64/bin build/x86_64/bin/mldb_runner -h localhost -p '11700-12700'  --run-script mldb/testing/MLDB-1869_json_payload_test.py --mute-final-output --config-path mldb/container_files/mldb.conf --watchdog-timeout=120
reading configuration from file: 'mldb/container_files/mldb.conf'

MLDB ready

creating SYMLINK /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/cjkWuc/main.py -> /Users/francois-mi.lheureux/workspace/mldb/mldb/testing/MLDB-1869_json_payload_test.py
loading from: /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/cjkWuc/main.py
2021-08-24 16:33:12.373 loader
{
        "context" : [ "Running python script" ],
        "lineNumber" : 6,
        "message" : "No module named 'requests'",
        "scriptUri" : "file://mldb/testing/MLDB-1869_json_payload_test.py",
        "stack" :
        [

                {
                        "functionName" : "<module>",
                        "lineNumber" : 6,
                        "scriptUri" : "file://mldb/testing/MLDB-1869_json_payload_test.py",
                        "where" : "File \"file://mldb/testing/MLDB-1869_json_payload_test.py\", line 6, in <module>"
                }
        ],
        "type" : "ModuleNotFoundError",
        "where" : "File \"file://mldb/testing/MLDB-1869_json_payload_test.py\", line 6, in <module>"
}

exception in accept: Operation canceled
ServicePeer [2021-08-24T16:33:12.373-4:00] warning WARNING: peer mldb lost its own entry in discovery.  Letting it come back
peer mldb connection to mldb changed state to 3
peer mldb connection to mldb changed state to 3
                       MLDB-1869_json_payload_test.py FAILED
make: *** [mldb/testing//testing.mk:387: build/x86_64/tests/MLDB-1359_procedure_latest_run.py.passed] Error 1
make: *** [mldb/testing//testing.mk:491: build/x86_64/tests/MLDB-1869_json_payload_test.py.passed] Error 1
                 MLDBFB-530_continuous_window_fails_on_restart.py FAILED
. virtualenv/bin/activate; PYTHONPATH=build/x86_64/bin build/x86_64/bin/mldb_runner -h localhost -p '11700-12700'  --run-script mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py --mute-final-output --config-path mldb/container_files/mldb.conf --watchdog-timeout=120
reading configuration from file: 'mldb/container_files/mldb.conf'

MLDB ready

creating SYMLINK /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/YgbNIr/main.py -> /Users/francois-mi.lheureux/workspace/mldb/mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py
loading from: /var/folders/s_/2q8683j92032cp63gh76jw8h0000gn/T/YgbNIr/main.py
2021-08-24 16:33:12.415 loader
{
        "context" : [ "Running python script" ],
        "lineNumber" : 6,
        "message" : "No module named 'dateutil'",
        "scriptUri" : "file://mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py",
        "stack" :
        [

                {
                        "functionName" : "<module>",
                        "lineNumber" : 6,
                        "scriptUri" : "file://mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py",
                        "where" : "File \"file://mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py\", line 6, in <module>"
                }
        ],
        "type" : "ModuleNotFoundError",
        "where" : "File \"file://mldb/testing/MLDBFB-530_continuous_window_fails_on_restart.py\", line 6, in <module>"
}

exception in accept: Operation canceled
exception in accept: Operation canceled
ServicePeer [2021-08-24T16:33:12.416-4:00] warning WARNING: peer mldb lost its own entry in discovery.  Letting it come back
peer mldb connection to mldb changed state to 3
peer mldb connection to mldb changed state to 3
                       MLDBFB-530_continuous_window_fails_on_restart.py FAILED
make: *** [mldb/testing//testing.mk:94: build/x86_64/tests/MLDBFB-530_continuous_window_fails_on_restart.py.passed] Error 1
/bin/bash: line 1: 39792 Segmentation fault: 11  PYTHONPATH=build/x86_64/bin:build/x86_64/bin:build/x86_64/bin virtualenv/bin/python mldb/testing/python_cell_converter_test.py > build/x86_64/tests/python_cell_converter_test.running 2>&1
                 python_cell_converter_test FAILED
make: *** [mldb/testing//testing.mk:356: build/x86_64/tests/python_cell_converter_test.passed] Error 1
/bin/bash: line 1: 39795 Segmentation fault: 11  PYTHONPATH=build/x86_64/bin:build/x86_64/bin:build/x86_64/bin virtualenv/bin/python mldb/testing/python_converters_test.py > build/x86_64/tests/python_converters_test.running 2>&1
                 python_converters_test FAILED
make: *** [mldb/testing//testing.mk:352: build/x86_64/tests/python_converters_test.passed] Error 1
make: Target 'failed_tests' not remade because of errors.

jeremybarnes commented 3 years ago

MLDB manages its own virtualenv, which it switches in and out of as some tests must run in it (and some must not). I think it would be quite a lot of work to move from virtualenv to pyenv or something else, as it's tightly wound into the build system. From what I can see, you didn't pip install the packages mentioned system-wide; maybe you can install them in MLDB's virtualenv and it would work then?

jeremybarnes commented 3 years ago

It's not a great developer experience if a modern way of managing one's environment is incompatible with building MLDB though...

FinchPowers commented 3 years ago

As far as pyenv is concerned, I did install them system-wide. I also take care to install Python versions in pyenv with dev dependencies so other things can link against those Python versions.

I already tried to manually install the deps in the mldb virtualenv, but they are already there.

francois-mi.lheureux@MYULD13C32F9:~/workspace/mldb$ pyenv version
3.9.6 (set by /Users/francois-mi.lheureux/workspace/mldb/.python-version)
francois-mi.lheureux@MYULD13C32F9:~/workspace/mldb$ pip install requests python-dateutil bottle
Requirement already satisfied: requests in /Users/francois-mi.lheureux/.pyenv/versions/3.9.6/lib/python3.9/site-packages (2.26.0)
Requirement already satisfied: python-dateutil in /Users/francois-mi.lheureux/.pyenv/versions/3.9.6/lib/python3.9/site-packages (2.8.2)
Requirement already satisfied: bottle in /Users/francois-mi.lheureux/.pyenv/versions/3.9.6/lib/python3.9/site-packages (0.12.19)
Requirement already satisfied: charset-normalizer~=2.0.0 in /Users/francois-mi.lheureux/.pyenv/versions/3.9.6/lib/python3.9/site-packages (from requests) (2.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /Users/francois-mi.lheureux/.pyenv/versions/3.9.6/lib/python3.9/site-packages (from requests) (2021.5.30)
Requirement already satisfied: idna<4,>=2.5 in /Users/francois-mi.lheureux/.pyenv/versions/3.9.6/lib/python3.9/site-packages (from requests) (3.2)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /Users/francois-mi.lheureux/.pyenv/versions/3.9.6/lib/python3.9/site-packages (from requests) (1.26.6)
Requirement already satisfied: six>=1.5 in /Users/francois-mi.lheureux/.pyenv/versions/3.9.6/lib/python3.9/site-packages (from python-dateutil) (1.16.0)

francois-mi.lheureux@MYULD13C32F9:~/workspace/mldb$ . virtualenv/bin/activate
(virtualenv) francois-mi.lheureux@MYULD13C32F9:~/workspace/mldb$ pip install requests python-dateutil bottle
Requirement already satisfied: requests in ./virtualenv/lib/python3.9/site-packages (2.25.1)
Requirement already satisfied: python-dateutil in ./virtualenv/lib/python3.9/site-packages (2.8.1)
Requirement already satisfied: bottle in ./virtualenv/lib/python3.9/site-packages (0.12.19)
Requirement already satisfied: chardet<5,>=3.0.2 in ./virtualenv/lib/python3.9/site-packages (from requests) (3.0.4)
Requirement already satisfied: idna<3,>=2.5 in ./virtualenv/lib/python3.9/site-packages (from requests) (2.8)
Requirement already satisfied: certifi>=2017.4.17 in ./virtualenv/lib/python3.9/site-packages (from requests) (2019.11.28)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in ./virtualenv/lib/python3.9/site-packages (from requests) (1.25.8)
Requirement already satisfied: six>=1.5 in ./virtualenv/lib/python3.9/site-packages (from python-dateutil) (1.14.0)
WARNING: You are using pip version 21.2.3; however, version 21.2.4 is available.
You should consider upgrading via the '/Users/francois-mi.lheureux/workspace/mldb/virtualenv/bin/python -m pip install --upgrade pip' command.

FinchPowers commented 3 years ago

I have Python 3.9 without pyenv too, so I tried with that "system" version instead. Here is the new output:

francois-mi.lheureux@MYULD13C32F9:~/workspace/mldb$ make -j6 -k failed_tests
mldb/jml-build/clang.mk:13: building with clang++ version 12.0.0
mldb/jml-build/python.mk:7: PYTHON_VERSION_DETECTED=3.9
      [MLDBTEST]                        MLDB-1359_procedure_latest_run.py
      [MLDBTEST]                        MLDB-1869_json_payload_test.py
      [MLDBTEST]                        MLDBFB-530_continuous_window_fails_on_restart.py
      [TESTCASE]                        python_cell_converter_test
      [TESTCASE]                        python_converters_test
                 [   0,0s  0,0G  0,5c ] MLDB-1359_procedure_latest_run.py passed
                 [   0,0s  0,0G  0,6c ] MLDBFB-530_continuous_window_fails_on_restart.py passed
                 [   0,0s  0,0G  0,7c ] MLDB-1869_json_payload_test.py passed
/bin/bash: line 1: 44603 Segmentation fault: 11  PYTHONPATH=build/x86_64/bin:build/x86_64/bin:build/x86_64/bin virtualenv/bin/python mldb/testing/python_cell_converter_test.py > build/x86_64/tests/python_cell_converter_test.running 2>&1
                 python_cell_converter_test FAILED
make: *** [mldb/testing//testing.mk:356: build/x86_64/tests/python_cell_converter_test.passed] Error 1
/bin/bash: line 1: 44606 Segmentation fault: 11  PYTHONPATH=build/x86_64/bin:build/x86_64/bin:build/x86_64/bin virtualenv/bin/python mldb/testing/python_converters_test.py > build/x86_64/tests/python_converters_test.running 2>&1
                 python_converters_test FAILED
make: *** [mldb/testing//testing.mk:352: build/x86_64/tests/python_converters_test.passed] Error 1
make: Target 'failed_tests' not remade because of errors.

jeremybarnes commented 3 years ago

Okay, so down to two mysterious segfaults :). Can you run

PYTHONPATH=build/x86_64/bin:build/x86_64/bin:build/x86_64/bin lldb virtualenv/bin/python mldb/testing/python_cell_converter_test.py

and then type run to run the program? Once it crashes, you can type bt to get the backtrace.

FinchPowers commented 3 years ago
(lldb) target create "virtualenv/bin/python"
Current executable set to '/Users/francois-mi.lheureux/workspace/mldb/virtualenv/bin/python' (x86_64).
(lldb) settings set -- target.run-args  "mldb/testing/python_cell_converter_test.py"
(lldb) run
Process 47270 launched: '/Users/francois-mi.lheureux/workspace/mldb/virtualenv/bin/python' (x86_64)
Process 47270 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
    frame #0: 0x00000001031dba1d Python`_PyObject_GC_Alloc + 39
Python`_PyObject_GC_Alloc:
->  0x1031dba1d <+39>: movq   0x10(%r14), %r15
    0x1031dba21 <+43>: addq   $0x10, %rdx
    0x1031dba25 <+47>: testl  %edi, %edi
    0x1031dba27 <+49>: jne    0x1031dbb19               ; <+291>
Target 0: (python) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
  * frame #0: 0x00000001031dba1d Python`_PyObject_GC_Alloc + 39
    frame #1: 0x00000001030d3239 Python`PyType_Ready + 3546
    frame #2: 0x00000001030d4944 Python`PyType_Ready + 9445
    frame #3: 0x0000000101d15d1e libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator()(MLDB::EnterThreadToken const&) [inlined] PyInit_streamcapture at capture_stream.cc:121:9
    frame #4: 0x0000000101d15d04 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator()(MLDB::EnterThreadToken const&) [inlined] auto MLDB::(anonymous namespace)::$_0::operator(this=<unavailable>, thr=<unavailable>)<MLDB::EnterThreadToken const>(MLDB::EnterThreadToken const&) const at capture_stream.cc:137
    frame #5: 0x0000000101d15d04 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator()(MLDB::EnterThreadToken const&) [inlined] decltype(__f=<unavailable>, __args=<unavailable>)::$_0&>(fp)(std::__1::forward<MLDB::EnterThreadToken const&>(fp0))) std::__1::__invoke<MLDB::(anonymous namespace)::$_0&, MLDB::EnterThreadToken const&>(MLDB::(anonymous namespace)::$_0&, MLDB::EnterThreadToken const&) at type_traits:3545
    frame #6: 0x0000000101d15d04 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator()(MLDB::EnterThreadToken const&) [inlined] void std::__1::__invoke_void_return_wrapper<void>::__call<MLDB::(anonymous namespace)::$_0&, MLDB::EnterThreadToken const&>(__args=<unavailable>, __args=<unavailable>)::$_0&, MLDB::EnterThreadToken const&) at __functional_base:348
    frame #7: 0x0000000101d15d04 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator()(MLDB::EnterThreadToken const&) [inlined] std::__1::__function::__alloc_func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator(this=<unavailable>, __arg=<unavailable>)(MLDB::EnterThreadToken const&) at functional:1546
    frame #8: 0x0000000101d15d04 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator(this=<unavailable>, __arg=<unavailable>)(MLDB::EnterThreadToken const&) at functional:1720
    frame #9: 0x0000000101d13151 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`MLDB::(anonymous namespace)::runPythonInitializers(MLDB::EnterThreadToken const&) [inlined] std::__1::__function::__value_func<void (MLDB::EnterThreadToken const&)>::operator(__args=0x0000000000000000)(MLDB::EnterThreadToken const&) const at functional:1873:16
    frame #10: 0x0000000101d13143 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`MLDB::(anonymous namespace)::runPythonInitializers(MLDB::EnterThreadToken const&) [inlined] std::__1::function<void (MLDB::EnterThreadToken const&)>::operator(this=<unavailable>, __arg=0x0000000000000000)(MLDB::EnterThreadToken const&) const at functional:2548
    frame #11: 0x0000000101d13136 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`MLDB::(anonymous namespace)::runPythonInitializers(thread=0x0000000000000000) at python_interpreter.cc:359
    frame #12: 0x0000000101d1198f libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`MLDB::PythonInterpreter::mainInterpreter() at python_interpreter.cc:453:9
    frame #13: 0x0000000101d12eb6 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`MLDB::PythonInterpreter::initializeFromModuleInit() at python_interpreter.cc:419:5
    frame #14: 0x0000000101c81ae5 py_cell_conv_test_module.so`init_module_py_cell_conv_test_module() at python_cell_converter_test_support.cc:57:5
    frame #15: 0x0000000101cbf7f5 libboost_python39.dylib`boost::python::handle_exception_impl(boost::function0<void>) + 69
    frame #16: 0x0000000101cc04c9 libboost_python39.dylib`bool boost::python::handle_exception<void (*)()>(void (*)()) + 57
    frame #17: 0x0000000101cc03b8 libboost_python39.dylib`boost::python::detail::init_module(PyModuleDef&, void (*)()) + 72
    frame #18: 0x00000001002723c5 libpython3.9.dylib`_PyImport_LoadDynamicModuleWithSpec + 613
    frame #19: 0x0000000100271c74 libpython3.9.dylib`_imp_create_dynamic + 308
    frame #20: 0x000000010019e597 libpython3.9.dylib`cfunction_vectorcall_FASTCALL + 215
    frame #21: 0x0000000100243e80 libpython3.9.dylib`_PyEval_EvalFrameDefault + 28832
    frame #22: 0x0000000100247824 libpython3.9.dylib`_PyEval_EvalCode + 2852
    frame #23: 0x000000010015e1f0 libpython3.9.dylib`_PyFunction_Vectorcall + 256
    frame #24: 0x000000010024695b libpython3.9.dylib`call_function + 411
    frame #25: 0x00000001002439e6 libpython3.9.dylib`_PyEval_EvalFrameDefault + 27654
    frame #26: 0x000000010015e2e5 libpython3.9.dylib`function_code_fastcall + 229
    frame #27: 0x000000010024695b libpython3.9.dylib`call_function + 411
    frame #28: 0x00000001002439c9 libpython3.9.dylib`_PyEval_EvalFrameDefault + 27625
    frame #29: 0x000000010015e2e5 libpython3.9.dylib`function_code_fastcall + 229
    frame #30: 0x000000010024695b libpython3.9.dylib`call_function + 411
    frame #31: 0x0000000100243a90 libpython3.9.dylib`_PyEval_EvalFrameDefault + 27824
    frame #32: 0x000000010015e2e5 libpython3.9.dylib`function_code_fastcall + 229
    frame #33: 0x000000010024695b libpython3.9.dylib`call_function + 411
    frame #34: 0x0000000100243a90 libpython3.9.dylib`_PyEval_EvalFrameDefault + 27824
    frame #35: 0x000000010015e2e5 libpython3.9.dylib`function_code_fastcall + 229
    frame #36: 0x000000010024695b libpython3.9.dylib`call_function + 411
    frame #37: 0x0000000100243a90 libpython3.9.dylib`_PyEval_EvalFrameDefault + 27824
    frame #38: 0x000000010015e2e5 libpython3.9.dylib`function_code_fastcall + 229
    frame #39: 0x000000010015f709 libpython3.9.dylib`object_vacall + 489
    frame #40: 0x000000010015f976 libpython3.9.dylib`_PyObject_CallMethodIdObjArgs + 246
    frame #41: 0x0000000100270b06 libpython3.9.dylib`PyImport_ImportModuleLevelObject + 1350
    frame #42: 0x00000001002422f4 libpython3.9.dylib`_PyEval_EvalFrameDefault + 21780
    frame #43: 0x0000000100247824 libpython3.9.dylib`_PyEval_EvalCode + 2852
    frame #44: 0x000000010023cd10 libpython3.9.dylib`PyEval_EvalCode + 64
    frame #45: 0x000000010028d42d libpython3.9.dylib`pyrun_file + 333
    frame #46: 0x000000010028b4e9 libpython3.9.dylib`PyRun_SimpleFileExFlags + 729
    frame #47: 0x00000001002aa4b3 libpython3.9.dylib`Py_RunMain + 2067
    frame #48: 0x00000001002aa9e3 libpython3.9.dylib`pymain_main + 403
    frame #49: 0x00000001002aaa3b libpython3.9.dylib`Py_BytesMain + 43
    frame #50: 0x00007fff691e7cc9 libdyld.dylib`start + 1
jeremybarnes commented 3 years ago

I'm guessing this is caused by loading MLDB as a shared library from Python, which changes the order in which static members are initialized (it looks like we're using a structure before it has been initialized). @FinchPowers can you try with the latest commit?
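
For reference, the usual guard against this kind of static initialization order problem is the construct-on-first-use idiom. The sketch below is only an illustration under assumptions: `initializerRegistry`, `RegisterInitializer`, `initLog` and `runInitializers` are hypothetical names, not the actual MLDB code, and this is not necessarily what the latest commit does.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical registry of initializers; these names are illustrative
// and are not taken from the MLDB source.
using Initializer = std::function<void()>;

// Function-local static: constructed the first time the function is called,
// so it is valid even when reached from another translation unit's static
// initializer, whatever order the loader runs them in.
std::vector<Initializer>& initializerRegistry()
{
    static std::vector<Initializer> registry;
    return registry;
}

// Helper whose constructor registers a callback at static-initialization time.
struct RegisterInitializer {
    explicit RegisterInitializer(Initializer fn)
    {
        initializerRegistry().push_back(std::move(fn));
    }
};

// Records which initializers ran, also behind a function-local static.
std::vector<std::string>& initLog()
{
    static std::vector<std::string> log;
    return log;
}

// A namespace-scope static object, as might live in any translation unit.
static RegisterInitializer registerStreamCapture([] {
    initLog().push_back("streamcapture");
});

// Runs every registered initializer in registration order and returns
// how many were run.
std::size_t runInitializers()
{
    std::size_t count = 0;
    for (const auto& fn : initializerRegistry()) {
        fn();
        ++count;
    }
    return count;
}
```

Because the registry lives inside a function, it no longer matters whether the dynamic loader runs this translation unit's static constructors before or after the ones that register with it.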

FinchPowers commented 3 years ago
(lldb) target create "virtualenv/bin/python"
Current executable set to '/Users/francois-mi.lheureux/workspace/mldb/virtualenv/bin/python' (x86_64).
(lldb) settings set -- target.run-args  "mldb/testing/python_cell_converter_test.py"
(lldb) run
Process 28769 launched: '/Users/francois-mi.lheureux/workspace/mldb/virtualenv/bin/python' (x86_64)
Process 28769 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
    frame #0: 0x00000001041dba1d Python`_PyObject_GC_Alloc + 39
Python`_PyObject_GC_Alloc:
->  0x1041dba1d <+39>: movq   0x10(%r14), %r15
    0x1041dba21 <+43>: addq   $0x10, %rdx
    0x1041dba25 <+47>: testl  %edi, %edi
    0x1041dba27 <+49>: jne    0x1041dbb19               ; <+291>
Target 0: (python) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
  * frame #0: 0x00000001041dba1d Python`_PyObject_GC_Alloc + 39
    frame #1: 0x00000001040d3239 Python`PyType_Ready + 3546
    frame #2: 0x00000001040d4944 Python`PyType_Ready + 9445
    frame #3: 0x0000000102715d03 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator()(MLDB::EnterThreadToken const&) [inlined] PyInit_streamcapture at capture_stream.cc:126:9
    frame #4: 0x0000000102715ce4 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator()(MLDB::EnterThreadToken const&) [inlined] auto MLDB::(anonymous namespace)::$_0::operator(this=<unavailable>, thr=<unavailable>)<MLDB::EnterThreadToken const>(MLDB::EnterThreadToken const&) const at capture_stream.cc:145
    frame #5: 0x0000000102715ce4 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator()(MLDB::EnterThreadToken const&) [inlined] decltype(__f=<unavailable>, __args=<unavailable>)::$_0&>(fp)(std::__1::forward<MLDB::EnterThreadToken const&>(fp0))) std::__1::__invoke<MLDB::(anonymous namespace)::$_0&, MLDB::EnterThreadToken const&>(MLDB::(anonymous namespace)::$_0&, MLDB::EnterThreadToken const&) at type_traits:3545
    frame #6: 0x0000000102715ce4 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator()(MLDB::EnterThreadToken const&) [inlined] void std::__1::__invoke_void_return_wrapper<void>::__call<MLDB::(anonymous namespace)::$_0&, MLDB::EnterThreadToken const&>(__args=<unavailable>, __args=<unavailable>)::$_0&, MLDB::EnterThreadToken const&) at __functional_base:348
    frame #7: 0x0000000102715ce4 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator()(MLDB::EnterThreadToken const&) [inlined] std::__1::__function::__alloc_func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator(this=<unavailable>, __arg=<unavailable>)(MLDB::EnterThreadToken const&) at functional:1546
    frame #8: 0x0000000102715ce4 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`std::__1::__function::__func<MLDB::(anonymous namespace)::$_0, std::__1::allocator<MLDB::(anonymous namespace)::$_0>, void (MLDB::EnterThreadToken const&)>::operator(this=<unavailable>, __arg=<unavailable>)(MLDB::EnterThreadToken const&) at functional:1720
    frame #9: 0x0000000102713121 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`MLDB::(anonymous namespace)::runPythonInitializers(MLDB::EnterThreadToken const&) [inlined] std::__1::__function::__value_func<void (MLDB::EnterThreadToken const&)>::operator(__args=0x0000000000000000)(MLDB::EnterThreadToken const&) const at functional:1873:16
    frame #10: 0x0000000102713113 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`MLDB::(anonymous namespace)::runPythonInitializers(MLDB::EnterThreadToken const&) [inlined] std::__1::function<void (MLDB::EnterThreadToken const&)>::operator(this=<unavailable>, __arg=0x0000000000000000)(MLDB::EnterThreadToken const&) const at functional:2548
    frame #11: 0x0000000102713106 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`MLDB::(anonymous namespace)::runPythonInitializers(thread=0x0000000000000000) at python_interpreter.cc:359
    frame #12: 0x000000010271195f libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`MLDB::PythonInterpreter::mainInterpreter() at python_interpreter.cc:453:9
    frame #13: 0x0000000102712e86 libpython_interpreter.f83f47b29f7634085fbea1a8c7dfc279.dylib`MLDB::PythonInterpreter::initializeFromModuleInit() at python_interpreter.cc:419:5
    frame #14: 0x0000000102681ae5 py_cell_conv_test_module.so`init_module_py_cell_conv_test_module() at python_cell_converter_test_support.cc:57:5
    frame #15: 0x00000001026bf7f5 libboost_python39.dylib`boost::python::handle_exception_impl(boost::function0<void>) + 69
    frame #16: 0x00000001026c04c9 libboost_python39.dylib`bool boost::python::handle_exception<void (*)()>(void (*)()) + 57
    frame #17: 0x00000001026c03b8 libboost_python39.dylib`boost::python::detail::init_module(PyModuleDef&, void (*)()) + 72
    frame #18: 0x00000001002723c5 libpython3.9.dylib`_PyImport_LoadDynamicModuleWithSpec + 613
    frame #19: 0x0000000100271c74 libpython3.9.dylib`_imp_create_dynamic + 308
    frame #20: 0x000000010019e597 libpython3.9.dylib`cfunction_vectorcall_FASTCALL + 215
    frame #21: 0x0000000100243e80 libpython3.9.dylib`_PyEval_EvalFrameDefault + 28832
    frame #22: 0x0000000100247824 libpython3.9.dylib`_PyEval_EvalCode + 2852
    frame #23: 0x000000010015e1f0 libpython3.9.dylib`_PyFunction_Vectorcall + 256
    frame #24: 0x000000010024695b libpython3.9.dylib`call_function + 411
    frame #25: 0x00000001002439e6 libpython3.9.dylib`_PyEval_EvalFrameDefault + 27654
    frame #26: 0x000000010015e2e5 libpython3.9.dylib`function_code_fastcall + 229
    frame #27: 0x000000010024695b libpython3.9.dylib`call_function + 411
    frame #28: 0x00000001002439c9 libpython3.9.dylib`_PyEval_EvalFrameDefault + 27625
    frame #29: 0x000000010015e2e5 libpython3.9.dylib`function_code_fastcall + 229
    frame #30: 0x000000010024695b libpython3.9.dylib`call_function + 411
    frame #31: 0x0000000100243a90 libpython3.9.dylib`_PyEval_EvalFrameDefault + 27824
    frame #32: 0x000000010015e2e5 libpython3.9.dylib`function_code_fastcall + 229
    frame #33: 0x000000010024695b libpython3.9.dylib`call_function + 411
    frame #34: 0x0000000100243a90 libpython3.9.dylib`_PyEval_EvalFrameDefault + 27824
    frame #35: 0x000000010015e2e5 libpython3.9.dylib`function_code_fastcall + 229
    frame #36: 0x000000010024695b libpython3.9.dylib`call_function + 411
    frame #37: 0x0000000100243a90 libpython3.9.dylib`_PyEval_EvalFrameDefault + 27824
    frame #38: 0x000000010015e2e5 libpython3.9.dylib`function_code_fastcall + 229
    frame #39: 0x000000010015f709 libpython3.9.dylib`object_vacall + 489
    frame #40: 0x000000010015f976 libpython3.9.dylib`_PyObject_CallMethodIdObjArgs + 246
    frame #41: 0x0000000100270b06 libpython3.9.dylib`PyImport_ImportModuleLevelObject + 1350
    frame #42: 0x00000001002422f4 libpython3.9.dylib`_PyEval_EvalFrameDefault + 21780
    frame #43: 0x0000000100247824 libpython3.9.dylib`_PyEval_EvalCode + 2852
    frame #44: 0x000000010023cd10 libpython3.9.dylib`PyEval_EvalCode + 64
    frame #45: 0x000000010028d42d libpython3.9.dylib`pyrun_file + 333
    frame #46: 0x000000010028b4e9 libpython3.9.dylib`PyRun_SimpleFileExFlags + 729
    frame #47: 0x00000001002aa4b3 libpython3.9.dylib`Py_RunMain + 2067
    frame #48: 0x00000001002aa9e3 libpython3.9.dylib`pymain_main + 403
    frame #49: 0x00000001002aaa3b libpython3.9.dylib`Py_BytesMain + 43
    frame #50: 0x00007fff691e7cc9 libdyld.dylib`start + 1
jeremybarnes commented 3 years ago

I'm a little bit stumped. This happens when using MLDB from Python: MLDB acts as a pure Python module, and there is no server or MLDB executable because the program actually running is the Python interpreter. Strangely enough, it works everywhere apart from these two tests.

From what I can see, there is a null pointer dereference (the faulting access is at address 0x10, i.e. a read at offset 0x10 from a null base pointer). Since the call to PyType_Ready is nested, we can see from the CPython source that the crash must occur while initializing the base class. I really can't see what is causing this.
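
To spell out the null pointer reasoning: lldb reports `EXC_BAD_ACCESS (code=1, address=0x10)` on the instruction `movq 0x10(%r14), %r15`, which is consistent with `%r14` holding a null pointer and the code reading a field 0x10 bytes into whatever structure it was supposed to point at. A minimal sketch of that arithmetic, assuming a hypothetical `RuntimeState` layout (this is not the real CPython structure):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: NOT the real CPython structure, only a struct whose
// third 8-byte field sits 0x10 bytes from the start.
struct RuntimeState {
    std::uint64_t first;   // offset 0x00
    std::uint64_t second;  // offset 0x08
    std::uint64_t third;   // offset 0x10
};

// Address that a read of `third` would touch, given the base address
// held in the pointer to the struct.
constexpr std::uintptr_t faultAddress(std::uintptr_t base)
{
    return base + offsetof(RuntimeState, third);
}

// With a null base pointer, the access lands exactly on the reported address.
static_assert(faultAddress(0) == 0x10,
              "null base + 0x10 matches the reported fault address");
```

A small non-zero fault address like this usually points at an uninitialized or null object pointer rather than at memory corruption, which fits the theory of a structure being used before it is set up.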

(It works fine on the 3 OSX machines I've tried it on).

jeremybarnes commented 3 years ago

(Issue was added as #943)