conda-forge / pymc3-feedstock

A conda-smithy repository for pymc3.
BSD 3-Clause "New" or "Revised" License
5 stars 17 forks source link

Run the test suite #70

Closed maresb closed 2 years ago

maresb commented 3 years ago

Checklist

conda-forge-linter commented 3 years ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

maresb commented 3 years ago

@conda-forge-admin, please rerender

github-actions[bot] commented 3 years ago

Hi! This is the friendly automated conda-forge-webservice. I tried to rerender for you, but it looks like there was nothing to do.

maresb commented 3 years ago

@twiecki please feel free to commit directly yourself. I enabled edits by maintainers.

maresb commented 3 years ago

Some broken pip dependencies:

https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=363667&view=logs&j=9c5ef928-2cd6-52e5-dbe6-9d173a7d951b&t=20c71c51-4b27-578b-485d-06ade2de1d00&l=1271

And under Windows, it's showing a stack overflow. :persevere:

Still waiting for more stuff to complete.

michaelosthege commented 3 years ago

And under Windows, it's showing a stack overflow. 😣

The test suite is too large to run in one test job on Windows. It's also reproducible locally that none of these tests fail individually, but give the stack overflow when all of them run in the same pytest call.

conda-forge-linter commented 3 years ago

Hi! This is the friendly automated conda-forge-linting service.

I was trying to look for recipes to lint for you, but it appears we have a merge conflict. Please try to merge or rebase with the base branch to resolve this conflict.

Please ping the 'conda-forge/core' team (using the @ notation in a comment) if you believe this is a bug.

conda-forge-linter commented 3 years ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

twiecki commented 3 years ago

I think we should figure out what adding numba is doing and solving the underlying issue, because it really doesn't make any sense to add it here.

maresb commented 3 years ago

@twiecki I completely agree, I just want to reach a state where the CI tests are passing.

Now it's saying pytest failed, even though there were no errors and only warnings? Any idea what's going on?

maresb commented 3 years ago

I am getting closer, thanks very much @michaelosthege for the suggestion to run the tests individually to avoid the stack overflow under Windows.

Now just Windows is failing with some theano compilation errors:

3.7:

E Exception: ('Compilation failed (return status=1): C:\Users\VSSADM\~1\AppData\Local\Temp\ccbQJyYb.s: Assembler messages:\r. C:\Users\VSSADM\~1\AppData\Local\Temp\ccbQJyYb.s:262: Error: invalid register for .seh_savexmm\r. ', 'FunctionGraph(Elemwise{ge,no_inplace}(v, <TensorType(float64, (True,))>))')

3.8

E Exception: ('Compilation failed (return status=1): C:\Users\VSSADM\~1\AppData\Local\Temp\cckOnRXa.s: Assembler messages:\r. C:\Users\VSSADM\~1\AppData\Local\Temp\cckOnRXa.s:81: Error: invalid register for .seh_savexmm\r. ', 'FunctionGraph(Elemwise{ge,no_inplace}(v, <TensorType(float64, (True,))>))')

3.9

E Exception: ("Compilation failed (return status=1): C:\Users\VSSADM\~1\AppData\Local\Temp\cctE7GCa.o: In function run':\r. C:/Users/VssAdministrator/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-SP0-Intel64_Family_6_Model_79_Stepping_1_GenuineIntel-3.9.6-64/tmpv8iwac39/mod.cpp:1612: undefined reference todgemm'\r. C:\Users\VSSADM\~1\AppData\Local\Temp\cctE7GCa.o:mod.cpp:(.rdata$.refptr.dgemm[.refptr.dgemm]+0x0): undefined reference to `dgemm'\r. collect2.exe: error: ld returned 1 exit status\r. ", 'FunctionGraph(BatchedDot(<TensorType(float64, row)>, <TensorType(float64, row)>))')

Do we need to mess with cxxflags? Or is this a BLAS thing? Any ideas @twiecki or @michaelosthege?

twiecki commented 3 years ago

This problems was mentioned here: https://github.com/pymc-devs/pymc3/issues/4749 (seems to be a m2g64-toolchain issue, can you confirm that it gets installed?), as well as here: https://github.com/Theano/Theano/issues/6693 (solved by a compiler flag).

Perhaps the latter is the safer fix?

maresb commented 3 years ago

In the testing environment, m2w64-toolchain is indeed installed here.

I'm not actually sure how one should set compiler flags from conda. Primarily because I don't work with C++ much, so it's quite mysterious to me. And secondly because it seems very risky and complicated. For example, if we mess with the CXXFLAGS environment variable, what if it conflicts with other programs? I'm pretty stuck on a good way forward. Any suggestions?

twiecki commented 3 years ago

The compiler flag I think we would set in aesara or pymc3, we already do this here: https://github.com/pymc-devs/pymc3/blob/main/pymc3/__init__.py#L40 Could we test this here?

Curious if @michaelosthege has any ideas.

michaelosthege commented 3 years ago

The compiler flag I think we would set in aesara or pymc3, we already do this here: https://github.com/pymc-devs/pymc3/blob/main/pymc3/__init__.py#L40 Could we test this here?

Curious if @michaelosthege has any ideas.

I'm a biologist. This is pretty much black magic to me. My knowledge about compiler flags ends at the point where I know they exist and they do stuff. With the how the compiler does things I guess.

I think we should figure out what adding numba is doing and solving the underlying issue, because it really doesn't make any sense to add it here.

Maybe check out this for comparison? https://github.com/conda-forge/numba-feedstock/blob/master/recipe/meta.yaml

maresb commented 3 years ago

@conda-forge-admin, please rerender

maresb commented 3 years ago

@conda-forge-admin, please rerender

maresb commented 3 years ago

@conda-forge-admin, please rerender

maresb commented 3 years ago

@conda-forge-admin, please rerender

github-actions[bot] commented 3 years ago

Hi! This is the friendly automated conda-forge-webservice. I tried to rerender for you, but it looks like there was nothing to do.

maresb commented 3 years ago

@conda-forge-admin, please rerender

maresb commented 3 years ago

I rebased on the new version.

CI seems to be down. I'll probably have to try again tomorrow.

twiecki commented 3 years ago

restarted the tests (I think).

maresb commented 3 years ago

@conda-forge-admin, please rerender

maresb commented 3 years ago

In Python 3.7 there is a missing DLL when loading netcdf4.

In 3.8 and 3.9 there is an error about the following failed assertion in check_logp().

        domains = paramdomains.copy()
        domains["value"] = domain
        for pt in product(domains, n_samples=n_samples):
            pt = Point(pt, model=model)
>           assert_almost_equal(
                logp(pt),
                logp_reference(pt),
                decimal=decimal,
                err_msg=str(pt),
            )
E           AssertionError: 
E           Arrays are not almost equal to 6 decimals
E           {'alpha': array(20.), 'beta': array(0.9), 'value': array(100.)}
E           Mismatched elements: 1 / 1 (100%)
E           Max absolute difference: 3.86856262e+25
E           Max relative difference: 4.70326902e-16
E            x: array(-8.225263e+40)
E            y: array(-8.225263e+40)

pymc3\tests\test_distributions.py:611: AssertionError

It looks like numpy's definition of being equal to 6 digits means "6 digits after the 1's place" instead of "6 significant digits".

twiecki commented 3 years ago

@maresb I merged the mkl-services PR. Should we revisit this?

maresb commented 3 years ago

@twiecki, sure! I just rebased on master, so hopefully some of the latest changes have fixed some of the issues.

I don't have any time to work on this, but please feel free to commit to this branch.

Also please see the commented-out changes. For instance, we may need to pin the semver package. Unfortunately I've fairly completely forgotten where I was at.

maresb commented 3 years ago

@conda-forge-admin, please rerender

maresb commented 3 years ago

Windows is segfaulting right away on tests\test_pickling.py even though it's running as an individual test. I can move that one to the end and see if the other tests make any more progress.

OSX/3.7 stalled out mysteriously. Maybe it was a glitch.

I'll rebase master and try again...

maresb commented 3 years ago

@conda-forge-admin, please rerender

github-actions[bot] commented 3 years ago

Hi! This is the friendly automated conda-forge-webservice. I tried to rerender for you, but it looks like there was nothing to do.

maresb commented 3 years ago

Windows tests are still segfaulting. :( @michaelosthege, do you have any ideas?

michaelosthege commented 3 years ago

@maresb the tests segfault on Windows when you run too much at once. Where "too much" is something like less than a quarter of all tests. Looks like the CI here is trying to run everything at once..

maresb commented 3 years ago

@michaelosthege, on precisely this advice from you, I already split the tests so that each test file runs individually. :wink: Unfortunately it's still segfaulting on Windows, so I was hoping that you might have another clever idea.

michaelosthege commented 3 years ago

@maresb oh I see. In one log it shows this:

conda activate log ``` %SRC_DIR%>SET DISTUTILS_USE_SDK=1 %SRC_DIR%>SET MSSdk=1 %SRC_DIR%>if not exist "C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\" (set "VSINSTALLDIR=C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\" ) %SRC_DIR%>if not exist "C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\" (set "VSINSTALLDIR=C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\" ) %SRC_DIR%>if not exist "C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\" (set "VSINSTALLDIR=C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\" ) %SRC_DIR%>IF NOT "1" == "" ( set "INCLUDE=%PREFIX%\Library\include;" set "LIB=%PREFIX%\Library\lib;" set "CMAKE_PREFIX_PATH=%PREFIX%\Library;" ) %SRC_DIR%>call :GetWin10SdkDir %SRC_DIR%>call :GetWin10SdkDirHelper HKLM\SOFTWARE\Wow6432Node 1>nul 2>&1 %SRC_DIR%>if errorlevel 1 call :GetWin10SdkDirHelper HKCU\SOFTWARE\Wow6432Node 1>nul 2>&1 %SRC_DIR%>if errorlevel 1 call :GetWin10SdkDirHelper HKLM\SOFTWARE 1>nul 2>&1 %SRC_DIR%>if errorlevel 1 call :GetWin10SdkDirHelper HKCU\SOFTWARE 1>nul 2>&1 %SRC_DIR%>if errorlevel 1 exit /B 1 %SRC_DIR%>exit /B 0 %SRC_DIR%>for /F %i in ('dir /ON /B "C:\Program Files (x86)\Windows Kits\10\\include\10.*"') DO (SET WindowsSDKVer=%~i ) %SRC_DIR%>(SET WindowsSDKVer=10.0.10240.0 ) %SRC_DIR%>(SET WindowsSDKVer=10.0.10586.0 ) %SRC_DIR%>(SET WindowsSDKVer=10.0.14393.0 ) %SRC_DIR%>(SET WindowsSDKVer=10.0.15063.0 ) %SRC_DIR%>(SET WindowsSDKVer=10.0.16299.0 ) %SRC_DIR%>(SET WindowsSDKVer=10.0.17134.0 ) %SRC_DIR%>(SET WindowsSDKVer=10.0.17763.0 ) %SRC_DIR%>if errorlevel 1 (echo "Didn't find any windows 10 SDK. I'm not sure if things will work, but let's try..." ) else (echo Windows SDK version found as: "10.0.17763.0" ) Windows SDK version found as: "10.0.17763.0" %SRC_DIR%>IF "win-64" == "win-64" ( set "CMAKE_GEN=Visual Studio 15 2017 Win64" set "BITS=64" ) else ( set "CMAKE_GEN=Visual Studio 15 2017" set "BITS=32" ) %SRC_DIR%>pushd C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\ ```

When I see this on my PC I trash the environment right away and start over. It may work, but this is definitely not normal. Could be unrelated though.

Do you know which test causes the segfault? I couldn't see that from the logs.. Also maybe there's still some notion of "process session" and the split-up didn't help. The CI on our main repo splits things across jobs that run more separate than the tests here(?)

maresb commented 2 years ago

This is now mostly obsolete, so I'm closing this out.

It could however be a useful reference in the future if we want to run pytest in pymc-feedstock.