Closed leofang closed 3 years ago
Hi! This is the friendly automated conda-forge-linting service.
I wanted to let you know that I linted all conda-recipes in your PR (recipe) and found some lint.
Here's what I've got...
For recipe:
Selectors are suggested to take a <two spaces>#<one space>[<expression>] form. See lines [36].
@conda-forge-admin, please rerender
Hi! This is the friendly automated conda-forge-linting service.
I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.
@leofang It is not that easy. mpi4py is prepared to build with a proper MS-MPI install, and not the binaries from the conda-forge package.
I think the proper fix is to export the following environment variables before pip install ...:
set MSMPI_INC=%BUILD_PREFIX%\Library\include
set MSMPI_LIB=%BUILD_PREFIX%\Library\lib
Edit: You may need to escape backslashes.
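For illustration, a minimal bld.bat-style sketch of that fix; this is only a sketch under the assumptions above, not the feedstock's actual build script:
:: hypothetical bld.bat snippet, not the feedstock's actual script
:: (per the note above, the backslashes may need to be escaped/doubled)
set MSMPI_INC=%BUILD_PREFIX%\Library\include
set MSMPI_LIB=%BUILD_PREFIX%\Library\lib
:: standard conda-build way to install the package being built
%PYTHON% -m pip install . --no-deps -vv
if errorlevel 1 exit 1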
@leofang Or maybe patch conf/mpidistutils.py, special-casing CONDA_BUILD in the environment.
@dalcinl Thanks! It actually built fine, just failed when testing because python setup.py ... would make it search the wrong Python lib location. I am trying 123ec3e as an alternative way to patch.
I think the proper fix is to export the following environment variables before pip install ...:
Yeah I think it's equivalent to 123ec3e!!! Let's give it a shot 🤞
@dalcinl You're right... But then I don't understand why it built in 11a8f39... 😂
Hi! This is the friendly automated conda-forge-linting service.
I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.
I do have some suggestions for making it better though...
For recipe:
Hi! This is the friendly automated conda-forge-linting service.
I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.
@dalcinl CCP_HOME works! 🎉 Let me clean things up.
cc: @RyanMcCarthy-NOAA Thank you for packaging msmpi! It seems to work fine with mpi4py. Would be great if you could give it a try 🙂
@dalcinl @conda-forge/core Question: Is it necessary to add msmpi to mpi-feedstock? It seems to me that mutual exclusion is gained for free, since msmpi only lives on Windows, unlike mpich or openmpi.
Yes, otherwise one day when there's openmpi or other impl, we will have to hotfix package which we want to avoid if possible.
Yes, otherwise one day when there's openmpi or other impl, we will have to hotfix package which we want to avoid if possible.
Thanks, @isuruf. I seriously think it's highly unlikely, but I sent https://github.com/conda-forge/mpi-feedstock/pull/7 anyway.
We need to first merge https://github.com/conda-forge/mpi-feedstock/pull/7 -> https://github.com/conda-forge/msmpi-feedstock/pull/1 -> this PR.
@dalcinl Any chance you know what's wrong with MSMPI? The CI raises this error when running the test script added in f5ae831:
(%PREFIX%) %SRC_DIR%>mpiexec -n 2 python mpi4py_simple_test.py
ERROR: Failed to post close command error 1726
ERROR: unable to tear down the job tree. exiting...
@leofang Perhaps you need to start the smpd daemon before running?
smpd -debug
I have no idea, really. I never used the new open-sourced MS-MPI; all my experience with it was using the binary installers downloaded from Microsoft. Unfortunately, the msmpi conda-forge package recipe does not have even a trivial test making sure the packaged binaries work for both linking and running MPI programs.
PS: There is no need to add yet another file with tests. Just use the following lines for testing:
mpiexec -n 2 python -m mpi4py.bench helloworld
mpiexec -n 2 python -m mpi4py.bench ringtest
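For context, a minimal sketch of how those two lines might be wired into the recipe's Windows tests; the run_test.bat name and the errorlevel checks are assumptions, not the feedstock's actual contents:
:: hypothetical run_test.bat
mpiexec -n 2 python -m mpi4py.bench helloworld
if errorlevel 1 exit 1
mpiexec -n 2 python -m mpi4py.bench ringtest
if errorlevel 1 exit 1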
Hi! This is the friendly automated conda-forge-linting service.
I wanted to let you know that I linted all conda-recipes in your PR (recipe) and found some lint.
Here's what I've got...
For recipe:
meta.yaml, though. To get a traceback to help figure out what's going on, install conda-smithy and run conda smithy recipe-lint . from the recipe directory.
Hi! This is the friendly automated conda-forge-linting service.
I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.
@RyanMcCarthy-NOAA @isuruf I feel msmpi isn't correctly built, or something wrong is picked up. See the following output generated by mpiexec -d 3 ... (3 is the maximum debug level):
...
[00:5320] successfully launched process 1
[00:5320] successfully launched process 0
[00:5320] root process launched, starting stdin redirection.
[01:2068] ERROR: Process Management Protocol version mismatch in connection
request: received version 3, expected 4.
[01:2068] ERROR: Process Management Protocol version mismatch in connection
request: received version 3, expected 4.
[01:2068] process_id=1 rank=0 refcount=2, waiting for the process to finish exiting.
[01:2068] creating an exit command for process id=1 rank=0, pid=9128, exit code=-1.
...
SMPD_PMP_VERSION is defined as 4 in the source code, yes, but for some reason 3 is picked up.
@RyanMcCarthy-NOAA were you able to run your applications using the msmpi package?
@RyanMcCarthy-NOAA were you able to run your applications using the msmpi package?
I originally needed msmpi for win builds of taudem. I just created a new env with msmpi and taudem and ran one of the example commands from the docs:
mpiexec -n 8 -d 3 PitRemove logan.tif
and got the expected output:
....
[01:784] env: PMI_RANK=0
[01:784] env: PMI_SMPD_KEY=7
[01:784] version check complete, using PMP version 4.
[01:784] forwarding command SMPD_INIT src=1 ctx_key=0
....
It seems that msmpi is working properly for at least that package.
Thanks for checking, @mccarthyryanc.
version check complete, using PMP version 4
This was also shown in the failed CI build, before the error happened. I assumed by working you meant no error is generated and the result is correct.
Could you do one more check for me? Which mpiexec was in use?
where mpiexec
Another possibility: msmpi also seems to be included in the Docker image (C:\Program Files\Microsoft MPI), which may have an incompatible version.
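For reference, both candidates can be checked from the build prompt with plain cmd commands (output will of course vary per image):
:: which mpiexec wins on PATH, and is the system-wide MS-MPI present?
where mpiexec
dir "C:\Program Files\Microsoft MPI\Bin"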
This was also shown in the failed CI build, before the error happened. I assumed by working you meant no error is generated and the result is correct.
Yes, command completed successfully and output data was correct.
mpiexec was in the local env I created, at <env-base>\Library\bin\mpiexec.exe. Check out the conda build docs about the LIBRARY_BIN env variable.
mpiexec was in the local env I created, at <env-base>\Library\bin\mpiexec.exe. Check out the conda build docs about the LIBRARY_BIN env variable.
Thanks for confirming, @mccarthyryanc. Yes, I am aware of LIBRARY_BIN. I am in the middle of confirming my theory, as I found the Windows Docker image has an outdated MPI that might be causing the bad interference:
...
Directory of C:\Program Files\Microsoft MPI\Bin
12/10/2020 05:15 PM <DIR> .
12/10/2020 05:15 PM <DIR> ..
06/13/2016 12:38 PM 284,368 mpiexec.exe
06/13/2016 12:25 PM 989,201 mpitrace.man
06/13/2016 12:37 PM 27,760 msmpilaunchsvc.exe
06/13/2016 12:38 PM 237,776 smpd.exe
4 File(s) 1,539,105 bytes
2 Dir(s) 84,183,883,776 bytes free
Microsoft MPI Startup Program [Version 7.1.12437.25] <---
...
@dalcinl @mccarthyryanc
I am in the middle of confirming my theory, as I found the Windows docker image has an outdated MPI that might be causing the bad interference
My theory was wrong. It didn't help after cleanup. In fact, Conda/MS-MPI did pick up the correct executables. The error when running mpiexec -n 2 python ... seems to be bad interference with Azure's infrastructure. I was able to download the build artifact, enabled by adding this config to conda-forge.yml:
azure:
  store_build_artifacts: true
and verified on a Windows virtual machine that the resulting package works. So what I'm going to do is clean up the recipe and disable testing on Windows; then this PR will be ready to go (note that all import tests also pass on Windows, so that should be enough).
By the way,
Perhaps you need to start the smpd daemon before running? smpd -debug
It turns out that's not necessary. When launching mpiexec, it spins up an smpd manager itself. In fact, a manually launched smpd (via start /B smpd -d 3) is ignored by mpiexec. This aligns with mpi4py's tests: there we don't worry about smpd either.
We need to first merge conda-forge/mpi-feedstock#7 -> conda-forge/msmpi-feedstock#1 -> this PR.
@dalcinl Both PRs are in and all checks have passed! This is ready now 😁
@leofang Any chance you can fix up all commits into a single one (or two, to keep recipe rerenders separate)? Or should we just "Squash and merge"? I usually care about having a clean history without dirty intermediate steps.
For feedstocks I care much less because the commit history is already dirty (due to bots, migrations, etc). Feel free to squash if you like 🙂
For feedstocks I care much less because the commit history is already dirty (due to bots, migrations, etc)
Yes, but most of the recipe contents and all of the code in the build scripts are not touched by bots and migrations. OK, I'll squash and merge.
Thanks, @dalcinl. I noticed the merge commit has 1 connection failure: https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=260246&view=logs&j=e5cdccbf-4751-5a24-7406-185c9d30d021 Are you able to restart that test? Without all tests passing it seems CF won't add me to the maintainer list...
It's restarted!
btw Forgot to say: Thank you @dalcinl @mccarthyryanc @isuruf for your help with my little Windows adventure! 😂
Checklist
- Reset the build number to 0 (if the version changed)
- Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
I know nothing about Windows and do not use it at all; I'm just curious how far we can go with the new msmpi feedstock.