sourceryinstitute / OpenCoarrays

A parallel application binary interface for Fortran 2018 compilers.
http://www.opencoarrays.org
BSD 3-Clause "New" or "Revised" License
247 stars 56 forks source link

Windows support #755

Closed zbeekman closed 2 years ago

zbeekman commented 2 years ago
coverage on master
Codecov branch

Summary of changes

Rationale for changes

These changes were made to support the ability to build and run OpenCoarrays and therefore coarray Fortran applications on Windows and to generally improve the robustness of the build system. WSL and other heavyweight *nix compatibility solutions are often unpalatable or unacceptable to some users and organizations. Requiring the now freely available Intel MPI implementation (part of Intel's OneAPI set of developer tools) is acceptable because 1) It's the only sufficiently modern and up-to-date MPI solution that I am aware that runs natively on windows, 2) it is now available free of cost and 3) many HPC computing centers, government agencies, etc. already have support contracts with Intel and use the Intel compilers/OneAPI. In addition to Intel OneAPI, the GFortran and GCC native windows compilers are required along with CMake and unix make/gmake, and a bash shell like the one that comes with git-bash for windows. Binaries produced with the OpenCoarrays caf wrapper script should be redistributable without all these other requirements--only requiring the intel MPI runtime--assuming that the user doesn't intentionally try to create dynamically linked executables.

ToDo

Additional info and certifications

This pull request (PR) is a:

I certify that

Code coverage data

coverage on master

zbeekman commented 2 years ago

@everythingfunctional thanks so much for the test drive and review!!! I'm impressed and happy that you got so far without any adequate documentation. This is my first priority and on the top of my todo list and I requirement before I change this PR from draft status.

I used a Windows Command Prompt. I first ran the setvars.bat script provided by Intel to setup the environment. I created opencoarrays-build and opencoarrays-install folders. From the opencoarrays-build folder, I used cmake-gui ../OpenCoarrays to set the Generator to "Unix Makefiles", the compilers to those installed by the Equation.com gcc installer, and the install folder to opencoarrays-install. I was then able to, from the same command prompt, run make and the make test to compile and run the test suite. A few tests failed, specifically register_alloc_comp_1,2 and 3, test-installation-scripts.sh, caf, and cafrun. It also popped up a lot of prompts from Windows Defender, but they did not reappear after having accepted them all and running the suite again. The failure of the last 3 tests is probably expected, since a Windows command prompt doesn't understand how to execute a shell script. make install then copied the expected things over to opencoarrays-install. From a git-bash terminal I can then use the provided caf and cafrun scripts to compile and execute a "Hello, World", although each image segfaults on program shutdown. Using fpm with caf and cafrun does not work however, as it uses execute_command_line, which executes in a cmd environment, which can't execute the scripts.

These are known issues, some/most of which are documented in the todo list in the main PR comment/description

In all, I think this is a huge leap forward, but I would ask for at least one additional thing to be included in this PR. Update the README.md and/or INSTALL.md to provide better instructions for building and using on Windows. A future PR for use of caf and cafrun in a native Windows environment would be a good idea, but it's not super urgent.

I completely agree. Lack of documentation is the number one blocker of this PR and an absolute requirement before merging. That's why it's at the top of the ToDo list. 😉

Thanks again for the feedback!

I've pasted the segfault during shutdown info from a debug build below in the event that you or @vehre have any insight into what's happening.

$ cafrun -np 2 ./hello_caf.exe
 Hello from            2  of            2 !
 Hello from            1  of            2 !

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0xadf38fdb
#1  0xadf2ef04
#2  0xadf0fef1
#3  0x78797ff7
#4  0x78ab20ce
#5  0x78a61453
#6  0x78ab0bfd
#7  0xadf27185
#8  0xadee3471
#9  0xadee34e7
#10  0xadee1683
#11  0xadee13c0
#12  0xadee14f5
#13  0x784f7033
#14  0x78a62650
#15  0xffffffff

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0xadf38fdb
#1  0xadf2ef04
#2  0xadf0fef1
#3  0x78797ff7
#4  0x78ab20ce
#5  0x78a61453
#6  0x78ab0bfd
#7  0xadf27185
#8  0xadee3471
#9  0xadee34e7
#10  0xadee1683
#11  0xadee13c0
#12  0xadee14f5
#13  0x784f7033
#14  0x78a62650
#15  0xffffffff
1/2: _gfortran_caf_init(957) The mpi memory model is: unified  (0x2, 1).
1/2: finalize_internal(972) (status_code = 0)
1/2: finalize_internal(1026) In barrier for finalize...1/2: finalize_internal(1059) Freed all slave tokens.
2/2: finalize_internal(972) (status_code = 0)
2/2: finalize_internal(1026) In barrier for finalize...2/2: finalize_internal(1059) Freed all slave tokens.
Error: Command:
   `C:/Program Files (x86)/Intel/oneAPI/mpi/latest/bin/mpiexec.exe -np 2 ./hello_caf.exe`
failed to run.
zbeekman commented 2 years ago

@everythingfunctional @rouson let me know if the updated build instructions in INSTALL.md make sense and you can follow them. WSL's bash can cause problems, and I know the instructions are somewhat long and involved but I tried to have a good mix of hand-holding and robustly doing things on the users end; more advanced users will probably see shortcuts and alternative routes to get to the same endpoint.

Also, I've added batch-script wrappers for caf and cafrun. If the bin subdirectory of the opencoarrays installation directory is on your path, you should be able to call caf and cafrun natively, and hopefully (untested 🤞) this means you might be able to pass them as the compiler to fpm.

zbeekman commented 2 years ago

@rouson @everythingfunctional Thanks so much for the reviews! Many great suggestions and improvements.

@rouson Please see this comment (1) and this comment (2) about "Git-Bash" vs "Git BASH" and "a Git-Bash bash shell" vs "a Git BASH shell" respectively. Right now I think the original wording/phrasing should stand unless I'm misunderstanding something.

@everythingfunctional I would like my wording/explanation to be clearer too, but think that the repetitive phrasing should stay, as explained in this comment unless there is a better way to say it. (But I can't think of one.)

rouson commented 2 years ago

@zbeekman all looks great. Thanks again!