Cambridge-ICCS / FTorch

A library for directly calling PyTorch ML models from Fortran.
https://cambridge-iccs.github.io/FTorch/
MIT License

Issues building on Windows #50

Closed jatkinson1000 closed 11 months ago

jatkinson1000 commented 1 year ago

Copied from an email chain with a user:

I hope it's alright that I'm reaching out. I've recently set up a framework to incorporate physics-informed, neural net-based, user material subroutines in Abaqus. The framework is quite simple and doesn't take full advantage of the neural net setup. I would be very interested in coupling the PyTorch models directly to Fortran, and so, I'd be very interested in exploring FTorch.

I had a quick question - I've been trying to install the library, but the CMake configuration fails to identify the Fortran compiler. I've tried:

set(CMAKE_Fortran_COMPILER "/MinGW/bin/gfortran.exe") added to the CMakeLists.txt
cmake .. -DCMAKE_Fortran_COMPILER="/MinGW/bin/gfortran.exe" in cmd.

With both, I get the following error:

It fails with the following output:

    Change Dir: C:/Users/USER/FTorch/src/build/CMakeFiles/CMakeScratch/TryCompile-mb6xhj
    Run Build Command(s):devenv.com CMAKE_TRY_COMPILE.sln /build Debug /project cmTC_03248 && The system cannot find the file specified
    Generator: execution of make failed. Make command was: devenv.com CMAKE_TRY_COMPILE.sln /build Debug /project cmTC_03248 && 

Would you be able to help me work this out? What am I missing here?

Thank you for taking the time, I truly appreciate it.

jatkinson1000 commented 1 year ago

We would be happy to try and help you out.

It looks like you are using Windows? I am not overly experienced with Windows, but can try and work this out.

Are you running this from within VSCode? I know there can be issues locating a Fortran compiler from inside that environment.

As a first step could you try adding the following flag to the call to CMake:

-DCMAKE_GENERATOR="MinGW Makefiles"

If this doesn't work, please also try adding the following to the CMake command:

-G "MinGW Makefiles"

Also note that CMake can cache some files that cause issues if you change the configuration and rebuild, so I suggest removing the build/ directory before you re-run CMake each time.

Finally could you also send us the list of commands you are running from the git clone through to the error?
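
The advice above about stale CMake caches can be scripted. This is only a minimal Python sketch, assuming the build directory is the FTorch/src/build directory used in this thread; the helper name fresh_build_dir is made up for illustration and is not part of FTorch or CMake.

```python
import shutil
from pathlib import Path


def fresh_build_dir(build_dir):
    """Delete build_dir (with any cached CMake state inside it) and recreate it empty."""
    path = Path(build_dir)
    if path.exists():
        # Removes CMakeCache.txt, CMakeFiles/ and anything else left by earlier runs.
        shutil.rmtree(path)
    path.mkdir(parents=True)
    return path


# Example (not executed here): start from a clean directory before configuring.
# build = fresh_build_dir("FTorch/src/build")
# Then, from inside that directory, re-run the cmake command with the generator flag.
```

Starting from an empty directory means a changed generator or compiler setting cannot be overridden by values cached from a previous configuration.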

jatkinson1000 commented 1 year ago

I am using Windows, and I'm running this using the Command Prompt. These are the commands I'm running to begin with:

cd FTorch/src
mkdir build
cd build

Then using cmake .. -DCMAKE_GENERATOR="MinGW Makefiles" results in:

image

I've seen this error before - adding set(CMAKE_PREFIX_PATH "/Users/USER/libtorch/share/cmake/Torch") to line 8 of the CMakeLists.txt resolves it and lets cmake complete.

Then, when I attempt to make the code, this is what I get:

image

I've tried looking up a solution for this yesterday but couldn't find one. Any clue?

Thanks a lot for this, I appreciate it.

jatkinson1000 commented 1 year ago

The info you provided is useful. It looks like you are missing some flags from the CMake command as described in the table under point 3 here: https://github.com/Cambridge-ICCS/FTorch#library-installation Perhaps the documentation could be clearer that you DO need these for it to work; they're not all optional.

Please can you try running using the normal CMakeLists.txt with:

cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_GENERATOR="MinGW Makefiles" -DCMAKE_PREFIX_PATH=<PATH_TO_TORCH>

You will need to replace '<PATH_TO_TORCH>' by the location of the pytorch install (this would be </path/to/venv/>lib/python<3.xx>/site-packages/torch/ on a linux install with Torch installed in a virtual environment) as described under the table of input flags.

I would also suggest adding: -DCMAKE_INSTALL_PREFIX and setting it equal to where you want to install the library - it can be useful to place it somewhere you know the location of whilst getting things set up. On Linux I would use something like $HOME/FTorchbin/

Let me know how you get on with this.
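
If it is unclear what to use for <PATH_TO_TORCH>, one option is to search the Torch install for the TorchConfig.cmake file that CMake looks for. This is a rough Python sketch, not part of FTorch; the function name find_torch_cmake_dirs is hypothetical, and the root directory you pass in (a libtorch download or a site-packages/torch folder) is an assumption about your setup.

```python
from pathlib import Path


def find_torch_cmake_dirs(root):
    """Return every directory under root that contains a TorchConfig.cmake file."""
    return sorted(p.parent for p in Path(root).rglob("TorchConfig.cmake"))


# Example (not executed here), using a hypothetical location:
# for d in find_torch_cmake_dirs("C:/Users/USER/libtorch"):
#     print(d)
```

In this thread a directory of the form .../libtorch/share/cmake/Torch was passed as -DCMAKE_PREFIX_PATH and resolved the TorchConfig.cmake error, so a directory returned by this search is a reasonable candidate to try.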

jatkinson1000 commented 1 year ago

Thank you for your response, I appreciate it.

Here's where I'm at:

image

So, including -DCMAKE_PREFIX_PATH="/Users/USER/libtorch/share/cmake/Torch" with the normal CMakeLists.txt resolves the TorchConfig.cmake and torch-config.cmake error (I guess it's the same as adding set(CMAKE_PREFIX_PATH "/Users/USER/libtorch/share/cmake/Torch") to the CMakeLists.txt).

When I run make, the c++.exe errors come up again. Could this have something to do with my installations?

jatkinson1000 commented 1 year ago

Yes, that has the same effect as specifying the prefix in the CMakeLists.

So I'm not 100% sure, but I think that error may be associated with CUDA, which is part of the GPU infrastructure.

Do you have a GPU on your system? If not, could you try specifically downloading the CPU only version of torch and linking to that?

If downloading to a virtual environment via pip I think this is: pip3 install torch on Windows

If this is what you already did please could you try downloading libtorch direct and linking to that? Instructions here: https://pytorch.org/get-started/locally/

I think you can get it here: https://download.pytorch.org/libtorch/cpu/libtorch-win-shared-with-deps-2.1.0%2Bcpu.zip

Let me know if you get anywhere!

jatkinson1000 commented 1 year ago

After looking a bit further it seems that PyTorch isn't officially supported with MinGW at the moment. You could try building libtorch from source using MinGW, but this isn't officially supported and could take a while.

https://stackoverflow.com/questions/76153651/why-is-pytorch-failing-to-build-with-mingw

https://github.com/pytorch/pytorch/issues/24460#issuecomment-1383277152

If it's still not working do you have a different set of compilers/generators you can use? It looks like the MSVC (visual studio) compilers work.

jatkinson1000 commented 1 year ago

Thank you for getting back to me; your feedback has been a great help.

I have Libtorch downloaded locally on my device for CPU. Thank you for forwarding me the links. I came across similar material, and thought I could install ifort to test if I could use it as a compiler (instead of gfortran). Here's what I have:

image

I think the compiler is failing, but I'm not sure what I'm missing here - could it have to do with how cmake is set up on my device? Should I not be doing this on cmd? I also ran this from within Visual Studio, but no luck.

jatkinson1000 commented 1 year ago

OK, that looks like an issue with ifort.

Can you send the output of running CMake with the -v flag for verbose output added? i.e. cmake .. -v -D[rest of the cmake command...]

It would also be useful to send the contents of: CMakeFiles/CMakeError.log

Finally, can you check the version of Windows you are running (32 or 64 bit): https://support.microsoft.com/en-us/windows/which-version-of-windows-operating-system-am-i-running-628bec99-476a-2c13-5296-9dd081cdd808

jatkinson1000 commented 1 year ago

Thank you for responding, I appreciate it.

Here's what I have when I set the flag -DCMAKE_Fortran_COMPILER=ifort (and I've included --debug-trycompile):

image

jatkinson1000 commented 1 year ago

OK, it looks like CMake expects you to supply the full path to the compiler, as it seems you have not added it to your PATH, so what you were doing before was OK, apologies.

I have looked on a Windows installation and managed to build, so it's definitely possible. To do this I used the following steps/software:

Do these compilers/setup match yours?

jatkinson1000 commented 1 year ago

Okay - yes, that's the same setup I've got. I think I'll uninstall and reinstall everything, maybe that helps. Do you happen to know which versions of everything you used?

jatkinson1000 commented 1 year ago

I have added some Windows instructions to a branch here: https://github.com/Cambridge-ICCS/FTorch/tree/windows-build-instructions

This changes the process a little and adds some notes for Windows, could you look at these and let me know how you get on?

Once we have something correct we'll merge it into the main repo.

jatkinson1000 commented 1 year ago

Thanks for getting back to me; this has been a great help.

I've reinstalled the following setup:

I'm now able to build (I'm running in administrator mode). Here is the command I'm running:

cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/Users/Farah Rabhie/libtorch/share/cmake/Torch" -DCMAKE_Fortran_COMPILER="/Program Files (x86)/Intel/oneAPI/compiler/2023.2.0/windows/bin/intel64" -DCMAKE_INSTALL_PREFIX="/Users/Farah Rabhie/FTorch/Release"

I'm also able to make the library, but I've not been able to install it. I'm running cmake --build . --target install. Here are the errors:

image

What am I missing here?

jatkinson1000 commented 1 year ago

First, could you run the cmake configure command that you report is working, but then, instead of running cmake --build . --target install all in one, try:

cmake --build .
cmake --install .

And send the output.

If this fails please could you also try running cmake as you report is working, but then:

msbuild install.vcxproj
msbuild INSTALL.vcxproj

and send the output.

After this, if there are still issues, please try replacing line 68 of the CMakeLists file (install(FILES "${CMAKE_BINARY_DIR}/modules/ftorch.mod") with:

install(FILES "${CMAKE_CURRENT_BINARY_DIR}/modules/ftorch.mod"

Then try rerunning and send the output.

jatkinson1000 commented 1 year ago

Note from working on this: we should emphasise that on Windows, where file paths can contain spaces, it is important to enclose command line arguments in quotes.
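
To illustrate the quoting point, here is a small Python sketch. It is not part of FTorch, the path is a made-up example containing a space, and to_windows_command_line is a hypothetical wrapper name. It uses subprocess.list2cmdline from the standard library, which applies the Windows quoting rules to a list of arguments.

```python
import subprocess


def to_windows_command_line(args):
    """Join arguments into one command string, quoting any that contain spaces."""
    return subprocess.list2cmdline(args)


args = [
    "cmake",
    "..",
    "-DCMAKE_BUILD_TYPE=Release",
    # Hypothetical install location with a space in it.
    "-DCMAKE_PREFIX_PATH=C:/Users/Some User/libtorch",
]

print(to_windows_command_line(args))
# prints: cmake .. -DCMAKE_BUILD_TYPE=Release "-DCMAKE_PREFIX_PATH=C:/Users/Some User/libtorch"
```

Only the argument containing a space is wrapped in quotes. When launching cmake from a script, passing the list directly to subprocess.run avoids hand-written quoting altogether.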

TomMelt commented 1 year ago

Hi @jatkinson1000 @mondus, thanks for all your input. I think I have managed to successfully install it.

Here are my steps (for Windows 10 VM):

  1. You need to install the oneAPI Fortran compiler as discussed previously (you need Visual Studio 2022 to install dependencies, and it will be used later)
  2. Install anaconda and make an environment containing torch stuff (mine is called torch in this example)
  3. From windows start menu run "Intel oneAPI command prompt for Intel 64 for Visual Studio 2022"
  4. Load your conda environment (torch)
    • first add conda path to PATH using: set PATH=%PATH%;C:\Path\to\anaconda3;C:\Path\to\anaconda3\Scripts
    • activate torch (Not conda activate torch like in linux)
  5. cmake -DCMAKE_PREFIX_PATH="C:\pathto\torch" -G "NMake Makefiles" -DCMAKE_C_COMPILER=icl -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_CXX_COMPILER=icl ..
  6. cmake --build .
  7. cmake --install .

TomMelt commented 1 year ago

Just in case it's helpful: I have no idea what -G "NMake Makefiles" actually does, but without it the Intel C and CXX compiler flags are completely ignored. (Edit: "NMake will force it to use the Windows version of make rather than MSBuild (via VS)")

I also needed to activate the torch environment even though I don't think it should be necessary.

In theory you should be able to source the Intel compilers inside an Anaconda command line, but again I found this more difficult than activating conda inside the Intel cmd.

jatkinson1000 commented 1 year ago

Thanks @TomMelt That's great, thanks.

Couple of follow-ups:

TomMelt commented 1 year ago

Thanks @TomMelt That's great, thanks.

Couple of follow-ups:

* Is it possible to build for `Release` rather than `Debug`?

Yes, just change Debug to Release for build and install steps

* Have you run any of the examples at all? I'd be interested to see if they work (See [Investigate intel C compiler #54](https://github.com/Cambridge-ICCS/FTorch/issues/54) with `icl`)

Nope, not yet. Building libtorch is a different issue though. I used the pip version of torch.

jatkinson1000 commented 1 year ago

Yes, just change Debug to Release for build and install steps

Presumably also change the command line flag to -DCMAKE_BUILD_TYPE=Release? Have you tried this? When @mondus and the person who raised the issue tried with Release, CMake still put everything in Debug and then had issues installing.

Nope, not yet. Building libtorch is a different issue though. I used the pip version of torch.

No need to build libtorch - you should be able to get a binary. There are some reports that libtorch cannot be accessed from code compiled with icl or icc so interested to see if this is correct.

TomMelt commented 1 year ago

I have just completed a PowerShell and cmd build using libtorch downloaded as a zip file

image

using powershell

cmd /k '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'
cmake -G "NMake Makefiles" -DCMAKE_PREFIX_PATH="C:\Users\melt\Downloads\libtorch-win-shared-with-deps-2.1.0+cpu\libtorch" -DCMAKE_BUILD_TYPE=Release ..
cmake --build .
cmake --install .

using cmd

"C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
cmake -G "NMake Makefiles" -DCMAKE_PREFIX_PATH="C:\Users\melt\Downloads\libtorch-win-shared-with-deps-2.1.0+cpu\libtorch" -DCMAKE_BUILD_TYPE=Release ..
cmake --build .
cmake --install .

mondus commented 1 year ago

Just to summarise our recent meeting.

  1. It is not necessary to be in a conda environment. It is only required that libtorch is available (pip install of pytorch being one way of getting libtorch). Torch could also be installed via binary releases (https://pytorch.org/get-started/locally/).
  2. It is not required to open the oneAPI command shell, but it is required to set the compiler environment variables for oneAPI to ensure that NMake is available. If oneAPI is installed in the default location then this is a case of running "C:\Program Files (x86)\Intel\oneAPI\setvars"
  3. NMake is roughly a Windows version of Make which replaces MSBuild (i.e. build via Visual Studio). It is required to generate the mod files. This is perhaps a limitation of MSBuild and its interaction with oneAPI. As such the build type can be set at configure time. E.g. cmake -G "NMake Makefiles" -DCMAKE_PREFIX_PATH="c:\pathto\torch" -DCMAKE_BUILD_TYPE=Release ...
  4. Build can be invoked using cmake --build . and the build type is not required when using NMake.
  5. This has the same limitations in terms of file structure as Linux builds (see #14 )
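
The configure, build and install sequence summarised above could be wrapped in a script. What follows is a hedged Python sketch rather than an official FTorch tool: the function names nmake_build_commands and run_all are invented for illustration, and the script is assumed to be launched from a shell in which the oneAPI setvars script has already been run, so that cmake can find NMake and the Intel compilers.

```python
import subprocess


def nmake_build_commands(torch_prefix, build_type="Release"):
    """Return the configure, build and install commands as argument lists."""
    configure = [
        "cmake",
        "-G", "NMake Makefiles",
        f"-DCMAKE_PREFIX_PATH={torch_prefix}",
        # With NMake the build type is chosen here, at configure time.
        f"-DCMAKE_BUILD_TYPE={build_type}",
        "..",
    ]
    build = ["cmake", "--build", "."]
    install = ["cmake", "--install", "."]
    return [configure, build, install]


def run_all(commands, cwd):
    """Run each command in the build directory, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)


# Example (not executed here), with hypothetical paths:
# run_all(nmake_build_commands("C:/pathto/torch"), cwd="FTorch/src/build")
```

Passing each command as a list means paths containing spaces do not need manual quoting, and check=True stops the sequence as soon as one step fails.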