Closed regro-cf-autotick-bot closed 1 month ago
Hi! This is the friendly automated conda-forge-linting service.
I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.
PS: the release notes indicate that numpy 2.0 is supported. Do you want to pull in the migration?
The file seems there, but maybe just not found:
$ find -name host_config.h
./_h_env/targets/x86_64-linux/include/host_config.h
./_build_env/targets/x86_64-linux/include/host_config.h
./_build_env/targets/x86_64-linux/include/crt/host_config.h
Only the Cuda builds are failing now, and I'm a little puzzled by this. The Cuda 11 builds fail with a very similar but not identical error.
cc: @jakirkham any ideas?
Just bumping this for visibility. I could imagine that it is a pretty straightforward fix, but I don't have the hardware to debug this easily. Any help is much appreciated!
unfortunately, even looking at the cuda 12 i can't figure it out on linux.
I think jakirkham was AFK last week, so maybe we can ping him again next?
Just going through some findings;
The diff against v1.19.2 shows that we might still need to disable installing requirements on Windows.
if args.enable_pybind and is_windows():
- install_python_deps(args.numpy_version)
+ run_subprocess(
+ [sys.executable, "-m", "pip", "install", "-r", "requirements/pybind/requirements.txt"],
+ cwd=SCRIPT_DIR,
+ )
oh seems like you fixed pip for windows great!
@cbourjau just to confirm, you don't need a CUDA enabled machine on linux, just linux+docker.
I give up for today, it just doesn't seem like their build scripts changed all that much, and the previous build is found fine....
I'm afraid I won't have much time to spend on this in the next couple of weeks. Do you think it may be reasonable to temporarily disable the Cuda builds and to release the CPU-only 1.19.2 packages, @hmaarrfk ?
> Do you think it may be reasonable to temporarily disable the Cuda builds
Is this really what you want? Generally speaking CUDA is a great enabling technology for ML.
Do you want to field the slew of questions that will come from CUDA users updating to 19.2 with CPU only support?
I would love it if instead you:
store_build_artifacts: true
https://conda-forge.org/docs/maintainer/conda_forge_yml/#azure
Generally this is what I would suggest for others that have "incomplete" packages.
However, I do understand that CUDA is a lot of work, but the performance loss is so great that it is almost worse to have an onnx package without CUDA support......
I am however unable to help for the last few weeks so I can just as easily limit our version of onnx to 18..... (on my own private channel)
> I think jakirkham was AFK last week, so maybe we can ping him again next?
Yeah was on PTO for a bit
Noticed that upstream made both a 1.18.2 and a 1.19.2 release around the same time
Given this is on 1.18.1, would it be worth trying 1.18.2? This might be a smaller step with fewer changes. Also it might give us the opportunity to fix a few issues before jumping to the 1.19.x series
Part of the sad sad thing that made me sad is that even rerendering failed on windows: https://github.com/conda-forge/onnxruntime-feedstock/pull/131
It's looking for nvcc in the host environment instead of the build environment
Please see this CI job with snippet below:
CMake Error at CMakeLists.txt:735 (enable_language):
The CMAKE_CUDA_COMPILER:
C:/bld/onnxruntime_1725973208545/_h_env/Library/bin/nvcc
is not a full path to an existing compiler tool.
Compare this to where it finds cmake
-- CMake command : C:/bld/onnxruntime_1725973208545/_build_env/Library/bin/cmake.exe
Also commented in the other PR: https://github.com/conda-forge/onnxruntime-feedstock/pull/131#issuecomment-2390294030
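To make the host-versus-build mismatch above easier to check locally, here is a minimal sketch that reports which conda-build prefix actually contains nvcc. It assumes the BUILD_PREFIX and PREFIX environment variables that conda-build sets during a build; the helper name find_nvcc and the list of bin directories searched are illustrative choices, not anything taken from the feedstock or from upstream.

```python
import os
from pathlib import Path


def find_nvcc(prefixes, exe_names=("nvcc", "nvcc.exe")):
    """Return the first nvcc found under the given prefixes, or None.

    Both the Unix layout (bin/) and the Windows conda layout
    (Library/bin/) are searched. This helper is hypothetical.
    """
    for prefix in prefixes:
        for bindir in ("bin", "Library/bin"):
            for name in exe_names:
                candidate = Path(prefix) / bindir / name
                if candidate.is_file():
                    return candidate
    return None


if __name__ == "__main__":
    # Assumption: run inside a conda-build session where these are set.
    candidates = [os.environ.get("BUILD_PREFIX"), os.environ.get("PREFIX")]
    print(find_nvcc([p for p in candidates if p]))
```

If this prints a path under the build prefix while CMake reports a path under the host prefix, that would be consistent with the error message quoted above.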
Thanks, i'm trying to use git bisect to find the problematic commit here too...
Ok this seems like the offending commit: https://github.com/microsoft/onnxruntime/commit/7b3fff650a6ae2c8a3a6a4487cb76672fd21dde4
indeed reverting it gets the build scripts to pass and the compilation to start (on linux)
though maybe we should be specifying CUDA_HOME?
@jakirkham i'm not sure i know how to make the argument to upstream that they should provide this as an option, and not as a default.
This CI error looks kind of odd
Ignoring COMPILE_WARNING_AS_ERROR target property and CMAKE_COMPILE_WARNING_AS_ERROR variable.
CMake Deprecation Warning at CMakeLists.txt:14 (cmake_policy):
The OLD behavior for policy CMP0104 will be removed from a future version
of CMake.
The cmake-policies(7) manual explains that the OLD behaviors of all
policies are deprecated and that a policy should be set to OLD only under
specific short-term circumstances. Projects should be ported to the NEW
behavior and not rely on setting a policy to OLD.
CMake Error at C:/bld/onnxruntime_1727920444921/_build_env/Library/share/cmake-3.30/Modules/CMakeDetermineCCompiler.cmake:49 (message):
Could not find compiler set in environment variable CC:
cl.exe.
Call Stack (most recent call first):
CMakeLists.txt:20 (project)
CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_ASM_COMPILER not set, after EnableLanguage
Does it need the full path to cl?
Maybe we could use where cl to fill out CC and CXX
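The idea of resolving cl to a full path could be sketched like this in Python, using shutil.which as a stand-in for where. This is only an illustration: the function name resolve_compiler_env is made up, and whether CMake would then accept the resulting CC and CXX values in this build is not something the thread confirms.

```python
import os
import shutil


def resolve_compiler_env(env, names=("CC", "CXX")):
    """Return a copy of env with bare compiler names expanded to full paths.

    A value is only replaced when it is not already absolute and an
    executable of that name is found on the PATH given in env.
    """
    resolved = dict(env)
    for name in names:
        value = resolved.get(name)
        if value and not os.path.isabs(value):
            full = shutil.which(value, path=resolved.get("PATH"))
            if full:
                resolved[name] = full
    return resolved


if __name__ == "__main__":
    env = resolve_compiler_env(os.environ)
    print(env.get("CC"), env.get("CXX"))
```

Values that cannot be found are left untouched, so a missing compiler still surfaces as the original CMake error rather than being hidden.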
@conda-forge-admin please rerendre
@conda-forge-admin please rerender
Sorry to ask @xhochy but I don't have the machine to debug this:
Asking you since you seem interested in windows support too. https://github.com/conda-forge/onnxruntime-feedstock/pull/97
I am VERY rusty on windows skills but:
C:\Program Files\Microsoft Visual Studio\2022\Community>if "14.41-14.40" == "14.40-14.41" (CALL "VC\Auxiliary\Build\vcvars64.bat" -vcvars_ver=14.41 10.0.26100.0 ) else (CALL "VC\Auxiliary\Build\vcvars64.bat" -vcvars_ver=14.40 10.0.26100.0 )
**********************************************************************
** Visual Studio 2022 Developer Command Prompt v17.11.4
** Copyright (c) 2022 Microsoft Corporation
**********************************************************************
[ERROR:vcvars.bat] Toolset directory for version '14.40' was not found.
[ERROR:VsDevCmd.bat] *** VsDevCmd.bat encountered errors. Environment may be incomplete and/or incorrect. ***
[ERROR:VsDevCmd.bat] In an uninitialized command prompt, please 'set VSCMD_DEBUG=[value]' and then re-run
[ERROR:VsDevCmd.bat] vsdevcmd.bat [args] for additional details.
[ERROR:VsDevCmd.bat] Where [value] is:
[ERROR:VsDevCmd.bat] 1 : basic debug logging
[ERROR:VsDevCmd.bat] 2 : detailed debug logging
[ERROR:VsDevCmd.bat] 3 : trace level logging. Redirection of output to a file when using this level is recommended.
[ERROR:VsDevCmd.bat] Example: set VSCMD_DEBUG=3
[ERROR:VsDevCmd.bat] vsdevcmd.bat > vsdevcmd.trace.txt 2>&1
conda_build.bat
call conda_build.bat
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.41.34120\bin\Hostx64\x64
I installed more things from window's mega downloader and will restart my computer and try again.
@conda-forge-admin please rerender
well i have hope!
ok clearly my hope was misguided. @jakirkham any clue? even if just for CUDA 12.
So in the newer 19.41 compiler version, there is an #if check:
#ifndef _ALLOW_COMPILER_AND_STL_VERSION_MISMATCH
#if defined(__CUDACC__) && defined(__CUDACC_VER_MAJOR__)
#if __CUDACC_VER_MAJOR__ < 12 || (__CUDACC_VER_MAJOR__ == 12 && __CUDACC_VER_MINOR__ < 4)
_EMIT_STL_ERROR(STL1002, "Unexpected compiler version, expected CUDA 12.4 or newer.");
#endif // ^^^ old CUDA ^^^
#elif defined(__EDG__)
// not attempting to detect __EDG_VERSION__ being less than expected
#elif defined(__clang__)
#if __clang_major__ < 17
_EMIT_STL_ERROR(STL1000, "Unexpected compiler version, expected Clang 17.0.0 or newer.");
#endif // ^^^ old Clang ^^^
#elif defined(_MSC_VER)
#if _MSC_VER < 1940 // Coarse-grained, not inspecting _MSC_FULL_VER
_EMIT_STL_ERROR(STL1001, "Unexpected compiler version, expected MSVC 19.40 or newer.");
#endif // ^^^ old MSVC ^^^
#else // vvv other compilers vvv
// not attempting to detect other compilers
#endif // ^^^ other compilers ^^^
#endif // !defined(_ALLOW_COMPILER_AND_STL_VERSION_MISMATCH)
I should be able to define _ALLOW_COMPILER_AND_STL_VERSION_MISMATCH but I am unable to.
Instead if I manually change:
__CUDACC_VER_MINOR__ < 4
to
__CUDACC_VER_MINOR__ < 0
it effectively disables the check and allows one to move forward.
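To see what that one-character change does, the CUDA branch of the header check can be modelled in a few lines of Python. This is a sketch of the condition quoted above only; the function name and the min_minor parameter are hypothetical and exist just to mirror the manual edit.

```python
def stl_rejects_cuda(cuda_major, cuda_minor, min_major=12, min_minor=4):
    """Mirror `major < 12 || (major == 12 && minor < 4)` from the header.

    Returns True when the header would emit the STL1002 error.
    """
    return cuda_major < min_major or (
        cuda_major == min_major and cuda_minor < min_minor
    )


# Unpatched header: CUDA 12.0 is rejected, 12.4 is accepted.
print(stl_rejects_cuda(12, 0))
print(stl_rejects_cuda(12, 4))
# Patched to `minor < 0`: every CUDA 12.x release passes the check.
print(stl_rejects_cuda(12, 0, min_minor=0))
# The `major < 12` term is untouched, so CUDA 11.x is still rejected.
print(stl_rejects_cuda(11, 8, min_minor=0))
```

Under this reading the edit only bypasses the check for CUDA 12.x; a CUDA 11 build against the same headers would presumably still hit the error unless the first term is changed too.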
ok now fingers crossed.
is there any way to squeeze a bit more RAM out on windows?
edit: added 'on windows'
> Sorry to ask @xhochy but I don't have the machine to debug this:
> Asking you since you seem interested in windows support too. #97
I work together with @cbourjau and we are only interested in CPU support. GPUs are not bringing us noticeable performance improvement for our workloads. Also Windows is just a nice-to-have 🤷 Thus, I'm not so motivated to spend time on fixing this.
> is there any way to squeeze a bit more RAM out on windows?
There are some things that could be cleaned from the images: https://github.com/conda-forge/conda-smithy/pull/1966
That removal process is a bit slow though. If you know ways to make that faster, please let me know
Otherwise we could just borrow from those and include them in the build here. For example this is what we did in matplotlib to address a similar issue
> There are some things that could be cleaned from the images: https://github.com/conda-forge/conda-smithy/pull/1966
i think i got an out of memory error, so not disk related. would those cleanups help free up some RAM?
> I work together with @cbourjau
hehe, good to know!
> and we are only interested in CPU support.
Interesting...
> Also Windows is just a nice-to-have 🤷 Thus, I'm not so motivated to spend time on fixing this.
haha, your introduction of ONNX+windows was a motivator for me to start to bring my application to windows :)
Let's see if we can squeeze out a bit more performance before reducing the number of architectures.
> i think i got an out of memory error, so not disk related. would those cleanups help free up some RAM?
My understanding of how CI is set up is that they all count against the same limit
> My understanding of how CI is set up is that they all count against the same limit
ok lets try.
Sounds good. Would grab a coffee :)
It took about 1 hour for the CUDA builds on an 11th gen 9 Intel processor, and about 16GB of extra RAM. My Windows RAM usage went up to 24GB + "16cache".
The largest an NVCC/ICCI process took was, visually, 2-3 GB of RAM, so I feel like it should be able to handle it on a CI, but maybe they are wimpier than I would think.
may i ask what the "novec" and "vec" options mean? perhaps since there is little interest in the windows builds we can avoid 1/2 of them?
I would like to try to maintain windows build at least for "modern" hardware
> may i ask what the "novec" and "vec" options mean?
I added the novec variant in https://github.com/conda-forge/onnxruntime-feedstock/pull/36. eigen has its own way to vectorize floating point operations which results in (slightly) inconsistent predictions depending on whether we pass 1 row or multiple rows into a model. The novec build compiles eigen with EIGEN_DONT_VECTORIZE. I think we don't need CUDA builds for the novec variant (I'm assuming the flag doesn't do anything for CUDA kernels anyways). Also, feel free to drop the novec Windows builds (until someone asks for them).
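As a generic illustration of why a vectorized code path can give slightly different numbers than a scalar one, the sketch below sums the same values in two different orders. It is not Eigen's actual kernel and makes no claim about where onnxruntime diverges; it only shows that floating point addition is not associative, so any change in accumulation order (such as a SIMD-style reduction) can change the result. The input values are deliberately extreme so the effect is visible.

```python
def sum_sequential(values):
    """Add the values one after another, left to right."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_two_lanes(values):
    """Accumulate even and odd positions separately, then combine.

    This mimics the shape of a 2-wide SIMD reduction.
    """
    lanes = [0.0, 0.0]
    for i, v in enumerate(values):
        lanes[i % 2] += v
    return lanes[0] + lanes[1]


values = [1e16, 1.0, -1e16, 1.0]
print(sum_sequential(values))  # 1.0: the first 1.0 is lost when added to 1e16
print(sum_two_lanes(values))   # 2.0: the large terms cancel in their own lane
```

With realistic model weights the differences are far smaller, which would match the "(slightly) inconsistent" wording above.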
> I work together with @cbourjau
> hehe, good to know!
As we're already at it, full disclosure: All maintainers except you on this feedstock are employed by QuantCo and use onnxruntime for the same use case. My windows support was mainly to enable other users besides QuantCo and I'm also happy to help a bit more than needed, but Windows+CUDA is a combination where I lack any expertise.
I am probably the one that added CUDA support on Windows and am interested in it. I sometimes miss PRs related to Windows+CUDA in this repo, my bad, but I am interested in making sure that they work, and happy to help (happy to also be listed as a maintainer, so that I would be added as a reviewer to PRs, which reduces the chance that I miss some PRs).
I can confirm that I am not interested in novec builds on Windows.
Hi! This is the friendly automated conda-forge-linting service.
I wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found some lint.
Here's what I've got...
For recipe/meta.yaml:
@traversaro you'll be added on the next rerender. I'm trying to cut the cuda architectures, to get it to work on the CIs..... but..... we are now at 80,86,90 which is likely too few even for my taste.
Are you able to build out the windows matrix?
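For reference, a trimmed architecture list like the 80, 86, 90 mentioned above is typically handed to CMake as a semicolon-separated CMAKE_CUDA_ARCHITECTURES value. The sketch below only formats that define; how this feedstock actually forwards it to the onnxruntime build is not shown in this thread, so the helper name and its use are assumptions.

```python
def cuda_arch_define(archs):
    """Format compute capabilities as a CMAKE_CUDA_ARCHITECTURES define.

    Duplicates are dropped and the result is sorted so the value is
    stable between rerenders. This helper is hypothetical.
    """
    unique = sorted({int(a) for a in archs})
    return "CMAKE_CUDA_ARCHITECTURES=" + ";".join(str(a) for a in unique)


print(cuda_arch_define([80, 86, 90]))
```

Each architecture in the list adds device code generation to the build, which is why trimming the list is a plausible way to cut build time and memory on CI.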
@conda-forge-admin please rerender
> Are you able to build out the windows matrix?
I can look into that, but it will probably take me a few days. If this is blocking, probably we can skip Windows for the time being? By skipping the whole Windows build (instead of just CUDA) we avoid the problem of people updating and ending up with CPU-only onnxruntime.
Closes #131 Closes #133
It is very likely that the current package version for this feedstock is out of date.
Checklist before merging this PR:
license_file is packaged
Information about this PR:
please add bot automerge in the title and merge the resulting PR. This command will add our bot automerge feature to your feedstock.
bot-rerun label to this PR. The bot will close this PR and schedule another one. If you do not have permissions to add this label, you can use the phrase @conda-forge-admin, please rerun bot in a PR comment to have the conda-forge-admin add it for you.
Pending Dependency Version Updates
Here is a list of all the pending dependency version updates for this repo. Please double check all dependencies before merging.
This PR was created by the regro-cf-autotick-bot. The regro-cf-autotick-bot is a service to automatically track the dependency graph, migrate packages, and propose package version updates for conda-forge. Feel free to drop us a line if there are any issues! This PR was generated by - please use this URL for debugging.