Closed bgoglin closed 2 months ago
@bgoglin zesInitbased implementation is not fully complete and we are working on it and please let me know your test / reproduction step where you are seeing unsupported , will investigate and update you
I installed packages from the latest compute-runtime release (https://github.com/intel/compute-runtime/releases/tag/23.13.26032.30) on a Debian bookworm running Linux kernel 6.1. This is a laptop with an old Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz. Test program is just does zesInit(0) and prints the return value. I get 0x78000003 = ZE_RESULT_ERROR_UNSUPPORTED_FEATURE
thanks @bgoglin we will look into it and update you the fix
I just discovered that zesInit() returns 0 instead of 0x78000003 if ZES_ENABLE_SYSMAN=1 is set in the environment. If took me quite a bit of time to understand why things weren't failing anymore, so I am writing it here in case it helps you reproduce the issue.
Looks like zesInit() works now? Using these packages: intel-igc-core_1.0.14828.8_amd64.deb intel-igc-opencl_1.0.14828.8_amd64.deb intel-level-zero-gpu_1.3.26918.9_amd64.deb intel-opencl-icd_23.30.26918.9_amd64.deb libigdgmm12_22.3.0_amd64.deb It didn't with those: intel-igc-core_1.0.14062.11_amd64.deb intel-igc-opencl_1.0.14062.11_amd64.deb intel-level-zero-gpu_1.3.26516.18_amd64.deb intel-opencl-icd_23.22.26516.18_amd64.deb libigdgmm12_22.3.0_amd64.deb If you know which package and version first brought the fix, I'd like to know it so that we tell people avoid earlier releases.
From git log, I'd say it's this commit:
commit 73b40a49dcd1b320309cdefd24a2a2b23149ee01
Author: Kulkarni, Ashwin Kumar <ashwin.kumar.kulkarni@intel.com>
Date: Wed Jun 7 10:10:13 2023 +0000
fix(sysman): Enables zesInit flow without setting ZES_ENABLE_SYSMAN
Related-To: LOCI-4458
which first appeared in tag 23.26.26690.12.
@bgoglin can we close this issue now? if zesInit is working for you ?
The question was rather: Could you explictly list which releases have a unsupported zesInit() ? We are going to tell users to avoid these releases. Either they use newer releases with zesInit(). Or they use older releases without zesInit() at all and we'll set ZES_ENABLE_SYSMAN=1 in the library constructor.
please start using new releases with zesInit ()
The question was rather: Could you explictly list which releases have a unsupported zesInit() ? We are going to tell users to avoid these releases. Either they use newer releases with zesInit(). Or they use older releases without zesInit() at all and we'll set ZES_ENABLE_SYSMAN=1 in the library constructor.
please start using new releases with zesInit ()
@saik-intel compute-runtime
project releases multiple versions in parallel, and customers may get old version of it from distros. You did not answer the question or the subject of this ticket; from which releases onward there's (full) zesInit()
support?
zesInit()
stub was introduced on Feb 2023 in commit bf481e61034.
From git log, I'd say it's this commit:
commit 73b40a49dcd1b320309cdefd24a2a2b23149ee01 Author: Kulkarni, Ashwin Kumar <ashwin.kumar.kulkarni@intel.com> Date: Wed Jun 7 10:10:13 2023 +0000 fix(sysman): Enables zesInit flow without setting ZES_ENABLE_SYSMAN
which first appeared in tag 23.26.26690.12.
When looking at last tag series preceding the 23.26.x one:
$ for tag in $(git tag); do echo "$tag: $(git show --format=fuller $tag|grep ^CommitDate)"; done | grep 2023
...
23.22.26516.33: CommitDate: Tue Dec 12 18:45:47 2023 +0100
23.22.26516.34: CommitDate: Tue Dec 19 11:15:15 2023 +0100
23.22.26516.8: CommitDate: Wed Jun 14 11:01:51 2023 +0200
23.26.26690.12: CommitDate: Tue Jul 25 14:34:41 2023 +0200
23.26.26690.13: CommitDate: Tue Jul 25 16:26:36 2023 +0200
...
23.22.26516.34
does not contain the above indicated commit, only 23.26.x series and ones after it do.
=> anything with tag number smaller than 23.26.26690.12
needs to be avoided for zesInit()
to work as expected.
@bgoglin Does above answer your question i.e. can this be closed?
"anything with tag number smaller than 23.26.26690.12", do you also mean released in 2023 ? (equivalent to starting with "23." ?)
Looking at the Git content... While L0 1.5 i.e. zesdInit()
support is already in earlier versions, last tag in a preceding 23.22.x series (from Dec 2023) did not include 73b40a49dcd1b320309cdefd24a2a2b23149ee01 commit, and last tags in series before that, are all before that commit was made, and check the env variable.
@bgoglin can we close this issue now ?
@saik-intel While relevant backend version is now known, i.e. this can be closed, it still requires manual setup outside the L0 application itself.
Proper support would require also L0 frontend (init return values) to be fixed, so that clients can differentiate between total backend failure (due to missing HW or wrong kernel driver), and missing backend functionality (due to too old backend version). See: https://github.com/oneapi-src/level-zero/issues/140
Agreed, this issue can be closed but user code is still hard to maintain.
zesInit() might be the right way to finally fix oneapi-src/level-zero#36 but current releases provide zesInit() without actually implementing it (it returns "unsupported feature" in 26032.30).
I know we are supposed to handle API errors, but this means we should do a runtime check at configure time before enabling zesInit() in our code. We can't do a runtime-check at runtime because it would be too late to fallback to setting ZES_ENABLE_SYSMAN=1 (has to be done very early in the process, ideally inside a library constructor).
When are you planning to publish a compute-runtime release with proper support for zesInit()? It seems to in the git tree already.
Could you explictly list which releases have a unsupported zesInit() ? We are going to tell users to avoid these releases. Either they use newer releases with zesInit(). Or they use older releases without zesInit() at all and we'll set ZES_ENABLE_SYSMAN=1 in the library constructor.