Clarification to zeInit() description

HoppeMateusz commented 1 year ago

The current description allows calling zeInit() multiple times with different environment variables (https://spec.oneapi.io/level-zero/latest/core/api.html#zeinit):

The application may call this function multiple times with different flags or environment variables enabled.

The spec should state that calling zeInit() with the same flags but different environment variables has no effect on the driver, because the driver is initialized only once:

Only one instance of each driver will be initialized per process.

It should also state that the spec-defined environment variables (https://spec.oneapi.io/level-zero/latest/core/PROG.html#environment-variables) are honored only at the first initialization.
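To make the requested wording concrete, here is a minimal sketch of init-once behavior. It does not use the Level Zero API: `mock_init` and the `MOCK_ENABLE_TRACING` variable are hypothetical names invented for illustration, and the real driver's internals may differ. The sketch is also not thread-safe.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for driver state; not part of Level Zero. */
static int g_initialized = 0;
static int g_tracing_enabled = 0;

/* Mock init-once entry point: the environment is read only on the first call. */
static int mock_init(void) {
    if (!g_initialized) {
        const char *v = getenv("MOCK_ENABLE_TRACING"); /* hypothetical variable */
        g_tracing_enabled = (v != NULL && strcmp(v, "1") == 0);
        g_initialized = 1;
    }
    return g_tracing_enabled;
}

/* Calls mock_init twice, changing the environment in between.
   Returns the tracing state observed by the second call. */
static int demo_env_ignored_after_first_init(void) {
    setenv("MOCK_ENABLE_TRACING", "0", 1);
    int first = mock_init();   /* reads "0": tracing stays off */
    assert(first == 0);

    setenv("MOCK_ENABLE_TRACING", "1", 1);
    int second = mock_init();  /* environment changed, state already fixed */
    return second;
}
```

In this sketch the second call still reports tracing as disabled, which is the behavior the proposed spec text would document.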

jandres742 commented 1 year ago

Thanks @HoppeMateusz. Customers have asked us to change that behavior so that multiple calls to zeInit actually work.

Of course, for the L0 GPU driver that would imply a complex refactoring of code. But putting implementation details aside, I think the first question to answer here is:

What is the best behavior for customers?

That only the first zeInit is valid, and environment variables are read only then?
Or should the spec be relaxed to allow multiple calls to zeInit, possibly with changed environment variable values?

MichalMrozek commented 1 year ago

It is not even possible to refactor the code; you would need to change the whole specification. If you allow multiple zeInit calls with different variables, you can have a scenario where, within one process, one library calls zeInit to use the GPU, then another library calls zeInit to use the VPU only, and this second call would invalidate all in-flight submissions made under the first zeInit call. You would need to update all entry points to reflect that.

The reason why you have single initialization is to have single point in time where you set up the driver and all associated classes. If you allow this step to happen multiple times, you create a gigantic overhead, because you need to introduce many checks on things that used to be immutable to see whether they have changed. This would sacrifice a lot of L0 efficiency and lead to a horribly complex driver implementation that would not be maintainable in the long run.

The only way to really move forward and keep efficiency is to update the spec to say that only the first initialization is valid and subsequent ones change nothing.

jandres742 commented 1 year ago

Thanks @MichalMrozek. The problem here is this:

The reason why you have single initialization is to have single point in time where you set up the driver and all associated classes.

In a multi-library application, there is no single point in time at which to call zeInit. Imagine an HPC application with the following libraries:

SYCL
MKL
Communication Libraries or MPI for internode communication
Library for intranode communication, like libfabric
Library for profiling

Each of these may call zeInit(), each with different requirements. For instance, the profiling tool may need tools and tracing, but if the zeInit from SYCL comes first, tools and tracing may not be available. Or the communication libraries or MPI may use multiple ranks (processes), some using the CPU and others the GPU, each initializing L0 differently. So the single point of entry actually becomes a race that depends on which library loads first.

As you say, fully supporting that mode would add enormous overhead, so maybe something in the middle could be provided. Maybe zeInit can allow incremental initialization (e.g., if zeInit has initialized a GPU, it can later initialize a CPU, but not remove the GPU), or maybe we can find other alternatives.
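The ordering problem can be sketched in a few lines. This is a mock, not Level Zero code: the flag values and the functions `mock_init_flags`, `runtime_library_load`, and `profiler_library_load` are hypothetical and only model a "first caller wins" initialization.

```c
#include <assert.h>

/* Hypothetical flag bits for illustration; not the Level Zero flag values. */
enum { MOCK_FLAG_GPU = 1, MOCK_FLAG_TOOLS = 2 };

static int g_ready = 0;
static unsigned g_active_flags = 0;

/* Mock init: only the first caller's flags are recorded. */
static unsigned mock_init_flags(unsigned flags) {
    if (!g_ready) {
        g_active_flags = flags;
        g_ready = 1;
    }
    return g_active_flags;
}

/* Two independent libraries in the same process, each with its own needs. */
static unsigned runtime_library_load(void) {
    return mock_init_flags(MOCK_FLAG_GPU);
}

static unsigned profiler_library_load(void) {
    return mock_init_flags(MOCK_FLAG_GPU | MOCK_FLAG_TOOLS);
}

/* Loads the runtime first, then the profiler.
   Returns nonzero if the profiler ended up with tools support. */
static int demo_profiler_gets_tools(void) {
    unsigned seen_by_runtime = runtime_library_load();
    assert(seen_by_runtime == MOCK_FLAG_GPU);

    unsigned seen_by_profiler = profiler_library_load();
    return (seen_by_profiler & MOCK_FLAG_TOOLS) != 0;
}
```

With this load order the profiler's request for tools is silently dropped; swapping the two calls would give the opposite outcome, which is the load-order dependence described above.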

MichalMrozek commented 1 year ago

That's why zeInit shouldn't have any parameters and always expose all devices.

Incremental initialization is the same problem. It has enormous overhead because you can no longer assume that the portions of the driver that are already initialized will stay unchanged; once you have to assume they may grow at any point in time, that is where the additional overhead comes from.

If you need to add some capabilities in the middle like tracing, this should be via new APIs, not via zeInit which is already heavily overloaded.

jandres742 commented 1 year ago

Thanks @MichalMrozek. I agree with this:

this should be via new APIs, not via zeInit which is already heavily overloaded.

I think that instead of relying on environment variables and flags passed to zeInit, we can have explicit APIs, so that each component initializes what it needs. zeInit would handle only general initialization, and everything else would be handled by additional APIs.
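As a rough illustration of that direction, the sketch below separates a parameterless general init from an explicit opt-in call. None of these functions exist in Level Zero; `mock_core_init` and `mock_tracing_enable` are hypothetical names, and an actual design would need to be worked out in the spec.

```c
#include <assert.h>

static int g_core_ready = 0;
static int g_tracing_on = 0;

/* Hypothetical general init: no flags, no environment, safe to repeat. */
static void mock_core_init(void) {
    g_core_ready = 1;
}

/* Hypothetical explicit opt-in API.
   Returns 0 on success, -1 if the core has not been initialized. */
static int mock_tracing_enable(void) {
    if (!g_core_ready) {
        return -1;
    }
    g_tracing_on = 1;
    return 0;
}

/* A runtime library and a profiler both initialize; only the profiler
   opts into tracing. Returns the final tracing state. */
static int demo_explicit_opt_in(void) {
    mock_core_init();                /* runtime library */
    mock_core_init();                /* profiler: harmless repeat */
    int rc = mock_tracing_enable();  /* profiler asks for what it needs */
    assert(rc == 0);
    return g_tracing_on;
}
```

Because the opt-in is a separate call, the result no longer depends on which library happened to call the general init first.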

oneapi-src / level-zero-spec

Clarification to zeInit() description #232