Policies for packages that use OpenCL

peastman commented 5 years ago

This issue is a followup to https://github.com/conda-forge/openmm-feedstock/pull/7#. We are trying to move OpenMM to be distributed through conda-forge, and some policy questions came up that need to be resolved. In particular, we have been told conda-forge has an established policy that any package that uses OpenCL is required to list ocl-icd as a dependency. We do not want to list it as a dependency: given our particular needs, there doesn't exist any situation where it would benefit our users, and there do exist situations where it would cause problems.

Let me give some background information, since it will help to clarify the issues we're dealing with. OpenMM is a widely used package for molecular dynamics simulation. It can use OpenCL, but it doesn't require it. It comes with multiple computational backends: CUDA, OpenCL, and CPU. It automatically selects which one to use at runtime based on the available hardware and software (whether CUDA and/or OpenCL is installed). The CUDA and OpenCL backends exist mainly for users with high end GPUs, which can be much faster than CPUs. For users who don't have one of those, it's generally better to use the CPU backend.
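To make the runtime selection concrete, here is a minimal sketch of the idea. It is a hypothetical illustration, not OpenMM's actual logic: the function name `select_backend` and the fixed preference order are assumptions made for this example.

```python
# Hypothetical sketch of runtime backend selection; not OpenMM's real code.
# Assumption: GPU backends are preferred over the CPU backend when present.
PREFERENCE = ("CUDA", "OpenCL", "CPU")


def select_backend(available):
    """Return the most preferred backend that appears in `available`."""
    usable = set(available)
    for name in PREFERENCE:
        if name in usable:
            return name
    raise RuntimeError("no usable computational backend found")
```

A machine with only the CPU backend would get `"CPU"`, while one that also has a working OpenCL installation would get `"OpenCL"`.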

Accurate results are critical. This is scientific software, and errors could lead to users having to retract papers and losing months or years of work. We only support a small number of OpenCL implementations. On Linux, we support the ones from AMD and NVIDIA. Neither of those is conda installable. They are included with the video drivers. On macOS, we support Apple's implementation, which is built into the OS. These are the only ones we test. Using any other implementation would have a high risk of producing incorrect results. Our installation instructions tell users how to install appropriate OpenCL implementations. We are very concerned about anything that might lead to a user inadvertently using a different implementation they didn't even know was there.

OpenMM has APIs for C++, C, Fortran, and Python. So although we use conda as a distribution mechanism, it is part of a much larger ecosystem than just conda-forge, conda, or even Python. It needs to work correctly with all other libraries on the OS, regardless of how they are installed or what language they are used from.

So that's the background. Now let me describe our particular concerns.

Given our needs, there simply is no benefit to installing ocl-icd. None of the OpenCL implementations we support is conda installable.
Installing it creates new opportunities for the user to get incorrect results by silently using a different OpenCL implementation than the supported one we told them to install.
ocl-icd loads ICD files from a nonstandard location. Whereas all standard ICD loaders on Linux look for them in /etc/OpenCL/vendors, ocl-icd looks for them in a location inside the conda installation. If users want to use any OpenCL that was installed system-wide (such as the ones from the GPU vendors), they have to add a symlink pointing to the standard location. So including ocl-icd makes life harder for users. (A workaround is to also install ocl-icd-system, which creates the symlink for them, but it appears few packages currently do this.)
On macOS there is an additional, very serious problem: ocl-icd is written in a way which is incompatible with Apple's OpenCL framework. If two packages are loaded at the same time, one that links to ocl-icd and the other that links directly to Apple's framework (as nearly all OpenCL-based codes do on macOS), this will result in a crash. Linking to ocl-icd would therefore make OpenMM incompatible with nearly all other OpenCL based software that comes from any source other than conda-forge.

Given these issues, I suggest the existing policy should be reconsidered. Packages should not be required to depend on ocl-icd unless they have a specific reason to depend on it. Even better, I would suggest it should be a dependency for implementations of OpenCL, not for packages that use OpenCL. This is how the ICD mechanism was designed to work in the first place. ICD loaders are normally included with the implementation. They are never included with individual applications that just happen to use OpenCL. Tying the loader to the application rather than the implementation is contrary to the design of the mechanism.

This would make life easier for users of all OpenCL based packages. If they haven't specifically conda installed an OpenCL implementation, ocl-icd would not block the standard implementation that came with their GPU drivers from being found.

I also suggest that any use at all of ocl-icd on macOS should be carefully reconsidered. Until it can be made to coexist properly with the standard framework built into the OS, it just isn't appropriate for any package that is intended to be widely distributed.

cc: @jchodera @jaimergp

jchodera commented 5 years ago

A bit more info on OpenMM, for the curious:

OpenMM website: http://openmm.org/
GitHub repo: https://github.com/openmm/openmm

CJ-Wright commented 5 years ago

Thank you for putting this in. @conda-forge/core thoughts?

I'm not certain how to turn this into generic policy yet.

scopatz commented 5 years ago

If you are worried about reproducibility it really seems like the best option is to have conda manage the ocl-icd and any OpenCL packages all the way down the stack. Then you can precisely pin against those particular builds that you know you want to use.

peastman commented 5 years ago

Since the only OpenCL implementations we support are ones that aren't conda installable, that isn't really relevant to us. Besides, pinning GPU drivers generally isn't an option.

scopatz commented 5 years ago

Why aren't they conda installable? ocl-icd could be, it looks like. And other ones could probably have CDTs for them.

isuruf commented 5 years ago

@scopatz, ocl-icd is an ICD loader for OpenCL implementations. These implementations are indicated with a text file <name>.icd in /etc/OpenCL/vendors usually. Some implementations allow linking to the OpenCL implementation directly which means your application can't use other implementations. Some ICD loaders provide a way to override this vendors directory. ocl-icd packaged in conda looks in $PREFIX/etc/OpenCL/vendors and installing ocl-icd-system will look in both $PREFIX/etc/OpenCL/vendors and /etc/OpenCL/vendors.
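The lookup described here can be sketched as follows. This only illustrates the directory logic as stated in this comment; it is not code from ocl-icd, and the helper names `vendor_dirs` and `list_icd_files` are made up for the example.

```python
from pathlib import Path


def vendor_dirs(prefix, with_system=False):
    """Directories searched for .icd files.

    Assumption (taken from the description above): the conda ocl-icd looks
    in <prefix>/etc/OpenCL/vendors, and installing ocl-icd-system adds
    /etc/OpenCL/vendors to the search.
    """
    dirs = [Path(prefix) / "etc" / "OpenCL" / "vendors"]
    if with_system:
        dirs.append(Path("/etc/OpenCL/vendors"))
    return dirs


def list_icd_files(dirs):
    """Return every <name>.icd file found in the given directories."""
    found = []
    for d in dirs:
        if d.is_dir():  # a missing directory simply contributes nothing
            found.extend(sorted(d.glob("*.icd")))
    return found
```

Inside an activated conda environment, passing the value of `CONDA_PREFIX` as `prefix` would show which implementations the conda loader can see, with and without the system directory.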

Some OpenCL implementations like Apple and NVIDIA are proprietary and we can't package them.

What @peastman is suggesting is to not use ICD loader and allow whatever ICD loader is in the system on Linux and to use only Apple implementation in OSX.

I'm strongly against this idea. Other OpenCL applications in conda-forge use the ICD loader from conda-forge and I'd like to keep the consistency.

You can't block other OpenCL implementations if you use ocl-icd-system because a user might install another implementation in /etc/OpenCL/vendors.

You can however block other conda OpenCL implementations by using the run_constrained trick (but I don't see the point, as this won't block implementations installed in /etc/OpenCL/vendors that OpenMM doesn't support).

peastman commented 5 years ago

@scopatz let me give a little background on the ICD mechanism. I think that might clarify things.

When Khronos was developing OpenCL, they knew there would be multiple implementations. For example, they figured each GPU manufacturer would create its own implementation. They also figured you might want multiple implementations installed at once, for example one for GPU and one for CPU. So they split the interface from the implementation. libOpenCL is the library that applications link to. It contains the interface but no implementation. When it gets loaded, it looks in a standard location (/etc/OpenCL/vendors on Linux, C:\System\OpenCL\vendors on Windows) for "Installable Client Driver" (ICD) files. Those point to the libraries containing particular implementations.

The intended behavior is that when an OpenCL implementation is installed, the installer will 1) put its libraries in some vendor-specific place, 2) put an ICD file pointing to them in the standard location, and 3) install a copy of libOpenCL in a standard location for system libraries, such that it will always be found by applications.
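As a rough illustration of what the loader does with those ICD files, the sketch below reads each one and reports the library it names. It assumes the common Linux convention that an ICD file is a small text file containing the vendor library's name or path on one line; the function names are hypothetical.

```python
from pathlib import Path


def read_icd_library(icd_path):
    """Return the library name or path that an .icd file points to.

    Assumption: the file holds a single non-empty line naming the vendor
    library, as is conventional on Linux.
    """
    for line in Path(icd_path).read_text().splitlines():
        line = line.strip()
        if line:
            return line
    return None  # empty file: nothing to load


def enumerate_implementations(vendors_dir):
    """Map each .icd file name in `vendors_dir` to the library it names."""
    return {
        icd.name: read_icd_library(icd)
        for icd in sorted(Path(vendors_dir).glob("*.icd"))
    }
```

A real loader would then open each named library and expose it to the application as a separate OpenCL platform.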

That's how it works on Linux and Windows, but Apple chose a different approach. They keep tight control over their platform. For example, they don't let GPU vendors install their own drivers. Everything has to go through Apple. And they made OpenCL work the same way. They have their own implementation (OpenCL.framework) which is included as part of the OS. It supports all GPU models that are found in Macs, but doesn't provide any mechanism for users to install other implementations.

(Note: I'm not commenting on whether this is a good system. Obviously there are objections that can be raised to Apple's walled garden, "we control everything" approach. But for better or worse, that's how the platform works, and we need to interoperate with the rest of the ecosystem.)

All of this behavior is described in the Khronos ICD Installation Guidelines. One section is particularly relevant, where it sets out the general guidelines for how and when new ICD loaders should be installed or uninstalled.

Vendor installers MUST install and uninstall their ICD-compliant implementations in such a way that the installer:

1. Installs its own ICD loader library if and only if the existing ICD loader library is older than the one being installed.

2. Does not remove ICD loader library at uninstall if other implementations exist.

3. Does not cause existing installations to become inoperable or unusable in any way. This includes, but is not limited to, WHQL and similar signed package certification check failures.

4. Does not manipulate the vendor enumeration order within the ICD loader library except to add (or remove) the new vendor implementation.

The important thing to notice here is that the conda-forge ocl-icd installer violates three of these four requirements (1, 3, and 4). As long as you view conda-forge as the entire universe, that isn't a problem. But as soon as you want to interoperate with anything else (either OpenCL implementations from other sources, or packages from other sources that use OpenCL), it creates serious problems.

isuruf commented 5 years ago

I don't see how 3 is violated. As long as the ocl-icd library is up-to-date in conda-forge, 1 is not violated either.

isuruf commented 5 years ago

4 is violated by the ICD loader by Khronos. https://github.com/KhronosGroup/OpenCL-ICD-Loader#table-of-debug-environment-variables. See OCL_ICD_VENDORS env variable.

jchodera commented 5 years ago

What @peastman is suggesting is to not use ICD loader and allow whatever ICD loader is in the system on Linux and to use only Apple implementation in OSX.

@isuruf: I believe @peastman suggested reversing the dependency policy:

I would suggest it should be a dependency for implementations of OpenCL, not for packages that use OpenCL.

Could you comment on this part of the proposal?

isuruf commented 5 years ago

What I don't get is that you are totally fine with the user installing an unsupported OpenCL implementation in the system, but against installing an unsupported OpenCL with conda.

peastman commented 5 years ago

Since that situation has never come up in 10 years, I'm not too worried about it. Users just don't install OpenCL implementations much, except the ones that come with their GPU drivers. Making them conda installable makes it much easier, and allows it to be invisibly installed without the user ever noticing, just because some unrelated package listed it as a dependency.

But let's get back to the specific proposals. My suggestions were

ocl-icd should be a dependency of packages that implement OpenCL, not ones that use OpenCL.
Mac packages should use OpenCL.framework unless the developer has a good reason not to.

To be sure, this isn't a complete solution for us. We'll need to take additional steps to protect against using unsupported OpenCL implementations (such as the run_constrained trick). But I'm not suggesting this just to make my own life easier. I believe it's overall a better policy and it will benefit other developers and users too.

isuruf commented 5 years ago

If you use the run_constrained trick, why not install ocl-icd? This has the added benefit of reproducibility and also some nice features like OCL_ICD_DEBUG and OCL_ICD_VENDORS. Some system ICD loaders support these variables and some don't; using ocl-icd makes this consistent.

isuruf commented 5 years ago

Mac's OpenCL framework is deprecated and has been buggy since OSX 10.14. See https://github.com/inducer/pyopencl/issues/276.

jchodera commented 5 years ago

Mac's OpenCL framework is deprecated and has been buggy since OSX 10.14.

That looks to be accurate according to this official document:

isuruf commented 5 years ago

Btw, we now have ICDs for other video cards like Intel (intel-compute-runtime, beignet) and AMD (rocm-opencl-runtime). These are conda installable and will be visible only if you use ocl-icd.

jchodera commented 5 years ago

@peastman: Given the deprecation of the Apple's OpenCL framework on osx, wouldn't the provision of convenient installation support for Intel and AMD cards be critical to supporting our osx users?

peastman commented 5 years ago

OpenCL and OpenGL are both officially "deprecated" on Macs, but no one is really sure what that means. Apple hasn't shown any sign of removing either of them, and given how common OpenGL software is on Macs, it's hard to imagine them actually doing it. One theory I've heard is that this is in preparation for (someday, possibly) launching new Macs that use an Apple-developed, ARM based CPU and GPU. Currently Apple relies on the GPU vendors to write the OpenCL and OpenGL implementations, and they don't want to have to write their own once they're making the CPUs and GPUs themselves.

In any case, OpenCL.framework is still the universally used mechanism for writing OpenCL code on Macs. If that changes in the future, this policy should then be reevaluated. And of course, an ARM based Mac would require entirely different packages from an x86 based one.

peastman commented 5 years ago

(Of course, another popular theory is that Apple never plans to remove them. They're just trying to push people to use Metal in place of cross platform APIs.)

jchodera commented 5 years ago

Any other input from @conda-forge/core?

We'd love to move forward with finishing migration of OpenMM to conda-forge via https://github.com/conda-forge/openmm-feedstock, and resolving the best way to handle OpenCL support is our only remaining hurdle.

scopatz commented 5 years ago

@jchodera @peastman Can you provide a list of the dependencies that are missing on conda-forge right now and we can start figuring out how to provide them, either as real packages or as CDTs

peastman commented 5 years ago

Can you provide a list of the dependencies that are missing on conda-forge right now and we can start figuring out how to provide them, either as real packages or as CDTs

I'm not sure what you mean? This discussion isn't about missing dependencies.

scopatz commented 5 years ago

I thought the whole issue was about OpenCL not being available or selectable.

peastman commented 5 years ago

No, the issue is that we wouldn't want to use it even if it were. The user already has an OpenCL implementation that is either built into the OS (Mac) or installed with their video drivers (Linux, Windows). If we simply don't list ocl-icd as a dependency (we've never listed it as one before), everything works exactly as we want and there are no problems at all. The problem is that we've been told we must make it a dependency whether we want it or not!

scopatz commented 5 years ago

Yeah, it is a basic premise of conda-forge that all deps are packaged and available (see the staged recipes readme here). As a north star, all of our package dependencies are in our graph, one way or the other.

We know that this is hard, annoying, and difficult. We also know that this is what is fundamentally best for reproducing environments in the long run and produces the most widely applicable redistributable binaries. In a very real sense, conda-forge is a community of people who have tackled these low-level build issues so that other people don't have to.

Hopefully this helps explain why we don't want to circumvent our own system. We'd love for OpenCL to be part of this system, to at least the same degree that the GPU packages are.

peastman commented 5 years ago

I think that still isn't really the issue here. Conda-forge isn't a self contained universe, and I don't see that it ever can be. It doesn't provide video drivers. It doesn't provide an OS kernel. It doesn't provide the filesystem or display manager or network drivers. It provides CUDA toolkits, but it doesn't provide the core CUDA runtime. That comes with the video driver, just like OpenCL. So why treat OpenCL differently from CUDA?

Anyway, even if you install ocl-icd, the user can still get access to system-wide installations of OpenCL by adding a symlink. So the "reproducible environment" question is already moot. Software on conda-forge already needs to deal with OpenCL implementations that came from outside, just as it needs to deal with outside video drivers and OS kernels, and different models of CPUs and GPUs. Environments will never be fully reproducible, and ocl-icd doesn't change that. It just creates extra hassles and more opportunities for things to go wrong.

Can we return to my specific proposal, that ocl-icd should be a dependency of packages that implement OpenCL rather than ones that use it? If you haven't conda installed an OpenCL implementation, there simply is nothing useful ocl-icd can do. If you've got a system-wide OpenCL installed, that's guaranteed to have provided a compatible ICD loader. If you don't, then there's nothing for it to load anyway. So why install it when it's guaranteed not to accomplish anything, and only creates opportunities for problems? And if you do install an OpenCL implementation from conda-forge, then ocl-icd will be installed too, and the situation will be identical to what it is now.

It seems to me this arrangement meets everyone's requirements. Can anyone point to a problem with it?

isuruf commented 5 years ago

So the "reproducible environment" question is already moot.

No, it's not. It's the user's choice to use the system packages. Also, I generally use OCL_ICD_VENDORS env variable which is available in ocl-icd to switch between implementations easily. This is not available in some of the other ICD loaders.
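Switching implementations through that variable might look like the sketch below. It assumes the ICD loader in use honors `OCL_ICD_VENDORS` (ocl-icd does, per this comment; other loaders may ignore it). The helper name `env_with_vendors` is made up for illustration.

```python
import os


def env_with_vendors(vendors, base_env=None):
    """Return a copy of the environment with OCL_ICD_VENDORS set.

    `vendors` is whatever the loader accepts for this variable, for
    example a directory containing .icd files (an assumption here).
    The original environment mapping is left unmodified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OCL_ICD_VENDORS"] = str(vendors)
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` to launch a program against one specific set of implementations without touching the rest of the session.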

Let me use your arguments on zlib which is available on OSX. Why do we need zlib on OSX when it's installed on OSX? It doesn't accomplish anything and only creates opportunities for problems.

Also, it's a few lines of code in OpenMM to disallow other implementations. You and I have wasted way too much time discussing this than it takes to write those lines of code.
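A check of that kind could be as small as the following sketch. It is written in Python for brevity and is not OpenMM code; the vendor strings are illustrative assumptions and would need to be verified against what each supported driver actually reports.

```python
# Assumption: these prefixes stand in for the vendor strings reported by
# the supported platforms; verify them against the real drivers.
SUPPORTED_VENDORS = ("NVIDIA", "Advanced Micro Devices", "Apple")


def is_supported_vendor(vendor):
    """True if an OpenCL platform vendor string matches the allowlist."""
    return any(vendor.startswith(prefix) for prefix in SUPPORTED_VENDORS)


def filter_platforms(vendors):
    """Keep only the platforms whose vendor string is on the allowlist."""
    return [v for v in vendors if is_supported_vendor(v)]
```

Any platform that falls outside the allowlist would simply be skipped when choosing where to run.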

peastman commented 5 years ago

No, it's not. It's the user's choice to use the system packages

Or in this case, the developer's choice only to support those packages and not to support any others. So why are you so determined that I absolutely must include a dependency whose only function is to bring in packages I don't support? I really don't get it.

Why do we need zlib on OSX when it's installed on OSX?

I don't think that's an accurate analogy. The conda-forge zlib is basically identical to the system one. It's just recompiled from the same source code. It makes no difference to most users which one they use.

Here's an accurate analogy. Suppose you wrote a replacement for zlib. It had a compatible API, but it was a different codebase written by different people. Suppose it also had significant differences in behavior from standard zlib, and those differences caused serious problems for some applications. Would you then tell developers they were only allowed to use your zlib replacement and were forbidden to use standard zlib? Further suppose it was written in such a way that once it was installed, it would cause unrelated packages that used standard zlib to crash. That's pretty much the situation here.

You and I have wasted way too much time discussing this than it takes to write those lines of code.

I'm sorry you feel that way, but that isn't what this is about. I'm proposing a policy change that will be universally better for all users and developers. I still don't understand your objection to it. You still haven't pointed out a single problem with my proposal. Also, some aspects of this are impossible for me to work around in any way. For example, the crashes on Mac when trying to interoperate with standard OpenCL.

So let me ask this yet again: can you point out any problem with the policy I've proposed? It fixes some serious real world problems, and as far as I can tell it satisfies absolutely everyone's requirements.

isuruf commented 5 years ago

Here's an accurate analogy. Suppose you wrote a replacement for zlib. It had a compatible API, but it was a different codebase written by different people. Suppose it also had significant differences in behavior from standard zlib, and those differences caused serious problems for some applications. Would you then tell developers they were only allowed to use your zlib replacement and were forbidden to use standard zlib? Further suppose it was written in such a way that once it was installed, it would cause unrelated packages that used standard zlib to crash. That's pretty much the situation here.

Yes. This is exactly what we do with Accelerate vs OpenBLAS. We don't link to Accelerate at all.

isuruf commented 5 years ago

Your proposal imposes Apple implementation on everybody when I want to use Apple or POCL or oclgrind and using the ICD loader from conda-forge makes it a user choice.

jchodera commented 5 years ago

@peastman: What if we consider the conda-forge route as only one of multiple distribution channels? If we give the recommended approach (using ocl-icd) a try and do run into issues, we can revisit the policy issue to see if we can find a way around these problems that ensures conda-forge consistency and compatibility. (And if there are users that have other needs, we can maintain a second distribution route to meet the needs of those users.)

@isuruf et al: There are bound to be some growing pains as we try to move our entire http://anaconda.org/omnia distribution channel over to the excellent conda-forge ecosystem you've built. Omnia has been going on for many years (even before conda-forge) because of the need for GPU-enabled builds, so we had some particular ideas about how to do this well to meet the scientific needs of our community, but we'll all have to do our best to make sure these communities can be merged harmoniously.

scopatz commented 5 years ago

Thanks @jchodera - I agree with your sentiments here completely. If you and @peastman ever want to jump on a call to talk about paths forward, I am happy to!

peastman commented 5 years ago

Your proposal imposes Apple implementation on everybody when I want to use Apple or POCL or oclgrind and using the ICD loader from conda-forge makes it a user choice.

No, it makes it a developer choice. Developers who want to support POCL and are willing to accept crashes when their software is used with other OpenCL-based packages can do so. My specific recommendation was that packages should use OpenCL.framework "unless the developer has a good reason not to."

It's important to recognize that OpenCL implementations are not interchangeable. Switching from one to another is a bit like switching to a new CPU architecture, a new OS, and a new compiler all at once. (Quite literally in fact, since each OpenCL implementation really does have its own compiler and different ones really do use different processors.) I've never yet tried a new OpenCL implementation and not had things immediately break in ways that required code changes. So I'm strongly in favor of developer choice on this.

What about the other part of my proposal: that ocl-icd should be a dependency of packages that implement OpenCL rather than ones that use it? No one has voiced any specific concerns with that. Are we at least agreed on that one?

What if we consider the conda-forge route as only one of multiple distribution channels?

Honestly? That sounds like a bit of a nightmare to me. :) Trying to maintain just one distribution channel is plenty of work as it is! Having to maintain multiple channels, and worrying about incompatibilities between them (especially when the different versions are explicitly different from each other and have different dependencies), and dealing with the different user issues from different distribution channels, does not sound like fun!

scopatz commented 5 years ago

What conda-forge does is "developer-choice" in that there are multiple artifacts that are compiled, one for each mutex of OpenCL that we will support. The user simply has the choice of which one to install.

peastman commented 5 years ago

Could you explain what you mean by that? Are you saying we would create one package called openmm-apple that used Apple's OpenCL, and another called openmm-pocl that used POCL?

scopatz commented 5 years ago

Yes that is more or less it. We'll then also have a meta-package that is just openmm that dispatches to the correct real package based on what the user has selected. This is basically the same as how our BLAS or MPI implementations work. We don't force the entire ecosystem into using OpenBLAS or MKL. Users can choose. See the docs at the following for a sense of how this works: https://conda-forge.org/docs/maintainer/knowledge_base.html?#blas

isuruf commented 5 years ago

@scopatz, there is user choice when using ocl-icd. You can install any OpenCL implementation (either using conda or using the system package manager). Remember that ocl-icd is an OpenCL ICD loader and can load any implementation.

peastman commented 4 years ago

What about the other part of my proposal: that ocl-icd should be a dependency of packages that implement OpenCL rather than ones that use it? No one has voiced any specific concerns with that. Are we at least agreed on that one?

Anyone?

isuruf commented 4 years ago

I disagree. If the ICD loader in the system is too old, there'll be symbol errors from the dependent packages (not the implementations).

isuruf commented 4 years ago

For example, take the symbol clSVMAlloc, which was introduced in OpenCL 2.0. Assume a dependent package uses this symbol after querying for OpenCL 2.0 support in the implementations. This dependent package can still use OpenCL 1.2-only implementations like NVIDIA's by taking a different code path that avoids clSVMAlloc, but the symbol needs to be present in the ICD loader that the package links to.
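The version query and the two code paths can be sketched like this. It assumes the platform version string follows the "OpenCL <major>.<minor> <vendor info>" layout; the function names and the path labels are hypothetical, and the sample version strings in the usage note are only illustrative.

```python
import re


def parse_opencl_version(version_string):
    """Parse 'OpenCL <major>.<minor> <vendor info>' into (major, minor)."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def choose_code_path(version_string):
    """Pick the SVM path only when the platform reports OpenCL 2.0 or newer.

    Simplification: real code would also query the device's SVM
    capabilities before relying on clSVMAlloc.
    """
    if parse_opencl_version(version_string) >= (2, 0):
        return "svm"      # this path would call clSVMAlloc
    return "buffers"      # fallback path that avoids clSVMAlloc
```

With an illustrative string such as `"OpenCL 1.2 CUDA"` the fallback path is chosen. The compiled package still references clSVMAlloc either way, which is why the loader it links against has to export that symbol.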

jaimergp commented 4 years ago

@scopatz @isuruf, would it be possible to schedule a videocall with both of you (and more members from conda-forge/core if needed) to discuss this issue some time soon? That should accelerate the communication here so we can move forward!

I understand that we might be spread through several timezones, but we'll try to make it work (most of the involved parties seem to be US-based, though). How can I get in touch to discuss our availabilities?

Thank you!

isuruf commented 4 years ago

I'm sorry I don't have time for a video call.

scopatz commented 4 years ago

Yep, I am happy to have a call. We also have a dev call next Wednesday at 5 PM UTC, if you want to join that.

jaimergp commented 4 years ago

Sweet! Any chance you can share an email address so we can share a poll to find a suitable date and time?

jaimergp commented 4 years ago

Oh, and yes, @scopatz, I'll be happy to join your dev call!

isuruf commented 4 years ago

@jaimergp, next conda-forge dev call is on Wed Jan 29, 2020 at 11am – 11:45am (CST) and is hosted using https://zoom.us/

jaimergp commented 4 years ago

Perfect, thanks! Do we (@jchodera, @peastman and I) need to receive an invitation of some sort?

isuruf commented 4 years ago

Can you send the emails to @ericdill ?

loriab commented 4 years ago

I can send @jaimergp and co. the zoom link.

jaimergp commented 4 years ago

Oh, thanks! How can I send you the email addresses?

conda-forge / cfep

Policies for packages that use OpenCL #17