AcademySoftwareFoundation / OpenShadingLanguage

Advanced shading language for production GI renderers
BSD 3-Clause "New" or "Revised" License
2.05k stars 348 forks source link

LLVM 16 and opaque pointers support #1727

Closed brechtvl closed 10 months ago

brechtvl commented 10 months ago

Description

LLVM 16 was released a while ago but OSL does not support it yet. The main challenge is that this version requires opaque pointers, that is you can no longer tell from a pointer value which data type it points to.

Changes made:

Remaining work:

Tests

No new tests as the whole testsuite covers this.

Some tests are failing currently as this is work in progress.

Checklist:

brechtvl commented 10 months ago

Maybe a better solution would be to introduce an OSL non-opaque pointer type that wraps an llvm::Value* and llvm::Type* for the data it points to. That would sidestep the need to figure out the right types for batch mode types and llvm_store_value.

The full consequences of adding that additional abstraction to LLVM_Util are not clear to me, if perhaps using higher level abstractions than the LLVM API will get us into trouble later.

AlexMWells commented 10 months ago

Maybe a better solution would be to introduce an OSL non-opaque pointer type that wraps an llvm::Value* and llvm::Type* for the data it points to.

If llvm's point of view are that type's don't matter for pointers, well something higher needs to actually track them. So an having a wrapper of those 2 seems like a very reasonable approach to be able to ease into the transition.

lgritz commented 10 months ago

What I was thinking is that LLVM_Util::op_load would be augmented with another parameter giving the type that we expect, and everyplace we call op_load, we would pass the expected type. For old LLVM versions, it would just check that they matched, for new LLVM that would be what the opaque pointer would need to know what it is.

So the first step -- add the type paramter to op_load, pass it everywhere it's needed, check that it's right -- can be done first, independently, to get the whole thing plumbed to throw the switch for opaque pointers.

brechtvl commented 10 months ago

@lgritz there is already a version of op_load (and GEP) with a type, and for non-batched code it was relatively straightforward to use them everywhere.

For the batched code I'm having difficulty finding what the right types are just from the context of each function. When tracing further backwards I can't see clearly when a type is wide or not, or when it's a wide bool or integer mask. It may just be my unfamiliarity with the code and @AlexMWells can tell quickly.

I left TODO comments in the code where I was uncertain.

brechtvl commented 10 months ago

I got batched mode tests to pass as well now on my system. OptiX likely still broken due to the casting issue mentioned in the description.

Other changes:

brechtvl commented 10 months ago

Which version of clang-format should I be using? I tried 12, 15 and 16 and none matched exactly.

lgritz commented 10 months ago

Our CI job uses clang-format 14

brechtvl commented 10 months ago

It turns out there is another significant change in LLVM 16, where support for default optimization levels (O0, O1, O2, O3) was removed from the legacy pass manager.

I added support for the new pass manager to address that. For the custom optimization levels 10-13 it still uses the legacy pass manager since I did not implement those. And of course the performance impact of this should be checked.

It's probably time to break up this PR into multiple smaller ones.

lgritz commented 10 months ago

You're a big hero for all of this, Brecht.

Yeah, the more bite sized chunks, the better. Maybe the order should be:

  1. Switch to new pass manager when possible.
  2. The changes that allow us to build against LLVM16 but still using typed pointers (I think 16 still allows it, optionally, but in 17 it's gone for good).
  3. Switch to being able to use opaque pointers.
lgritz commented 10 months ago

@brechtvl, We currently claim to support as far back as llvm 9, but such a wide range is really not necessary (and perhaps not desirable, because it's too much trouble to always test against them all). I would prefer (if possible) to preserve the ability to build against several old versions (i.e., not force us to move our minimum all the way up to llvm 16), but at the same time, if it helps you to bring the floor up somewhat, you should definitely feel free to do so.

brechtvl commented 10 months ago

I don't expect any of these changes to require raising the floor.

brechtvl commented 10 months ago

The changes that allow us to build against LLVM16 but still using typed pointers (I think 16 still allows it, optionally, but in 17 it's gone for good).

I was hoping the non-opaque pointers would still work, but I got this error loading the bitcode. Even though -Xclang -no-opaque-pointers is passed when compiling the bitcode.

Opaque pointers are only supported in -opaque-pointers mode (Producer: 'LLVM16.0.0' Reader: 'LLVM 16.0.0')

I guess it's not too surprising given the LLVM docs:

LLVM 15: Opaque pointers are enabled by default. Typed pointers are still supported. LLVM 16: Opaque pointers are enabled by default. Typed pointers are supported on a best-effort basis only and not tested. https://llvm.org/docs/OpaquePointers.html

brechtvl commented 10 months ago

This is now replaced by #1728, #1729 and #1730.

lgritz commented 10 months ago

Is there a preferred order in which we review and integrate them?

brechtvl commented 10 months ago

For reviewing the order does not matter. For merging, #1730 should be done last.

brechtvl commented 10 months ago

This PR is closed and outdated now, please see #1728 instead.