AcademySoftwareFoundation / OpenShadingLanguage

Advanced shading language for production GI renderers
BSD 3-Clause "New" or "Revised" License
2.07k stars 350 forks source link

Fix missing fmod variants #1643

Closed lgritz closed 1 year ago

lgritz commented 1 year ago

Fixes #1526 (maybe?)

Signed-off-by: Larry Gritz lg@larrygritz.com

lgritz commented 1 year ago

@MasterZap Are you able to apply this patch on your end and see if it clears up the problem? I'm still having trouble directly reproducing, I think it may only be a problem on Windows.

AlexMWells commented 1 year ago

If if that fixes it, it doesn't answer the question of how/why the "float" is not being properly promoted?

lgritz commented 1 year ago

why should it?

lgritz commented 1 year ago

stdosl.h:118 is why

lgritz commented 1 year ago

I should fix the docs to make it more clear that fmod (like pow!) allows the second argument to be float even if the first arg is a triple.

AlexMWells commented 1 year ago

That can be traced back to https://github.com/AcademySoftwareFoundation/OpenShadingLanguage/commit/ca0c6df4e030f45c0422461754aead1149287590 Fix slightly broken overload resolution which added handling of float mixing with Vec3 based types. But no documentation change.

AlexMWells commented 1 year ago

Can we get some testsuite entries added to test these combinations and try them in batched as well?

I see a good example of testing the combinations for pow in testsuite/transcendental-reg

which exercises "../common/shaders/test_binary_mixed_triple_float_xmacro.h"

whereas the fmod tests in testsuite/arithmetic-reg only exercises

include "../common/shaders/test_binary_xmacro.h"

which does not include the Vec3%float variants.

If you update those tests to be on par with the pow tests, I think you will have your reproducer. Then will be able to add the necessary implementations to support it.

In particular src/liboslexec/wide/wide_opalgebraic.cpp will need to have more variations added for fmod

MasterZap commented 1 year ago

@lgritz I will try to test this on Monday.

lgritz commented 1 year ago

As Alex points out, this is not a complete solution. But if it changes the symptoms, it definitely means I'm on the right track.

MasterZap commented 1 year ago

Tested it - seems to work!!

Thanks so much for this ♥️

/Z

lgritz commented 1 year ago

Cool, you can try that patch on your end, I guess it can't make it any worse. But as Alex said, there's more work to make it fully complete for the batch case.

And also, I think Chris has a point. Looking at the implementation, I think there is no real runtime computation to be saved by having the (triple,float) special cases. I believe just allowing oslc to promote the (triple,float) case to (triple,triple) has no real penalty and is a simple solution, the only being that previously-compiled shaders would need to be recompiled. But as far as I know, you're the only person to ever report encountering the bug in practice, so maybe that's adequate. It's something I'm mulling over today.

lgritz commented 1 year ago

@MasterZap Would you mind doing another experiment for me?

In llvm_gen.cpp, circa line 756, you'll see

#ifdef OSL_LLVM_NO_BITCODE
    // On Windows 32 bit this calls an unknown instruction, probably need to
    // link with LLVM compiler-rt to fix, for now just fall back to op
    if (is_float)
        return llvm_gen_generic(rop, opnum);
#endif

I'm kind of suspecting that these lines, which are 10 years old, may be totally irrelevant and reflects something that has long since been fixed on the LLVM side. Some things can easily be simplified if this is no longer needed.

Can you please simply comment out just these lines entirely? I don't have a good way of testing on Windows, but if things seem fine for you with this clause gone, that will change how I want to approach the full solution to this.

lgritz commented 1 year ago

Updated with a full working solution (including batch) based on Alex's suggestion, including full regression tests. Passes now.

I'm awaiting Zap's test that I suggested above, and pending that I'll probably have one more cleanup/simplification round.

MasterZap commented 1 year ago

Those lines, are that specifically on windows 32 BIT as you wrote? We are all 64 bit, and really have no way to test 32 bit. And if it really only is for 32 bit then... you can probably ignore it :)

lgritz commented 1 year ago

Under the paranoid assumption that all comments are not necessarily 100% correct, I guess I wanted to verify empirically that this was not a concern with today's LLVM on Windows in practice before I ripped it out completely.

MasterZap commented 1 year ago

I have not had time to test deleting this code yet, hope to soon.

lgritz commented 1 year ago

In the interest of having a working solution in time for the 1.12 patch release this week, I'm inclined to merge and backport this now, and then later I can follow up with a separate PR that removes the parts that we think are only needed for the old-LLVM versions we no longer support (perhaps putting it only in main and not backporting, for additional safety).