Frogging-Family / nvidia-all

Nvidia driver latest to 396 series AIO installer

Cannot build 535.43.02 #169

Closed BlueGoliath closed 1 year ago

BlueGoliath commented 1 year ago

Fails with: cannot stat 'libnvidia-compiler.so.535.43.02': No such file or directory.

gardotd426 commented 1 year ago

...isn't it 535.43.03 anyway?

gardotd426 commented 1 year ago

Oh wait, that's 530. Never mind, I'm stupid.

gardotd426 commented 1 year ago

@BlueGoliath libnvidia-compiler.so has been removed:

  • Removed libnvidia-compiler.so.VERSION from the driver package. This functionality is now provided by other driver libraries.

You could probably just comment out the lines from the PKGBUILD, or change them to if -e statements the same way the libnvidia-compiler-next.so.$pkgver lines are done.
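The "if -e" approach suggested above can be sketched as a small helper for the PKGBUILD. This is only a rough sketch, not the repository's actual fix: `install_if_present` is a hypothetical helper name, the destination path in the usage comment is assumed for illustration, and `$pkgver` / `$pkgdir` are the standard PKGBUILD variables.

```shell
# Hypothetical helper: install a file only when the driver package ships it.
# Usage: install_if_present <source-file> <destination-path>
install_if_present() {
    local src="$1" dest="$2"
    if [ -e "$src" ]; then
        # -D creates missing parent directories, -m755 sets the file mode
        install -D -m755 "$src" "$dest"
    else
        echo "skipping $src (not shipped in this driver version)" >&2
    fi
}

# Inside the PKGBUILD's package function (destination path is an assumption):
# install_if_present "libnvidia-compiler.so.${pkgver}" \
#     "${pkgdir}/usr/lib/libnvidia-compiler.so.${pkgver}"
```

With a guard like this the same PKGBUILD keeps working for older drivers that still ship the library, instead of failing with "cannot stat" on newer ones.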

BlueGoliath commented 1 year ago

Looks like that worked. I'll keep the issue open until it's fixed properly.

gardotd426 commented 1 year ago

What monitoring utility do you use?

On Tue, May 30, 2023 at 7:16 PM Ty Young wrote:

OK, maybe something is wrong. I try to launch my Nvidia GPU monitoring utility and I either get:

malloc(): corrupted top size

or

malloc(): invalid size (unsorted)

depending on if I launch an old build or from my IDE. Probably an issue in NVML but who knows.

BlueGoliath commented 1 year ago

My own. I found the issue. Someone at Nvidia goofed.

https://forums.developer.nvidia.com/t/nvml-12-535-43-02-breaks-backwards-compatibility/254999

gardotd426 commented 1 year ago

I added a comment seconding what you said. If they don't respond by tomorrow or so, I would send an email; I've always had better luck going that route (or doing both; I usually get a rather quick response that way).

notpentadactyl commented 1 year ago

Saancreed also mentioned that there is a new file nvoptix.bin that needs to be packaged.

gardotd426 commented 1 year ago

Yeah, I knew that but forgot to mention it. Unfortunately, I don't believe it will fix BlueGoliath's problem. Hopefully Nvidia doesn't take forever to fix it. It'd be nice to know whether things like GWE still work; I haven't installed the drivers yet.

BlueGoliath commented 1 year ago

To clarify, it's only an issue if you use a function that takes the nvmlProcessInfo_t struct type. I've fixed it on my end and my app works again. For now, everything else in NVML works as it did before. Someone might want to maintain GWE again though.

gardotd426 commented 1 year ago

I will say that I installed the drivers (including nvoptix.bin, which goes in /usr/share/nvidia/) and tested GWE, and it worked perfectly fine. I couldn't use those drivers though, because they break CUDA completely, or at least a lot of it. Blender fails to run any GPU benchmarks at all, and it crashes with weird errors when trying to render a .blend file. I downgraded back to the Vulkan beta driver and the issues are all gone. 530.41.03 worked fine too.
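Packaging the new file could look roughly like the following in the PKGBUILD. This is a sketch under assumptions: `install_nvoptix` is a hypothetical helper name, the file is assumed to sit at the top level of the extracted driver directory, and mode 644 is assumed because it is a data file. Only the /usr/share/nvidia/ destination comes from the comment above.

```shell
# Hypothetical helper for the PKGBUILD: package nvoptix.bin when the
# extracted driver directory contains it (it is new in this driver series).
# Usage: install_nvoptix <extracted-driver-dir> <package-root>
install_nvoptix() {
    local driver_dir="$1" dest_root="$2"
    if [ -e "${driver_dir}/nvoptix.bin" ]; then
        # -D creates usr/share/nvidia under the package root if needed
        install -D -m644 "${driver_dir}/nvoptix.bin" \
            "${dest_root}/usr/share/nvidia/nvoptix.bin"
    fi
}
```

The existence check keeps the function a no-op for driver versions that do not ship the file.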

@BlueGoliath can you try and run blender-benchmark and do a GPU bench and see if you get the same errors? I reported it on the NV forums but haven't had any responses yet.

ryanmusante commented 1 year ago

This is the driver mentioned here? https://www.phoronix.com/news/NVIDIA-535.43.02-Linux-Driver

gardotd426 commented 1 year ago

Yes.

notpentadactyl commented 1 year ago

> @BlueGoliath can you try and run blender-benchmark and do a GPU bench and see if you get the same errors? I reported it on the NV forums but haven't had any responses yet.

Doesn't work for me either:

[image]

fidasx commented 1 year ago

Same here: Blender doesn't work. When trying to render, it crashes with this error:

munmap_chunk(): invalid pointer

Other 3D utilities like glmark and vkcube work fine.

gardotd426 commented 1 year ago

It's CUDA that seems to be the problem; GLMark and VKCube don't use any CUDA/OptiX stuff.

ryanmusante commented 1 year ago

535.43.02 is a beta. Unless you want to spend a considerable amount of time on the Nvidia dev forums troubleshooting, I'd stick to a working version. I ran into plenty of issues and rolled back.

gardotd426 commented 1 year ago

@ryanmusante Yeah. We're aware it's a beta. We run these things for a reason, because betas need to be tested. Without people who are willing to test beta releases, betas have no reason to exist. If that's not you, then don't run it.

ryanmusante commented 1 year ago

@gardotd426 I should at least open one ticket with Nvidia to let them know about my particular configuration and show them the logs. It's just that it becomes incredibly time-consuming. I am willing, to an extent. But I'm an X11 user, and I believe most people, devs included, are more interested in Wayland users.

gardotd426 commented 1 year ago

It's not really that time-consuming at all.

I opened a thread regarding the CUDA Blender bug, and an Nvidia employee replied within 24 hours with an internal bug tracking number. The same day, they reported that they'd found the issue and what I could do in the meantime to work around it.

It's not always that easy, but if it's an actual bug, and you take ten minutes to adequately describe it and include an nvidia-bug-report.log.gz or whatever, then that's usually all you need to do.
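For anyone who hasn't generated that log before, a minimal sketch follows. It assumes the `nvidia-bug-report.sh` script that ships with the driver is on the PATH and that `sudo` is available; `collect_nvidia_bug_report` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: collect the log that Nvidia asks for in bug reports.
# By default the script writes nvidia-bug-report.log.gz into the current
# working directory.
collect_nvidia_bug_report() {
    if ! command -v nvidia-bug-report.sh >/dev/null 2>&1; then
        echo "nvidia-bug-report.sh not found; is the driver installed?" >&2
        return 1
    fi
    # Run as root so the script can gather the full set of system logs
    sudo nvidia-bug-report.sh
}
```

Run it from a directory you can write to, then attach the resulting nvidia-bug-report.log.gz to the forum post.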

BlueGoliath commented 1 year ago

> 535.43.02 is a beta, unless you want to spend considerable amount of time on nvidia dev forums troubleshooting, I'd stick to a working ver. I ran into issues aplenty and rolled back.

If I were just a normal driver user I would do that, but 535 (and its Windows equivalent) has a bunch of new and exciting features, and those take priority for me.

I'm going to close this since it seems to have been resolved.