Bumblebee-Project / Bumblebee

Bumblebee daemon and client rewritten in C
http://www.bumblebee-project.org/
GNU General Public License v3.0

Fedora 19, CUDA, nvidia-319 #453

Closed MrSampson closed 9 years ago

MrSampson commented 11 years ago

Howdy, I'm trying to get CUDA installed on my Fedora 19 Bumblebee system and I've bumped into something I don't know how to get around:

I installed the not-in-the-repository version of bumblebee-nvidia-319 (which CUDA 5.5 requires), and so far that's working well. (I don't usually need to run anything using optirun.)

So, when I try to yum the CUDA packages, I get


$ sudo yum install cuda
Loaded plugins: langpacks, refresh-packagekit
Resolving Dependencies
--> Running transaction check
---> Package cuda.x86_64 0:5.5-22 will be installed
--> Processing Dependency: cuda-5-5 = 5.5-22 for package: cuda-5.5-22.x86_64
--> Running transaction check
---> Package cuda-5-5.x86_64 0:5.5-22 will be installed
--> Processing Dependency: cuda-command-line-tools-5-5 = 5.5-22 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: cuda-headers-5-5 = 5.5-22 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: cuda-documentation-5-5 = 5.5-22 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: cuda-samples-5-5 = 5.5-22 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: cuda-visual-tools-5-5 = 5.5-22 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: cuda-core-libs-5-5 = 5.5-22 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: cuda-license-5-5 = 5.5-22 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: cuda-core-5-5 = 5.5-22 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: cuda-misc-5-5 = 5.5-22 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: cuda-extra-libs-5-5 = 5.5-22 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: xorg-x11-drv-nvidia-devel(x86-32) >= 319.00 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: xorg-x11-drv-nvidia-libs(x86-32) >= 319.00 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: xorg-x11-drv-nvidia-devel(x86-64) >= 319.00 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: nvidia-xconfig >= 319.00 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: cuda-driver >= 319.00 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: nvidia-settings >= 319.00 for package: cuda-5-5-5.5-22.x86_64
--> Processing Dependency: nvidia-modprobe >= 319.00 for package: cuda-5-5-5.5-22.x86_64
--> Running transaction check
---> Package cuda-command-line-tools-5-5.x86_64 0:5.5-22 will be installed
---> Package cuda-core-5-5.x86_64 0:5.5-22 will be installed
---> Package cuda-core-libs-5-5.x86_64 0:5.5-22 will be installed
---> Package cuda-documentation-5-5.x86_64 0:5.5-22 will be installed
---> Package cuda-extra-libs-5-5.x86_64 0:5.5-22 will be installed
---> Package cuda-headers-5-5.x86_64 0:5.5-22 will be installed
---> Package cuda-license-5-5.x86_64 0:5.5-22 will be installed
---> Package cuda-misc-5-5.x86_64 0:5.5-22 will be installed
---> Package cuda-samples-5-5.x86_64 0:5.5-22 will be installed
---> Package cuda-visual-tools-5-5.x86_64 0:5.5-22 will be installed
---> Package nvidia-modprobe.x86_64 0:319.37-1.fc18 will be installed
---> Package nvidia-settings.x86_64 0:319.37-30.fc18 will be installed
---> Package nvidia-xconfig.x86_64 0:319.37-27.fc18 will be installed
---> Package xorg-x11-drv-nvidia.x86_64 1:319.37-2.fc18 will be installed
--> Processing Dependency: xorg-x11-drv-nvidia-libs(x86-64) = 1:319.37-2.fc18 for package: 1:xorg-x11-drv-nvidia-319.37-2.fc18.x86_64
--> Processing Dependency: nvidia-kmod >= 319.37 for package: 1:xorg-x11-drv-nvidia-319.37-2.fc18.x86_64
---> Package xorg-x11-drv-nvidia-devel.i686 1:319.37-2.fc18 will be installed
---> Package xorg-x11-drv-nvidia-devel.x86_64 1:319.37-2.fc18 will be installed
---> Package xorg-x11-drv-nvidia-libs.i686 1:319.37-2.fc18 will be installed
--> Running transaction check
---> Package nvidia-kmod.x86_64 1:319.37-1.fc18 will be installed
---> Package xorg-x11-drv-nvidia-libs.x86_64 1:319.37-2.fc18 will be installed
--> Processing Conflict: bumblebee-nvidia-319.23-1.fc19.x86_64 conflicts xorg-x11-drv-nvidia
--> Processing Conflict: bumblebee-nvidia-319.23-1.fc19.x86_64 conflicts xorg-x11-drv-nvidia-libs
--> Processing Conflict: bumblebee-nvidia-319.23-1.fc19.x86_64 conflicts xorg-x11-drv-nvidia-libs
--> Processing Conflict: bumblebee-nvidia-319.23-1.fc19.x86_64 conflicts nvidia-x11-drv-32bit
--> Processing Conflict: bumblebee-nvidia-319.23-1.fc19.x86_64 conflicts nvidia-x11-drv
--> Processing Conflict: bumblebee-nvidia-319.23-1.fc19.x86_64 conflicts nvidia-settings
--> Processing Conflict: bumblebee-nvidia-319.23-1.fc19.x86_64 conflicts nvidia-xconfig
--> Finished Dependency Resolution
Error: bumblebee-nvidia conflicts with nvidia-settings-319.37-30.fc18.x86_64
Error: bumblebee-nvidia conflicts with 1:xorg-x11-drv-nvidia-libs-319.37-2.fc18.x86_64
Error: bumblebee-nvidia conflicts with 1:xorg-x11-drv-nvidia-libs-319.37-2.fc18.i686
Error: bumblebee-nvidia conflicts with nvidia-xconfig-319.37-27.fc18.x86_64
Error: bumblebee-nvidia conflicts with 1:xorg-x11-drv-nvidia-319.37-2.fc18.x86_64
 You could try using --skip-broken to work around the problem
Found 2 pre-existing rpmdb problem(s), 'yum check' output follows:
bumblebee-3.2.1-4.fc19.x86_64 is a duplicate with bumblebee-3.2.1-2.fc19.x86_64


I don't know if skipping the broken packages is the right thing here. Is there something else going on with the packaging? Any tips?

Thanks, Oliver

gsgatlin commented 11 years ago

It looks like the cuda packages assume you are using the nvidia drivers from rpmfusion.

Unfortunately, those packages don't work with bumblebee. I am sorry.

Perhaps you could avoid yum and instead use a combination of

yumdownloader

followed by rpm commands.

rpm -ivh --nodeps cuda-5-5-5.5-22.x86_64.rpm b.rpm c.rpm ...
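The download-then-force-install route suggested above could be sketched roughly as follows. This is only an illustration, not a tested recipe: the package names are copied from the yum output earlier in the thread, the download directory and the `DRY_RUN` switch are made up for the example, and whether CUDA then actually runs against the bumblebee-installed driver is not something this thread confirms.

```shell
#!/usr/bin/env bash
# Sketch: fetch the CUDA RPMs without installing them, then install them
# with rpm while skipping the dependency check on the rpmfusion driver.
# DRY_RUN defaults to 1 here, so the commands are only printed.

# Package names taken from the yum transcript in this thread.
CUDA_PKGS="cuda cuda-5-5 cuda-command-line-tools-5-5 cuda-core-5-5 \
cuda-core-libs-5-5 cuda-documentation-5-5 cuda-extra-libs-5-5 \
cuda-headers-5-5 cuda-license-5-5 cuda-misc-5-5 cuda-samples-5-5 \
cuda-visual-tools-5-5"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    printf '%s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

install_cuda_nodeps() {
    local destdir
    destdir="${1:-./cuda-rpms}"   # hypothetical download directory

    # yumdownloader is part of yum-utils; it downloads without installing.
    # $CUDA_PKGS is unquoted on purpose so each name becomes one argument.
    run yumdownloader --destdir "$destdir" $CUDA_PKGS

    # --nodeps makes rpm ignore the unmet xorg-x11-drv-nvidia* dependencies.
    # The real install (DRY_RUN=0) has to run as root.
    run rpm -ivh --nodeps "$destdir"/cuda*.rpm
}
```

Calling `install_cuda_nodeps` with no arguments prints the two commands so they can be reviewed first; `DRY_RUN=0 install_cuda_nodeps` would actually run them.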

Alternatively, perhaps there is some non-rpm way to install the software? Or, if source rpms are available for cuda, the spec file could be edited so that it depends on "bumblebee-nvidia" instead of "xorg-x11-drv-nvidia-libs" and the others.

Sorry I can't think of anything else.

gsgatlin commented 11 years ago

As a followup, I think you may have better luck with the "run" version for Fedora 18 from https://developer.nvidia.com/cuda-downloads

We installed this into an "afs" file locker where I work. (But using the RHEL 6 run version, since that is the distro we use for now.)

MrSampson commented 11 years ago

Thanks for the tips. What's the difference between the rpmfusion and bumblebee versions of the nvidia drivers?

If it's just a matter of the cuda packages not knowing about the bumblebee naming, but all of the binaries being available, that's one thing. It's another if installing with skip-broken breaks the install.

gsgatlin commented 11 years ago

The difference is that bumblebee-nvidia is a shell script that installs the nvidia video drivers in such a way that they do not break mesa on intel. You can see how it works by looking at the /usr/sbin/bumblebee-nvidia shell script with a pager such as less.

(If other people are curious here it is on pastebin: http://pastebin.com/XEzxAdjv )

Some operating system files and symlinks need to be protected from the installer trying to overwrite them with its own files. One of these files is /usr/lib64/libGL.so.1.2.0. There are many others, such as the primus libGL.

The nvidia drivers from rpmfusion work well for desktop-type systems, but they don't work correctly with optimus laptops. The last time I tried installing the driver from rpmfusion my screen went completely black, which makes working on it hard or impossible. So I would advise against using those packages with an optimus laptop. All the binaries are available in both the rpmfusion packages and the bumblebee-nvidia package. skip-broken should not break it. But I admit I have not tried, because I use this technology only for gaming and not cuda programming.

Also, the reason why I did not update bumblebee-nvidia to a newer version of the drivers has to do with issue #433. But if you don't care about the driver not unloading after one use, you can try newer versions. The shell script just expects a single blob plus any required patches to be placed in the directory /etc/sysconfig/nvidia/

Any patches must end in .patch in the filename.
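As a rough illustration of that layout rule (one blob, plus patches whose names end in .patch), a small check over the directory might look like this. It is only a sketch: it assumes the blob is NVIDIA's usual `.run` installer file, which the thread does not spell out, and `check_blob_dir` is a made-up helper name, not part of bumblebee-nvidia.

```shell
#!/usr/bin/env bash
# Sketch: count the installer blob(s) and patches in the directory the
# bumblebee-nvidia script reads from. Succeeds only if exactly one blob
# is present.

check_blob_dir() {
    local dir blobs patches f
    dir="${1:-/etc/sysconfig/nvidia}"
    blobs=0
    patches=0

    for f in "$dir"/*; do
        [ -f "$f" ] || continue           # skips the literal glob if dir is empty
        case "$f" in
            *.patch) patches=$((patches + 1)) ;;
            *.run)   blobs=$((blobs + 1)) ;;   # assumption: blob is a .run file
            *)       echo "ignored: $f" ;;
        esac
    done

    echo "blobs: $blobs"
    echo "patches: $patches"
    [ "$blobs" -eq 1 ]
}
```

Running `check_blob_dir` after swapping in a different driver version is a quick way to confirm that the old blob was really moved out of the way.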

So if you needed the latest version for some reason you could always move whatever came in the bumblebee-nvidia package out of the way and download whatever version of the blob you want to install on your fedora system.

The rpmfusion packages are a combination of "kmod" packages for the kernel and regular rpm packages, and they only use the nvidia blob at rpm build time. They have all the files embedded within the rpm package and all its sub packages.

bumblebee-nvidia just has a shell script, a selinux security policy, and some init scripts to run the shell script at boot. It also contains the nvidia blob. The files the installer installs are outside of the rpm packaging system, but everything should be there. You can verify this by running "optirun -b none nvidia-settings -c :8" to run the nvidia-settings program. The shell script is "smart enough" to only compile the drivers when it needs to, which is when a new kernel is installed or a flag is set by upgrading the package.

Hope that helps.

MrSampson commented 11 years ago

Thanks for the detailed reply! I have no experience doing any packaging, but I reckon I could spend some time figuring it out.

I haven't tried the install with the skip-broken, because of exactly the type of experience you had when the screen went blank. Using a machine without a monitor is exceedingly difficult.

If I understood you correctly, the problem is that the rpmfusion packages overwrite something that the Intel drivers need, and the bumblebee-nvidia package just installs the nvidia drivers while protecting the files that the Intel driver needs.

Is that what makes it impossible to write an install script that builds on top of the rpmfusion nvidia drivers?

Ideally in this case, the packaging architecture would have the rpmfusion nvidia packages install, and then have the bumblebee packages install after them. Would it be possible to get the rpmfusion nvidia package maintainer to change the script so that bumblebee could be built on top of it and both packages could coexist peacefully?

gsgatlin commented 11 years ago

If I knew how to make it work, I think the best way forward would be for the rpmfusion nvidia driver packages to support a desktop with an nvidia card as the boot device, a laptop with optimus, or a PC with one or more "tesla" cards in it and no DVI or VGA ports on the tesla card(s). (So you are using some other graphics card like an onboard VGA of some sort, not necessarily intel. It might be a server motherboard with something exotic, for example...) The older tesla cards are like that. The new ones have DVI ports IIRC. I believe these nvidia tesla cards are used a lot by cuda programmers actually...

Currently the rpmfusion drivers package only works with a desktop system with nvidia as the boot vga device. It does not work with scenario 2 (optimus) or 3 (tesla card without DVI port).

Unfortunately, I have not had time to figure out how to make it work yet. I think I could borrow a tesla card from this grad student I know at my job. But I did not ask him yet. I was barely able to figure out how to make it work on an optimus laptop with the help of a kind person here on github and the hybrid graphics mailing list. Joaquín Aramendía (A.K.A. @Samsagax ) gave me hints and scripts from the archlinux distro back in 2012 and so my script was kind of based on that at the beginning. But I had to keep changing things for fedora as fedora changed. For example, my script tries to shuffle all the nvidia files into their own sub-directory by the use of command line options to the installer. The sub-directory was required beginning with fedora 18 for it to even work.

In order for a rpm package to modify another rpm package it kind of has to be built in through a trigger. So both packagers need to work together on it. But I'm not even at the point where I could make a bugzilla. :(

I think the main problem is that nvidia needs to overwrite one (or more maybe?) libraries when it is the VGA boot device. And it needs to have a specially crafted xorg.conf for some reason. But when nvidia is not the boot device, such as in an optimus laptop, it needs to not have the special xorg.conf and it needs to NOT have overwritten the libraries like /usr/lib64/libGL.so.1.2.0.

So I'm thinking there needs to be like a boot up service (systemd unit file and script or program) that checks to see if the nvidia card was the boot vga device. If so, then make or copy a xorg.conf to the right place and copy over the OS library(s) with the nvidia one(s). If not, then delete or copy the original xorg.conf (if a user even has one) and restore the original OS mesa library files before gdm, kdm, etc. start up. But I haven't figured out how to make it work with a desktop yet. I only spent about half a day on that problem so far because I have been busy. (This weekend, primus rpm package update last night, updating bumblebee-nvidia today for f18 getting a 3.10 kernel, so a patch has to be added for f18, not just f19, and also fixing a bug in the script that only happens when it fails... and $DAYJOB is very busy these days)

It turns out that the boot vga flag is in the /sys/bus/pci/devices//boot_vga file.

I think 0x10de in subsystem_vendor in whatever directory corresponds to NVIDIA Corporation. So a script or program needs to use that info to decide how to behave.
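The boot-time check described above could start from something like the sketch below. It is an illustration under assumptions, not the script from the repo: `boot_vga_vendor` is a made-up helper name, it globs over every device directory under the sysfs PCI tree, and it reads each device's `vendor` file, which is where the kernel exposes the PCI vendor ID (0x10de for NVIDIA), rather than `subsystem_vendor`, which holds the board maker's ID.

```shell
#!/usr/bin/env bash
# Sketch: report which vendor owns the boot VGA device.
# Prints "nvidia" if the device with boot_vga=1 has vendor 0x10de,
# "other" if some other vendor has it, "none" if no such device is found.

boot_vga_vendor() {
    local root flag dev vendor
    root="${1:-/sys/bus/pci/devices}"   # overridable for testing

    for flag in "$root"/*/boot_vga; do
        [ -r "$flag" ] || continue               # no match, or not readable
        [ "$(cat "$flag")" = "1" ] || continue   # not the boot device

        dev="$(dirname "$flag")"
        vendor="$(cat "$dev/vendor" 2>/dev/null)"
        if [ "$vendor" = "0x10de" ]; then
            echo "nvidia"
        else
            echo "other"
        fi
        return 0
    done

    echo "none"
    return 1
}
```

A boot service along the lines discussed here could branch on the printed value: set up the nvidia xorg.conf and libraries for "nvidia", and restore the mesa files for "other".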

Also, there is the question of whether nvidia will be fixing these problems themselves in their installer as we get closer to having PRIME work. I wish I had done a better job on some of this stuff but honestly, it's a bit beyond me at times. Hopefully things will slow down in a month or two when I might find time to really look at it in depth.

Of course I'll be happy to work on it with anyone reading this who is a fedora / RHEL user/packager and is frustrated by the current situation. The sources for everything in my repo are available for anyone who wants to see the gory details and even make changes to make it better.

I think the steps might be:

  1. Figure out how to make "bumblebee-nvidia" work on a desktop AND an optimus laptop.
  2. Once you are past step one, examine the source rpms for the rpmfusion drivers and make any required changes so that they would have the files spread out in a similar fashion to bumblebee-nvidia but still have the bits needed to work with a desktop if the proper boot conditions are met.
  3. Make a new bugzilla at rpmfusion suggesting how it should be changed. Perhaps provide them with patches to the spec file and so forth.
  4. Retire "bumblebee-nvidia" once the rpmfusion drivers "just work" on all systems. Perhaps make an "empty" version of it available at my repo for people who want/need to compile their own versions of blobs for some reason. (Where you would have to download whatever version of the blob you want yourself.)

Right now everything expects a /usr/lib64/nvidia-bumblebee/ but that could be turned into a symlink to something like /usr/lib64/nvidia to make it a more generic solution.

Cheers,

gsgatlin commented 11 years ago

Actually, after chatting with @amonakov on IRC, it may be better to have the libGL.so.* and libglx.so from mesa-libGL package and xorg-x11-server-Xorg package be moved out of the way to be replaced by symlinks. This is how it is done currently on archlinux and gentoo I have learned.

So then the nvidia versions of these libraries can go in /usr/lib64/nvidia and the symlink can be changed with a trigger. Perhaps by a mostly empty "bumblebee-nvidia" package.
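The symlink idea above can be sketched as a small switch function. This is only a guess at the shape of it: `/usr/lib64/nvidia` comes from the thread, but the `mesa/` subdirectory, the `libGL.so.1` link name, and the `switch_libgl` helper are all assumptions made for the example, and a real package would do this from an rpm trigger rather than by hand.

```shell
#!/usr/bin/env bash
# Sketch: repoint LIBDIR/libGL.so.1 at either the mesa or the nvidia copy.
# Refuses to touch the path if it is a regular file rather than a symlink,
# so it cannot clobber a real library.

switch_libgl() {
    local choice libdir target link
    choice="$1"
    libdir="${2:-/usr/lib64}"
    link="$libdir/libGL.so.1"

    case "$choice" in
        mesa)   target="$libdir/mesa/libGL.so.1" ;;     # hypothetical location
        nvidia) target="$libdir/nvidia/libGL.so.1" ;;   # directory named in the thread
        *)
            echo "usage: switch_libgl mesa|nvidia [libdir]" >&2
            return 2
            ;;
    esac

    if [ ! -e "$target" ]; then
        echo "missing target: $target" >&2
        return 1
    fi
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "refusing to replace non-symlink: $link" >&2
        return 3
    fi

    # -n keeps ln from descending into an existing symlink to a directory.
    ln -sfn "$target" "$link"
}
```

Trying it against a scratch directory first (for example `switch_libgl nvidia /tmp/fake-lib64`) avoids touching the real /usr/lib64 while experimenting.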

I will try to file bugs against these two fedora packages in the coming week. (mesa package and xorg-x11-server are the parent packages for both of those files) And we'll see what the maintainers think about this idea. Depending on that we could approach the rpmfusion folks.

MrSampson commented 11 years ago

Well, I have absolutely zero experience with packaging, but I do have access to a system with a Tesla card. (I think it's running Ubuntu.) That would be great if you would file bugs against those packages, and get the maintainers in the loop. I'll have some time next week to see if I can make sense of what's going on.

abidrahmank commented 10 years ago

I have the same issue on my fedora 18 machine. Any update on this issue?

I tried to remove 3.2.1-2, but can't remove it. Below is what I got:

[root@localhost abidrahmank]# package-cleanup --cleandupes
Transaction Test Succeeded
Running Transaction
/var/tmp/rpm-tmp.fzwuUW: line 3: syntax error near unexpected token 'fi'
/var/tmp/rpm-tmp.fzwuUW: line 3: 'fi'
error: %preun(bumblebee-3.2.1-2.fc18.x86_64) scriptlet failed, exit status 2
Error in PREUN scriptlet in rpm package bumblebee-3.2.1-2.fc18.x86_64
  Verifying  : bumblebee-3.2.1-2.fc18.x86_64                                                                                                      1/1 

Failed:
  bumblebee.x86_64 0:3.2.1-2.fc18                                                                                                                     

Complete!

CyrusYzGTt commented 10 years ago

On 2013-09-28 14:49, Abid K wrote:

  • Even re-install is also not possible.
  • Why bumblebee-nvidia is not updated to 319 in fedora 18?

rpm -e --noscripts bumblebee-3.2.1-2.fc18.x86_64

abidrahmank commented 10 years ago

Thank you. I will look into it. But just after posting the above comment and restarting my machine, I can't get into X Windows. I booted in text-only mode and entering startx gave me errors.

Below is my xorg.conf file: https://gist.github.com/abidrahmank/6739832
Below is my error log: https://gist.github.com/abidrahmank/6739844

CyrusYzGTt commented 10 years ago

yum reinstall mesa-* && yum install gdm.x86_64 (if you are on x86_64)

abidrahmank commented 10 years ago

It didn't work.. Still same error. I think my xorg.conf file is somehow changed...

CyrusYzGTt commented 10 years ago

There should be no xorg.conf in /etc/X11/ if you use bumblebee: only xorg.conf.nvidia and xorg.conf.nouveau, or no xorg.conf* at all. For example:

ls /etc/X11/
applnk fontpath.d xinit Xmodmap xorg.conf.d Xresources

gsgatlin commented 10 years ago

@abidrahmank, I think, like @CyrusYzGTt said, you should not have an xorg.conf file.

Does /usr/sbin/bumblebee-nvidia --check

show anything useful?

abidrahmank commented 10 years ago

Hi, thank you @gsgatlin @CyrusYzGTt

I was a little busy since my project presentation was the next day, so I didn't wait. I reinstalled fedora.

So what is the state of CUDA installation in fedora 19 with bumblebee? Has anyone successfully done that? Any installation guide?

MrSampson commented 10 years ago

This is on my list of things to do. Real Soon Now. :-S

Upthread, there's info about how the nvidia packages rename some of the mesa packaged files, so the trick is going to be to get the nvidia and mesa packages to coexist peacefully. When that happens, bumblebee's nvidia packages can be deleted and the cuda packages should install. But I don't have any experience with packaging, so I just haven't gotten over the initial inertia to get started on it.

ArchangeGabriel commented 9 years ago

I didn’t read through this issue, but it’s quite old, and since there is already an issue open about Fedora packaging and another about a current nvidia issue specific to Fedora, is anything here still useful, or should we close this issue?

MrSampson commented 9 years ago

Well, I haven't looked at this in a while, since the need has gone. Fedora 19 is almost EOL anyhow. After I move to Fedora 21, and if the need arises, I'll have a look again. If I can contribute anything worthwhile, and if the problem still exists, I'll open the issue again.