Bumblebee-Project / Bumblebee

Bumblebee daemon and client rewritten in C
http://www.bumblebee-project.org/
GNU General Public License v3.0

RPM Packages for Bumblebee #153

Open gsgatlin opened 12 years ago

gsgatlin commented 12 years ago

Hello.

I am creating packages for Red Hat type systems. I have a problem on CentOS 6 i686 only. The same package works fine on i686 and x86_64 Fedora 15, 16, and 17 and on x86_64 CentOS 6. It's quite odd. My laptop is a Lenovo IdeaPad Y470 with 8 GB of RAM. The rpms are stored in a yum repository at:

http://install.linux.ncsu.edu/pub/yum/itecs/public/bumblebee/

and

http://install.linux.ncsu.edu/pub/yum/itecs/public/bumblebee-nonfree/

bumblebeed will not stay running. If you reboot and start it by hand it will work for one use. Like:

optirun glxgears

but once you close the glxgears window, bumblebeed crashes. It does not do this on a 64 bit OS on the same system (e.g. CentOS 6.2 x86_64). On 64 bit everything is working great. I've generated a bumblebee-bugreport-20120512_225151.tar.gz file, which I will email.

I have more info, which I will add in another comment because it's very long.

gsgatlin commented 12 years ago

If I run bumblebeed by hand with the --debug flag, I get this output:

http://pastebin.com/kyvGsWii

Notice the part that starts with: *** stack smashing detected ***

I wonder if I need to use -fno-stack-protector on this platform only?

If the red hat "spec file" is useful to help debug this issue I have pasted it to:

http://pastebin.com/0UqAKkG6

Let me know if you need any other info? I don't actually plan on using a 32 bit OS on this box at all but I figured some people might install a 32 bit CentOS 6 for some weird reason.

Lekensteyn commented 12 years ago

Can you compile with `CFLAGS="-g -O0"` and paste a new log? Thanks.

gsgatlin commented 12 years ago

Sure. Thanks a lot for looking at this. I pasted the output at:

http://pastebin.com/ZfeSLYps

Looks like it's still ending with the same message: *** stack smashing detected ***: /usr/sbin/bumblebeed terminated

gsgatlin commented 12 years ago

Hello. Sorry. I played around with the bumblebee spec file some more. I changed:

make %{?_smp_mflags} to make CFLAGS="-g -O0" %{?_smp_mflags}

This actually seems to have fixed the issue. bumblebeed no longer seems to be crashing on i686.

http://pastebin.com/mBvJXLQH

So do you think these flags are safe to use on all platforms, or should this only be changed for CentOS 6 i686 systems?

Or maybe there is still some issue that "-g -O0" is hiding? Any ideas appreciated.

Lekensteyn commented 12 years ago

I can't find anything that may have caused it; perhaps the compiler is buggy. -O0 disables all optimizations. What about using -O2?

gsgatlin commented 12 years ago

Greetings. I tried:

make %{?_smp_mflags}

which produces a bumblebeed that crashes after one use on i686 CentOS 6.

make CFLAGS="-g -O2" %{?_smp_mflags}

which works perfectly on i686 CentOS 6. Quite odd! The macro %{?_smp_mflags} should default to -j1, I think. It is being built inside a KVM virtual machine with one virtual CPU. Just "make" by itself also crashes after one use.

I believe %{?_smp_mflags} expands to -j[number of CPUs in the system].
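For reference, the expansion can be inspected on a build host with `rpm --eval '%{?_smp_mflags}'`. The function below is only a rough shell approximation of the macro; `smp_mflags` is a hypothetical helper name, and the rule that a single CPU yields an empty string rather than -j1 is an assumption based on the stock RPM macro definition, which may differ between RPM versions.

```shell
#!/bin/sh
# Hypothetical helper approximating %{?_smp_mflags}.
# Assumption: -jN is emitted only when more than one CPU is online.
smp_mflags() {
    # Use the argument if given, otherwise ask the system for the CPU count.
    ncpus="${1:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}"
    if [ "$ncpus" -gt 1 ]; then
        printf '%s\n' "-j$ncpus"
    fi
}

# Example: the make command line this machine would end up with.
echo "make $(smp_mflags)"
```

If that assumption holds, `make %{?_smp_mflags}` in a one-CPU VM would be equivalent to a plain `make`, which would fit the two behaving identically here.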

I built all of these inside KVM vms because I do not own enough physical hardware to keep around all these systems.

The OS I am building this for is a little bit older because it's an "Enterprise OS", so they backport bug fixes and security updates for all the software, including gcc. Its gcc is version 4.4.6, while in newish Fedoras it is newer. The kernel is also a bit older:

[gsgatlin@localhost specs]$ uname -a
Linux localhost.localdomain 2.6.32-220.13.1.el6.i686 #1 SMP Tue Apr 17 22:09:08 BST 2012 i686 i686 i386 GNU/Linux

gcc versions in Fedora:

Fedora 15: 4.6.3
Fedora 16: 4.6.3
Fedora 17: 4.7.0

In the fedoras I never saw any problems on 32 bit. Just this one OS and arch. Do you think I should add a conditional in my spec file for stuff older than fedora 15 to use:

make CFLAGS="-g -O2" %{?_smp_mflags}

and then fedora 15 and newer should use:

make %{?_smp_mflags}

Any ideas greatly appreciated.

Lekensteyn commented 12 years ago

Yeah, I've looked it up and _smp_mflags seems to add a -j flag. I wonder if it's the optimization flag (-Ox) or the debugging flag (-g). Since it's likely a compiler issue, I suggest you try different optimization levels (0, 1, 2 or even 3) and see if it helps. If you know a bit of assembly, you can also objdump the binaries and diff them.

gsgatlin commented 12 years ago

Hello. Just to follow up, further tests indicate that it doesn't matter which switches are added. It's very strange.

make CFLAGS="-g -O2" %{?_smp_mflags} #works
make CFLAGS="-O2" %{?_smp_mflags} #works
make CFLAGS="-g" %{?_smp_mflags} #works
make CFLAGS="-O3" %{?_smp_mflags} #works
make CFLAGS="-O1" %{?_smp_mflags} #works
make CFLAGS="-O0" %{?_smp_mflags} #works

make %{?_smp_mflags} #fails

make #fails

I guess maybe I should add a conditional for this distro and maybe use -O3 ?

gsgatlin commented 12 years ago

Oops. The comments on those last two were a typo. :) They were:

make %{?_smp_mflags} #fails
make #fails

cheers

Lekensteyn commented 12 years ago

I think there is some global CFLAGS var that is being overwritten. Please check the output of configure while compiling using the spec file.

gsgatlin commented 12 years ago

Aha! that makes sense. Well... Quite the line.

CFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables'

Let me see if I can narrow down what is actually causing the issue. I'll try one by one to figure it out.
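The one-by-one approach could be scripted roughly as below. This is a dry-run sketch: `list_single_flag_builds` is a hypothetical helper that only prints one candidate make command per flag and does not build anything. It assumes each build would start from a clean tree.

```shell
#!/bin/sh
# CFLAGS as reported by configure during the spec-file build (copied from above).
DISTRO_CFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables'

# Hypothetical helper: print one candidate build command per flag (dry run only).
list_single_flag_builds() {
    # The unquoted expansion splits the string on whitespace, one flag per word.
    for flag in $1; do
        printf 'make CFLAGS="%s"\n' "$flag"
    done
}

list_single_flag_builds "$DISTRO_CFLAGS"
```

Each printed command would then be run by hand (or piped to `sh`), followed by an `optirun glxgears` test, noting which flag reproduces the crash.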

gsgatlin commented 12 years ago

It seems "-fstack-protector" has some effect on this.

make CFLAGS="-fstack-protector" #fails
make CFLAGS="" #works

So perhaps any conditional would need to exclude "-fstack-protector" from the default CFLAGS. But only on fedoras <=15 and on arch i686 only?
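A minimal sketch of excluding a single flag from a flag string follows. `drop_flag` is a hypothetical helper, and how its result would be wired into the spec file (for example around the distro's default optimization flags) is left open here. It only hides the symptom; it does not address whatever overflow the stack protector is detecting.

```shell
#!/bin/sh
# Hypothetical helper: remove every exact occurrence of one flag from a flag string.
drop_flag() {
    unwanted="$1"
    shift
    kept=""
    # The unquoted expansion splits the remaining arguments into individual flags.
    for flag in $*; do
        if [ "$flag" != "$unwanted" ]; then
            kept="${kept:+$kept }$flag"
        fi
    done
    printf '%s\n' "$kept"
}

# Example with a shortened version of the distro CFLAGS quoted above.
drop_flag -fstack-protector "-O2 -g -fexceptions -fstack-protector -m32"
```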

gsgatlin commented 12 years ago

Just for a reference it looks like this spec file:

http://pastebin.com/VimAHTjp

builds with the correct CFLAGS on 32 bit and 64 bit VMs. It takes a few days to test properly, however, because I have only one laptop and have to re-install Fedoras and RHELs in my spare time at night. So let me know if you think that would be ok, or if there is a bug with bumblebeed and the -fstack-protector CFLAGS that could be fixed?

cheers,

Lekensteyn commented 12 years ago

Probably this bug. I know that there is a place where we use a char array of size 100: https://github.com/Bumblebee-Project/Bumblebee/blob/master/src/switch/sw_bbswitch.c#L40

gsgatlin commented 12 years ago

Thank you for your help with this problem.

I still need to do more tests, but I think I am mostly finished. Here are URLs to the spec files I made, in case I get hit by a car or something before I can write docs and go on IRC, and someone wants to pick up where I left off:

bumblebee: http://pastebin.com/wqeZyJ9S
bbswitch: http://pastebin.com/Q8j6xKWn
acpi-handle-hack: http://pastebin.com/zfdK3DPK
acpi-handle-hack-nouveau: http://pastebin.com/Gu2YHJJj
bumblebee-nvidia-32: http://pastebin.com/FJEeT5c1
bumblebee-nvidia-64: http://pastebin.com/qAbF2fJL

I will try to write up some docs and then I'll be in touch on #bumblebee-dev on Freenode. The bumblebee package at:

http://install.linux.ncsu.edu/pub/yum/itecs/public/bumblebee/

and the nvidia drivers at:

http://install.linux.ncsu.edu/pub/yum/itecs/public/bumblebee-nonfree/

have been updated. I will work on writing some docs next. I was able to build a pair of VirtualGL rpms on Fedora, but it was kind of ugly, so I need some Fedora experts to look at how it could be built with a modern libjpeg-turbo to make it more acceptable to the community as a submission. (These problems are not there on RHEL 6 / CentOS 6.)

Thanks again.

amonakov commented 11 years ago

The stack smashing looks like a real bug in the bumblebee sources, now fixed by commit 034f2ecd80239b93b929c650cdc9fd2a30a45791

gsgatlin commented 11 years ago

Updated package at review request here with patch to fix stack smashing:

https://bugzilla.redhat.com/show_bug.cgi?id=827167

Also, documentation has been updated on my docs site with some recent changes:

http://techies.ncsu.edu/wiki/bumblebee

Basically, things are stalled with getting bumblebee into Fedora / EPEL due to problems with VirtualGL building in "rawhide" ("rawhide" is what will become Fedora 18 when it's ready).

See https://bugzilla.redhat.com/show_bug.cgi?id=839060 and https://bugzilla.redhat.com/show_bug.cgi?id=834127 for more information on that.

Assuming I were to have luck with bumblebee and VirtualGL in fedora / EPEL, I will look at doing a bbswitch and acpi-handle-hack in a third party repository such as "rpmfusion" and "elrepo."

I have not been hanging out in your developer IRC channel since everything is stalled and there is no certainty about getting these packages into fedora. If both packages were to be accepted into fedora and build for all branches at some point then I will start hanging out in there.

Cheers,

amonakov commented 11 years ago

It looks like your package is based on version 3.0.0, but 3.0.1 has been released, which includes some fixes, most importantly a fix for a critical bug affecting users with new Kepler-based laptops. The stack smashing fix is also included in 3.0.1.

gsgatlin commented 11 years ago

Cool. I'll update that to 3.0.1 then. Thanks for the info.

gsgatlin commented 11 years ago

Ok. Updated to version 3.0.1 at the fedora review request and also at my test repository. I tested it briefly with minecraft and it seems to be working well. Thanks again for the info on the new version.

gustavokrm commented 11 years ago

Hello, I used these rpm packages plus the bumblebee-nvidia one and I am facing some problems as we speak. I cannot get software like Totem or Cheese to work with optirun: no video. Wine is not working properly either. How can I fix this? I'm willing to supply info for you guys, just ask.

amonakov commented 11 years ago

What are you trying to achieve by running video players via optirun? Can you use Intel hardware decoding via libva? Bumblebee does not perform video decoding offloading, only OpenGL offloading.

Wine is a 32-bit program, if you're running it on a 64-bit OS via optirun, you'll need 32-bit VirtualGL.

gustavokrm commented 11 years ago

I was trying to test this, but it didn't work. Also, optirun bit.trip.runner, supertuxkart, whatever doesn't work either. I get no video either way, even if they are native games. I removed the bumblebee-nvidia package but the error persists. There is no real use for the card then.. =\ I know Wine is 32 bit, but I couldn't figure out how to make it work. It wouldn't work anyway, even with a 32 bit prefix, latest packages and 32 bit VirtualGL ( yes, i double-checked ). With the Intel card, everything functions as usual, but I cannot run games via wine.

gsgatlin commented 11 years ago

Hello. I am the author of these RPM packages. In the process of getting VirtualGL into fedora and EPEL, I think we/I broke multilib support. So if you have a 64 bit system I think running 32 bit apps is currently broken, both in the package in my repo, and the package in fedora "testing" for f16, f17, and f18. I will see what I can do to fix that. I have used optirun with the 64 bit version of bit.trip.runner so I know it works on at least my laptop... (With Nvidia driver, in bumblebee-nvidia package. I eat my own dog food, but I don't run wine which may be why I missed this.)

I'll reply back to this thread once I have some more info on this. The version on the VirtualGL site may not suffer from this bug, but it does not meet fedora packaging guidelines either. You may want to check that out if you can't wait?

The version currently at my repo will hit fedora in about 3 more days I think...

Cheers,

gsgatlin commented 11 years ago

I didn't remember that correctly. After looking again I get

Fatal error failed to create SDL window: Couldn't find matching GLX visual

When launching bit.trip.runner via optirun. But that particular game seems to perform fine on the intel card. I have used optirun on some other humble bundle games however. (Cogs was one that improved the most) So I'm pretty sure at least many 64 bit openGL apps work on fedora. May wish to try:

optirun glxgears

just to test the most basic functionality of your bumblebee setup.

Lekensteyn commented 11 years ago

You might be interested in https://github.com/amonakov/primus as well.

gsgatlin commented 11 years ago

Ok, If you are using my fedora/rhel bumblebee repo, do a yum update and that should fix multilib support. The version in fedora will be fixed soon.

If you are having trouble running

optirun glxgears

you may wish to run

/usr/bin/bumblebee-bugreport

to help figure out what is going on with your laptop.

Special thanks to Andy Kwong for the multilib patch.

gustavokrm commented 11 years ago

So, testing this against Fedora 18 beta now. Installing bumblebee went just fine. When I installed the bumblebee-nvidia package, I experienced a big problem: I couldn't reach the graphic interface. It would start blinking, trying to start gdm, but it couldn't. Removing this package and reinstalling bumblebee solved the problem.

gsgatlin commented 11 years ago

Yeah. I could not get it to work on Fedora 18 either. I suspect it is a SELinux issue. I actually use CentOS 6 as my main desktop, not Fedora 18 beta, so it's working ok for my system. I do have a spare Y470 laptop with f18 beta on it. I will try to look at it tomorrow to see if it might be possible to figure out how to make it work. See also https://github.com/Bumblebee-Project/Bumblebee/issues/298 which is preventing me from testing bumblebee with the nouveau driver on my particular laptop make and model.

I know it worked on f17 the last time I tested it. Sorry about that. You might want to try building it by hand with:

/usr/sbin/bumblebee-nvidia --debug

to see if that works better. That was going to be the first thing I try with f18.

gustavokrm commented 11 years ago

Yes, thanks for the reply. I will try to work that out too..

gsgatlin commented 11 years ago

If I use the --opengl-libdir=lib64/nvidia-bumblebee and --x-module-path=/usr/lib64/nvidia-bumblebee/xorg/modules installer arguments in my script, I get the same error, XORG Failed to load module "mouse" (module does not exist, 0), that I get with the nouveau driver in issue #298. I will keep trying next week as time permits and possibly try that technique on some older Fedoras, but it may be that my hardware is too weird, because of needing acpi-handle-hack, to continue to be able to get this working on my system. That would make testing impossible. :( I am very sorry about that. If I can't get it working soon I will remove the Fedora 18 bumblebee-nvidia rpms from my repo and web site and make a little note about why. I did try for several hours today but had no luck, so I am quitting for tonight.

gsgatlin commented 11 years ago

Hello,

I have created a bumblebee-nvidia-304.64-2 package for f18. Can you let me know if this fixes the issues? I was able to get this to work on CentOS 6 x86_64 with few problems. I think it might fix the issues on f18. If it does I will push this update to CentOS 6 and f16/f17. If anyone else is following along the new shell script is visible at: http://pastebin.com/Qcd2k6nJ

Thanks a lot.

gustavokrm commented 11 years ago

So, I tested this today on Fedora 18 beta and it seems to be working fine so far. I was able to log-in, used optirun on a couple of native games and it worked too. Thanks!

Just a question: am I supposed to be able to access nvidia-settings? because running nvidia-xconfig breaks everything. (sorry, I'm not a tech user).

gsgatlin commented 11 years ago

I think you're not supposed to use nvidia-xconfig because bumblebee uses a special xorg.conf for its screen, and I don't think you normally need to edit it. But if you do need to, the file is at:

/etc/bumblebee/xorg.conf.nvidia

I also realized there is a bug for 32 bit installs so I'll be pushing out a new version of my script after I test it later today.

Thanks for the feedback and testing this rpm package.

gsgatlin commented 11 years ago

Hello,

So the 32 bit NVidia installer and my wrapper script + 32 bit CentOS 6 is giving me errors when I try to use optirun. Here is my script:

http://pastebin.com/56x6FQi2

The /var/log/messages log file has:

Dec 20 16:21:40 localhost kernel: pci 0000:01:00.0: power state changed by ACPI to D0
Dec 20 16:21:40 localhost kernel: pci 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Dec 20 16:21:40 localhost bumblebeed[3540]: Loading driver nvidia (module nvidia)
Dec 20 16:21:41 localhost kernel: nvidia 0000:01:00.0: power state changed by ACPI to D0
Dec 20 16:21:41 localhost kernel: nvidia 0000:01:00.0: power state changed by ACPI to D0
Dec 20 16:21:41 localhost kernel: nvidia 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Dec 20 16:21:41 localhost kernel: vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=none
Dec 20 16:21:41 localhost kernel: NVRM: loading NVIDIA UNIX x86 Kernel Module 310.19 Wed Nov 7 23:22:09 PST 2012
Dec 20 16:21:41 localhost bumblebeed[3540]: Starting X server on display :8.
Dec 20 16:21:43 localhost kernel: vmap allocation for size 16781312 failed: use vmalloc= to increase size.
Dec 20 16:21:43 localhost bumblebeed[3540]: XORG NVIDIA(0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
Dec 20 16:21:43 localhost bumblebeed[3540]: XORG NVIDIA(0): check your system's kernel log for additional error
Dec 20 16:21:43 localhost bumblebeed[3540]: XORG NVIDIA(0): messages and refer to Chapter 8: Common Problems in the
Dec 20 16:21:43 localhost bumblebeed[3540]: XORG NVIDIA(0): README for additional information.
Dec 20 16:21:43 localhost bumblebeed[3540]: XORG NVIDIA(0): Failed to initialize the NVIDIA graphics device!
Dec 20 16:21:43 localhost bumblebeed[3540]: XORG NVIDIA(0): Failing initialization of X screen 0
Dec 20 16:21:43 localhost bumblebeed[3540]: XORG Screen(s) found, but none have a usable configuration.
Dec 20 16:21:43 localhost kernel: NVRM: RmInitAdapter failed! (0x26:0xffffffff:1057)
Dec 20 16:21:43 localhost kernel: NVRM: rm_init_adapter(0) failed
Dec 20 16:21:43 localhost bumblebeed[3540]: X did not start properly

This script seems to work ok on a 64 bit system with the same distro. (RHEL 6 x86_64) (My production box)

The box has 8 GB of RAM. Does anyone have any ideas what I could be doing wrong? I notice the "vmap allocation for size 16781312 failed: use vmalloc= to increase size." message right before it fails. Could that have something to do with why it's not working?

Thanks for any ideas anyone has. It's very odd that the 64 bit version works ok and the 32 bit version doesn't.
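As a starting point for chasing the vmap message, the requested size can be pulled out of the log and put in perspective. This is a sketch only: `vmap_failed_bytes` is a hypothetical helper. The idea that the 32 bit kernel's smaller vmalloc area (which does not grow with installed RAM) is the cause is a guess, not a confirmed diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: extract the requested size in bytes from a vmap failure line on stdin.
vmap_failed_bytes() {
    sed -n 's/.*vmap allocation for size \([0-9][0-9]*\) failed.*/\1/p'
}

# Sample line copied from the /var/log/messages excerpt above.
sample='Dec 20 16:21:43 localhost kernel: vmap allocation for size 16781312 failed: use vmalloc= to increase size.'

bytes=$(printf '%s\n' "$sample" | vmap_failed_bytes)
echo "failed request: $bytes bytes (about $((bytes / 1024 / 1024)) MiB)"

# On a live system the current vmalloc limits could be compared with:
#   grep -i vmalloc /proc/meminfo
```

The kernel message itself points at the `vmalloc=` boot parameter; whether raising it is enough on this laptop would need testing.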

gsgatlin commented 11 years ago

Ok. I discovered that the issue on CentOS / RHEL 6 x86 with bumblebee-nvidia has been around for a while, so I'll have to figure that out separately. I pushed out an update of bumblebee-nvidia to 310.19-2 today. I tested it on f18, and though it was tricky to get a box into a state where I could test it, optirun did work ok on both x86 and x86_64. (For now you need to disable the updates-testing repo, because of timing, to be able to use yum correctly.) The new version also works fine on f17/f16/el6 x86_64. I believe f18 comes out on Jan 8th, so we should be good to go when it officially comes out. Thanks a lot for reporting this issue.

The fixed up shell script is posted here: http://pastebin.com/0KueJYvg

Question for any bumblebee developers concerning "primus." Is primus intended to work with the nouveau driver or only the nvidia driver? I get the impression that primus is only for nvidia?

I am going to try to build a rpm package for primus to go in my repo and so I wanted to understand that aspect of it better. I guess it could also matter if I were to build a rpm package it would affect where it could be submitted for review.

I was able to build primus from source on a CentOS 6 x86_64 box with some third party packages added so I know it works. I have not tested any games yet, however.

Thanks,

amonakov commented 11 years ago

primus is driver-agnostic, so it should work with both nvidia and nouveau.

Watch out: if Mesa libGL is built without shared libglapi, primus will not work. I'm told this is the case for F16 and F17.
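One way to check for a shared libglapi might be to look at what the Mesa libGL links against. The sketch below is an assumption-laden illustration: `links_shared_glapi` is a hypothetical helper, the libGL path in the comment is a guess for x86_64 Fedora, and the two `ldd` lines are made-up sample data so the snippet runs anywhere.

```shell
#!/bin/sh
# Hypothetical helper: read ldd output on stdin and succeed if libglapi is listed.
links_shared_glapi() {
    grep -q 'libglapi\.so'
}

# Intended use on a real system (the path is an assumption):
#   ldd /usr/lib64/libGL.so.1 | links_shared_glapi && echo "shared libglapi in use"

# Canned example with made-up ldd lines.
printf '%s\n' \
    'libX11.so.6 => /usr/lib64/libX11.so.6 (0x00007f0000000000)' \
    'libglapi.so.0 => /usr/lib64/libglapi.so.0 (0x00007f0000100000)' \
    | links_shared_glapi && echo "shared libglapi in use"
```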

gsgatlin commented 11 years ago

Hello Alexander,

Thank you for your hard work creating primus software.

I have had some success on Fedora 17 with an rpm package I created. It works fine with the nvidia drivers, but only when I have the bumblebee-nvidia rpm installed. If I remove it and try to run something using the nouveau driver, I get this error:

[gsgatlin@localhost ~]$ primusrun glxgears
primus: fatal: failed to load any of the libraries: /usr/$LIB/nvidia-bumblebee/libGL.so.1
/usr/$LIB/nvidia-bumblebee/libGL.so.1: cannot open shared object file: No such file or directory

This is not surprising since /usr/$LIB/nvidia-bumblebee/ doesn't exist once I remove bumblebee-nvidia.
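The `$LIB` in that path is a token the dynamic loader substitutes at load time (lib or lib64 on Red Hat type systems, depending on the architecture of the running program). The sketch below only mimics that substitution so the resulting path can be checked by hand; `expand_lib_token` is a hypothetical helper, not part of primus.

```shell
#!/bin/sh
# Hypothetical helper: mimic the loader's $LIB substitution for a given libdir name.
expand_lib_token() {
    # $1 = path containing a literal $LIB, $2 = directory name such as lib or lib64
    printf '%s\n' "$1" | sed "s|\\\$LIB|$2|g"
}

# The two paths primus would try for 64 bit and 32 bit programs respectively.
expand_lib_token '/usr/$LIB/nvidia-bumblebee/libGL.so.1' lib64
expand_lib_token '/usr/$LIB/nvidia-bumblebee/libGL.so.1' lib
```

Running `ls` on each printed path shows quickly whether the library the error message complains about is actually installed.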

(optirun works fine in this situation, with nouveau)

Is there something that would need to be changed within primusrun to make it work with the nouveau driver? Perhaps the value of the PRIMUS_libGLa variable needs to be modified within primusrun in this situation?

Alternatively I could just make the package depend on bumblebee-nvidia as a "Requires:" line. This means it would only be a nvidia driver solution. Let me know what you think would be the best way to handle that?

Also, is this software licensed using the ISC license? I ask due to the license tag in the fedora/ RHEL rpm spec file which is required. If not what license does this software use?

I was unsure about a version so I used 0.0.date_from_the_last_commit since other distros seem to be doing that.

Here is a link to my first attempt at a rpm spec file:

http://pastebin.com/33BVKfv3

I will try to test this on x86_64 RHEL 6 later tonight after I re-install my spare laptop.

Thanks a lot.

If anyone needs to install both the 32 bit and 64 bit primus rpms, use the following (once my repos are set up with primus, which they are not quite yet):

yum install primus primus.i686

on an x86_64 system, which is different from what some other rpm based distros such as openSUSE are doing. I did test this and it works fine with the nvidia driver. I have not had a chance to test any games yet. My 32 bit tests were done by scp-ing a glxgears executable from a 32 bit VM onto the 64 bit laptop to test with optirun and primusrun.

amonakov commented 11 years ago

> Is there something that would need to be changed within primusrun to make it work with the nouveau driver? Perhaps the value of the PRIMUS_libGLa variable needs to be modified within primusrun in this situation?

Yes, one can override PRIMUS_libGLa in primusrun script to point to Mesa libGL (without rebuilding primus).

> Alternatively I could just make the package depend on bumblebee-nvidia as a "Requires:" line. This means it would only be a nvidia driver solution. Let me know what you think would be the best way to handle that?

That is what I did when packaging primus for Arch Linux and Gentoo, and IIRC how currently available Ubuntu packages are built as well. So far, primus was not tested much with open-source nvidia drivers.

> Also, is this software licensed using the ISC license?

Correct.

I see that your rpm spec "requires" mesa-libGLES for primus. Is it because that package contains libglapi.so.0? Then I'm surprised your tests passed, because it implies that mesa-libGL is built without shared libglapi, and primus would segfault on the first OpenGL call in Mesa. It also should not "require" llvm-libs.

gsgatlin commented 11 years ago

Hello. Thanks so much for your feedback. Here is attempt #2.

http://pastebin.com/hmMsQVMD

Here is the patch called primus-0.0-nouveaufix.patch

http://pastebin.com/kQjeAGG4

I had some issues I'm afraid. I guess I do not understand the role of libglapi.so.0 with primus. I originally thought it was required to compile it. That is why I did what I did with all the "Requires" in attempt #1. (Throw everything at the wall and see what sticks, haha)

I was able to build this on RHEL 6 and fedora 17. It installs and works awesomely when you use a benchmarking utility such as glxgears.

But when I try to play a game such as minecraft, the performance is not very good. like 15 fps compared to 50-80 fps with VirtualGL.

I thought this might be limited to RHEL 6, but I set up a Fedora 17 laptop this afternoon and put minecraft on it, and its performance is not so good either. Here are some screenshots...

This is launching minecraft via "primusrun"

http://i.imgur.com/SMjfm.png

This is launching minecraft via "optirun"

http://i.imgur.com/rzFts.png

I am unsure why it performs so poorly. I suspect it may have something to do with the libglapi.so.0 stuff you mentioned. Let me try to explain it as best I can with my limited understanding.

On fedora 17, there is a source package called mesa. The actual package name is mesa-8.0.4-1.fc17.src.rpm

When this package gets built, a big build directory is worked through and all parts of mesa are compiled. It actually downloads several tarballs and does make a few times after a configure section.

Then, various bits of "mesa" are assembled into different "subpackages"

mesa-libGLES is one of these "subpackages" from "mesa"

It looks like this in the spec file:

...
%package libEGL
Summary: Mesa libEGL runtime libraries
Group: System Environment/Libraries
Requires(post): /sbin/ldconfig
Requires(postun): /sbin/ldconfig

%description libEGL
Mesa libEGL runtime libraries

%package libGLES
Summary: Mesa libGLES runtime libraries
Group: System Environment/Libraries
Requires(post): /sbin/ldconfig
Requires(postun): /sbin/ldconfig

%description libGLES
Mesa GLES runtime libraries

%package dri-filesystem
...

The fact that it is in a separate "rpm" file is just a fedora/red hat thing to split functionality up AFAIK.

When I did attempt #1, I saw that the make file had this "libglapi.so.0" file featured prominently, so I assumed I would need mesa-libGLES to build the primus rpm (since that rpm contains that file). But that turned out not to be the case. It will happily build even with an older mesa (7.11-5) from RHEL 6 that, as far as I know, doesn't even have a mesa-libGLES or libglapi.so.0 available. That is why I thought I was going to need a newer version from elrepo, but I don't. It builds fine on the RHEL 6 laptop but is just as slow at running minecraft as Fedora 17 was.

Are there any particular games that are a good test for the primus software? I have purchased Humble Indie Bundle 3, 4, 6, and 7, and I could also possibly download some kind of open source game if it is known to work well with primus. So far I have only tested "Cogs" (just picked this one at random) and "Minecraft", and both appeared to be very slow with primus. But the glxgears benchmark ran insanely fast. Very odd.

Anyways, let me know what you think I should try next or if you think this is hopeless with a Red Hat type distro?

I could put these in my yum repository but if they work so slowly I'd be afraid people might be mad. But maybe the slowness is limited to my setup somehow. Just not sure how to go about troubleshooting that.

I'm pretty happy with optirun but would like to get primusrun working if that is the future of hybrid graphics on Linux...

I'll leave fedora 17 on that spare laptop for a few days in case I can think of something else....

Thanks for your time.

gsgatlin commented 11 years ago

One thought I had was that if there was a shared libglapi problem with the mesa-8.0.4-1.fc17.src.rpm, perhaps there would be a way to tell, like with ldd ? Then perhaps a make line could be tweaked to get around that? Just a thought. I also found out that

vblank_mode=0 primusrun glxgears

is very fast on f17 and runs a long time whereas the same command hard locks my CentOS 6 (RHEL 6 with mesa 7.11-5 and no libglapi.so.0) after a few seconds and I have to power cycle it. But both systems run a long time if I leave off the vblank_mode=0 part at about like 60 fps. I'm getting 1700 fps on f17 with the "vblank_mode=0 primusrun glxgears" command. (same nvidia driver version on both rhel 6 and f17)

Cheers,

amonakov commented 11 years ago

> Here is the patch called primus-0.0-nouveaufix.patch

Could you drop it? Power users who want to play with nouveau can do that themselves, and I prefer that defaults are the same everywhere.

> The actual package name is mesa-8.0.4-1.fc17.src.rpm

I looked at it and it seems to package libglapi.so.0 into a separate libglapi rpm, not into mesa-GLES, since March 28, 2012. That is correct. If you have libglapi.so.0 provided by mesa-GLES, you probably have an outdated system?

Can you show output of

LIBGL_DEBUG=verbose PRIMUS_VERBOSE=2 primusrun glxgears

and the same for a slow case (with Cogs or Minecraft).

gsgatlin commented 11 years ago

Sure. I will do a version 3 a little bit later today without the patch and I'll try the tests.

Please note that by dropping that patch primus cannot be submitted to the fedora distro for a formal review. This is because they would not allow a package into their distro that depends on any closed source software without user modifications. They are pretty militant about that. So the primus package will have to be a solution that just lives in my private repository at NCSU. (And whoever wants to copy it to their own repo) But I'm totally cool with that. I'll add a "Requires" for "bumblebee-nvidia"

Thanks a lot.

gsgatlin commented 11 years ago

Hello. Here is RHEL 6:

http://pastebin.com/Z0sabeLw

RHEL 6 with cogs:

http://pastebin.com/dvpgrji2

Fedora 17 with glxgears:

http://pastebin.com/A1GWCjLq

Fedora 17 with minecraft:

http://pastebin.com/Usm3na63

Here is a little text file I made that may help to explain "mesa" on various supported Red Hat type systems...

http://pastebin.com/m870qbB8

Here is my attempt 3 at a spec file:

http://pastebin.com/bbxFwEQg

If anyone is following along who wants to play with these for some reason, the packages I made so far are on the www at:

http://install.linux.ncsu.edu/pub/yum/itecs/public/bumblebee/test/

Also, here is a picture of cogs running with optirun on my main RHEL 6 box:

http://i.imgur.com/xhxjd.jpg

And here is it running with primus. Notice how it looks "weird" compared to optirun. (Not shiny or ray traced looking)

http://i.imgur.com/ChHFM.jpg

Let me know if I can try anything else here. I have the sources to various mesa rpms on all of my virtual machines so it would be trivial to rebuild a given "mesa" with any modifications I was directed to make in any of the spec files.

Thanks for your help.

amonakov commented 11 years ago

Hm, judging from the pastebin'd logs, it looks like primus is not loaded at all in your tests (although it's strange that Cogs looks different). And indeed, it seems you've removed the sed statement from the spec file in attempts two and three. Why? Unless you modify primusrun before packaging, that sed statement is necessary.

gsgatlin commented 11 years ago

Hello. when

PRIMUS_libGL='/usr/$LIB/primus'

is set on a fedora 17 box, it works with starting glxgears, but on a RHEL 6 system, the following error is generated:

primus: fatal: failed to load any of the libraries: /usr/$LIB/nvidia-bumblebee/libGL.so.1
libnvidia-tls.so.310.19: cannot open shared object file: No such file or directory

So when I took the sed out, the error went away. So I assumed it was not needed or it was a problem. Any ideas what I should try for RHEL 6 and this error?

gsgatlin commented 11 years ago

Hmnn. If I do this before running the script with PRIMUS_libGL='/usr/$LIB/primus' on RHEL 6:

export LD_LIBRARY_PATH=/usr/lib64/nvidia-bumblebee
[gsgatlin@localhost ~]$ /usr/bin/primusrun-test glxgears
primus: fatal: failed to load PRIMUS_LOAD_GLOBAL

So maybe that's a sign I need to get the newer mesa that has a libglapi.so.0 in it? The one from elrepo for RHEL 6 that is made with mesa 8? But I guess /usr/lib64/nvidia-bumblebee also needs to get set up in the primusrun script with the LD_LIBRARY_PATH for RHEL 6 somehow?

gsgatlin commented 11 years ago

It gets interesting/different if I try the same exact thing on fedora 17. It works, but the gears spin extremely slow/clunky. Also, this is displayed:

export LD_LIBRARY_PATH=/usr/lib64/nvidia-bumblebee
[gsgatlin@localhost ~]$ ./primusrun-test glxgears
Xlib: extension "NV-GLX" missing on display ":0".
Running synchronized to the vertical refresh. The framerate should be approximately the same as the monitor refresh rate.
343 frames in 5.3 seconds = 65.149 FPS
300 frames in 5.0 seconds = 59.694 FPS

Hope this information helps...

amonakov commented 11 years ago

> libnvidia-tls.so.310.19: cannot open shared object file: No such file or directory

See a comment in the primusrun script about libnvidia-tls.

> So maybe thats a sign I need to get the newer mesa that had a libglapi.so.0 in it?

Yes.

> Xlib: extension "NV-GLX" missing on display ":0".

This means that primus is not loaded and nVidia libGL is used to draw on Intel card, resulting in bad performance.
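The three symptoms discussed in this exchange can be told apart from the program output alone. The sketch below encodes just those three explanations; `explain_primus_output` is a hypothetical helper written for this thread, not something shipped with primus.

```shell
#!/bin/sh
# Hypothetical helper: map one line of primusrun output to the explanation given in this thread.
explain_primus_output() {
    case "$1" in
        *'libnvidia-tls'*'cannot open shared object file'*)
            echo "libnvidia-tls not in search path: see the libnvidia-tls comment in primusrun" ;;
        *'failed to load PRIMUS_LOAD_GLOBAL'*)
            echo "libglapi.so.0 missing: a Mesa build with shared libglapi is needed" ;;
        *'extension "NV-GLX" missing'*)
            echo "primus not loaded: nVidia libGL is drawing on the Intel display" ;;
        *)
            echo "no known pattern" ;;
    esac
}

explain_primus_output 'Xlib: extension "NV-GLX" missing on display ":0".'
```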

gsgatlin commented 11 years ago

One last comment for today... I see now the section where it says

# On some distributions, e.g. on Gentoo, libnvidia-tls.so is not available
# in default search paths. Add its path manually after the primus library
# PRIMUS_libGL=${PRIMUS_libGL}:/usr/\$LIB/opengl/nvidia/lib

and adding:

PRIMUS_libGL=${PRIMUS_libGL}:/usr/\$LIB/bumblebee-nvidia

produces the same results: extremely slow and clunky on f17, and primus: fatal: failed to load PRIMUS_LOAD_GLOBAL on RHEL 6. But I'm guessing that would be the "proper" fix if it worked properly on these distros.

Hmnn. OK. I just saw your comment now. Sorry. hehe. So I'll have to modify that bit as well. I can do that. But as for the slow clunkiness on Fedora 17: how could I fix the issue with primus not being loaded and nVidia libGL being used to draw on the Intel card? Any ideas? Or at least maybe there could be some way to see why it fails to load?

Thanks so much and have a good evening.