Bumblebee-Project / Bumblebee

Bumblebee daemon and client rewritten in C
http://www.bumblebee-project.org/
GNU General Public License v3.0

Bumblebee not working on Lenovo Thinkpad P50 with Centos 6.8-6.9 and 7.3-7.5 #974

Open wrthissell opened 5 years ago

wrthissell commented 5 years ago

Dear Colleagues, I previously had Bumblebee working on a Lenovo ThinkPad W541 with CentOS 6.8. I no longer have access to that computer. I moved the partitions to a ThinkPad P50, ran yum update to reach CentOS 6.9, and also installed CentOS 7.3 through 7.5. I have been unable to get Bumblebee to work on the new computer under any of these CentOS versions.
I submitted a bug report last November at:

https://bugs.launchpad.net/lpbugreporter/+bug/752542

A search for my username will result in the files I uploaded then. Here are my current results:

[wrthissell@new-host-4 ~]$ uname -r
3.10.0-862.9.1.el7.x86_64
[wrthissell@new-host-4 ~]$ optirun --debug glxgears64
[ 233.925594] [DEBUG]optirun version 3.2.1 starting...
[ 233.925631] [DEBUG]Active configuration:
[ 233.925645] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[ 233.925668] [DEBUG] X display: :8
[ 233.925689] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib
[ 233.925703] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 233.925719] [DEBUG] Accel/display bridge: auto
[ 233.925733] [DEBUG] VGL Compression: proxy
[ 233.925752] [DEBUG] VGLrun extra options:
[ 233.925766] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib32/primus:/usr/lib64/primus
[ 233.925892] [DEBUG]Using auto-detected bridge virtualgl
[ 238.692727] [INFO]Response: No - error: [XORG] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied
[ 238.692776] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied
[ 238.692803] [DEBUG]Socket closed.
[ 238.692865] [ERROR]Aborting because fallback start is disabled.
[ 238.692899] [DEBUG]Killing all remaining processes.
[wrthissell@new-host-4 ~]$

I have tried both the NCSU and the elrepo builds. I currently have the NCSU packages installed.
I have previously made two posts on this issue at the elrepo bug tracker:

http://elrepo.org/bugs/view.php?id=737
http://elrepo.org/bugs/view.php?id=742

Thank you in advance for your assistance in helping me resolve the issue.

20170725 Bumblebee Debug.tar.gz

gsgatlin commented 5 years ago

Are you using CentOS 7 on that machine? I'm using Fedora, but I have a spare machine I can try tomorrow with CentOS 6 or 7, depending on which version you are running. It's been a month and a half since I last tested CentOS. I made the packages hosted at NCSU.

wrthissell commented 5 years ago

Dear gsgatlin, Thank you very much for your prompt reply. Yes, I am using CentOS 7.5; the kernel build is listed near the top of my submission in the uname -r output. I suspect the issue is related to the Lenovo ThinkPad P50 and my grub kernel load options. I have read many issue threads about different machines and OS builds, each offering different kernel load options to resolve the issue. The Lenovo ThinkPad W541 is the model immediately before the ThinkPad P50. Thank you again in advance for your assistance.

gsgatlin commented 5 years ago

Hmm. I installed bumblebee on an IdeaPad Y470 notebook (a pretty old machine) using the instructions from:

https://www.linux.ncsu.edu/bumblebee

It appears to be working OK with both the virtualgl and primus bridges.

This is CentOS 7 with the latest kernel, 3.10.0-862.9.1.el7.x86_64.

[gsgatlin@y470 ~]$ cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)

Did the drivers build OK? What is the output from the following command?

bumblebee-nvidia --check

There is an existing issue that seems to describe a similar problem:

https://github.com/Bumblebee-Project/Bumblebee/issues/580

There are also some notes about it here: https://wiki.debian.org/Bumblebee

gsgatlin commented 5 years ago

Here are some Fedora issues that were similar. I know you are running CentOS 7, but it's a similar distro to Fedora.

https://github.com/Bumblebee-Project/Bumblebee/issues/859

https://github.com/Bumblebee-Project/Bumblebee/issues/824

https://github.com/Bumblebee-Project/Bumblebee/issues/42

Besides posting the output from

bumblebee-nvidia --check

can you paste the output from journalctl or /var/log/messages to https://pastebin.com/, https://paste.fedoraproject.org/, or a similar site right after you run optirun or primusrun and it fails?
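Since a full syslog can be far too large to paste, it may help to keep only the lines that mention the components involved. The following is a minimal sketch, not part of the Bumblebee tooling: the function name bumblebee_log_lines and the keyword list are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read log text on stdin and keep only the lines that
# mention bumblebeed, bbswitch, or the NVIDIA driver. The keyword list is an
# assumption; extend it if other components turn out to be relevant.
bumblebee_log_lines() {
    grep -E 'bumblebeed|bbswitch|NVRM|nvidia'
}
```

For example, running "journalctl -b --no-pager | bumblebee_log_lines > bumblebee-log.txt" (or feeding it /var/log/messages) right after a failed optirun should leave a file small enough to paste.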

Sadly, I'm pretty sure the elrepo bumblebee packages have been abandoned by their creator (Rob Mokkink).

Thanks,

wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your help.

[wrthissell@LAPTOP-BKIJEPGK ~]$ bumblebee-nvidia --check

nvidia.ko compiled into in the kernel tree ok. modinfo output for NVIDIA:

filename: /lib/modules/3.10.0-862.9.1.el7.x86_64/kernel/drivers/video/nvidia.ko
alias: char-major-195-
version: 390.48
supported: external
license: NVIDIA
retpoline: Y
rhelversion: 7.5
srcversion: FA33B00C00A6F70EC9CF314
alias: pci:v000010DEd00000E00svsdbc04sc80i00
alias: pci:v000010DEdsvsdbc03sc02i00
alias: pci:v000010DEdsvsdbc03sc00i00
depends: ipmi_msghandler,i2c-core
vermagic: 3.10.0-862.9.1.el7.x86_64 SMP mod_unload modversions
parm: NVreg_Mobile:int
parm: NVreg_ResmanDebugLevel:int
parm: NVreg_RmLogonRC:int
parm: NVreg_ModifyDeviceFiles:int
parm: NVreg_DeviceFileUID:int
parm: NVreg_DeviceFileGID:int
parm: NVreg_DeviceFileMode:int
parm: NVreg_UpdateMemoryTypes:int
parm: NVreg_InitializeSystemMemoryAllocations:int
parm: NVreg_UsePageAttributeTable:int
parm: NVreg_MapRegistersEarly:int
parm: NVreg_RegisterForACPIEvents:int
parm: NVreg_CheckPCIConfigSpace:int
parm: NVreg_EnablePCIeGen3:int
parm: NVreg_EnableMSI:int
parm: NVreg_TCEBypassMode:int
parm: NVreg_UseThreadedInterrupts:int
parm: NVreg_EnableStreamMemOPs:int
parm: NVreg_EnableBacklightHandler:int
parm: NVreg_EnableUserNUMAManagement:int
parm: NVreg_EnableIBMNPURelaxedOrderingMode:int
parm: NVreg_MemoryPoolSize:int
parm: NVreg_IgnoreMMIOCheck:int
parm: NVreg_RegistryDwords:charp
parm: NVreg_RegistryDwordsPerDevice:charp
parm: NVreg_RmMsg:charp
parm: NVreg_AssignGpus:charp

Check bbswitch kernel module...

bbswitch is loaded into the current kernel ok.

All checks completed successfully! NVIDIA driver appears to have compiled ok.

Documentation on bumblebee for RHEL / CentOS / fedora can be found at: https://www.linux.ncsu.edu/bumblebee/

20180728 messages.tar.gz

I tried pastebin.com, but I could not figure out how to upload a file. The raw messages file is about 369 MB, but it compressed down to 14 MB, so I uploaded it here for your reference.

Thank you again for your help.

gsgatlin commented 5 years ago

Hello.

pastebin doesn't accept file uploads. You paste text into a box and hit submit, and it creates a temporary page where others can view that text. But I was able to download your file with the messages syslog.

So I'm seeing a couple of really weird things here... It may be that you've already fixed these and started out from elrepo but I want to make sure.

The first is that the executable is actually called glxgears (not glxgears64, which is what you were trying to run).

[gsgatlin@logicbomb ~]$ which glxgears
/usr/bin/glxgears
[gsgatlin@logicbomb ~]$ rpm -qf /usr/bin/glxgears
glx-utils-8.2.0-3.el7.x86_64

The second involves the line in your output when first reporting the bug:

[ 233.925766] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib32/primus:/usr/lib64/primus

On my box I get:

[gsgatlin@t540p ~]$ optirun --debug glxgears
[11397.451970] [DEBUG]optirun version 3.2.1 starting...
[11397.452025] [DEBUG]Active configuration:
[11397.452031] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[11397.452043] [DEBUG] X display: :8
[11397.452079] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib
[11397.452093] [DEBUG] Socket path: /var/run/bumblebee.socket
[11397.452102] [DEBUG] Accel/display bridge: auto
[11397.452111] [DEBUG] VGL Compression: proxy
[11397.452123] [DEBUG] VGLrun extra options:
[11397.452131] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus
[11397.452185] [DEBUG]Using auto-detected bridge virtualgl
[11398.036825] [INFO]Response: Yes. X is active.
[11398.036841] [INFO]Running application using virtualgl.
[11398.036944] [DEBUG]Process vglrun started, PID 19012.

(HERE IS WHERE I CLOSE SPINNING GEARS WINDOW)

[VGL] ERROR: in readback--
[VGL] 254: Window has been deleted by window manager
[11401.643736] [DEBUG]SIGCHILD received, but wait failed with No child processes
[11401.643753] [DEBUG]Socket closed.
[11401.643764] [DEBUG]Killing all remaining processes.

Notice the line in my output:

[11397.452131] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus

I wonder if the packages have gotten mixed up somehow with someone else's packages? I'm pretty sure this PATH is generated by:

export CONF_PRIMUS_LD_PATH="/usr/lib/primus:/usr/lib64/primus"

within my bumblebee rpm spec file at compile time. (Sources for all packages are at: https://github.com/gsgatlin/optimus-rpms )

On your box it was trying to use Debian-style paths when you first reported this, or maybe generic "all systems" paths, but not the ones I build with, which are more specific to the Red Hat family.

[gsgatlin@t540p bumblebee]$ rpm -q bumblebee --changelog | grep gsgatlin
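To compare the two debug outputs mechanically, the Primus LD Path line can be pulled out and checked against the value quoted from the spec file. This is only a sketch: both function names are hypothetical, and the expected value is simply the CONF_PRIMUS_LD_PATH string shown above.

```shell
#!/usr/bin/env bash
# Print the value of the "Primus LD Path" line found in optirun --debug
# output supplied on stdin (prints nothing if the line is absent).
primus_ld_path() {
    sed -n 's/.*Primus LD Path: *//p'
}

# Succeed only if the given path equals the value quoted from the NCSU
# spec file in this thread.
is_ncsu_primus_path() {
    [ "$1" = "/usr/lib/primus:/usr/lib64/primus" ]
}
```

A possible use is is_ncsu_primus_path "$(optirun --debug glxgears 2>&1 | primus_ld_path)" and, if that fails, rpm -qf /etc/bumblebee/bumblebee.conf to see which package owns the configuration file.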

I would try removing all bumblebee*, bbswitch, primus, and virtualgl packages, temporarily disabling the elrepo yum repository, then adding back the packages from my repo and rebooting. Then try it again and see what you get. You can always re-enable elrepo while excluding certain packages like bumblebee, primus, bbswitch, etc. if there are other packages you need from elrepo.
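Because these steps remove packages, it can be safer to print the commands first and review them before running anything. The sketch below does only that; reinstall_plan is a hypothetical helper, and the repository id and package names passed to it are assumptions that should be checked against yum repolist and rpm -qa on the actual machine.

```shell
#!/usr/bin/env bash
# Hypothetical dry run: print (do not execute) the yum commands for a clean
# reinstall with one repository disabled.
# Usage: reinstall_plan REPO_ID PACKAGE...
reinstall_plan() {
    local repo=$1
    shift
    printf 'yum remove %s\n' "$*"
    printf 'yum --disablerepo=%s install %s\n' "$repo" "$*"
    printf 'reboot\n'
}
```

Calling reinstall_plan elrepo bumblebee bbswitch primus prints three lines that can be reviewed and then run as root one at a time.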

wrthissell commented 5 years ago

Dear gsgatlin, I am reviewing:

https://forums.opensuse.org/showthread.php/505270-ultimate-tutorial-installing-Bumblebee-driver-for-SUSE

Xorg.8.log

bumblebee-nvidia.conf.txt
bumblebee.conf.txt

Here is the /etc/modprobe.d/bumblebee.conf:

bumblebee.conf.modprobe.d.txt

[wrthissell@LAPTOP-BKIJEPGK ~]$ nvidia-settings -c :8

ERROR: Unable to find display on any available system

Thank you again for your help.

[wrthissell@LAPTOP-BKIJEPGK ~]$ systemctl -l status bumblebeed
● bumblebeed.service - Bumblebee C Daemon
Loaded: loaded (/usr/lib/systemd/system/bumblebeed.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2018-07-28 13:42:41 EDT; 5h 6min ago
Main PID: 1171 (bumblebeed)
CGroup: /system.slice/bumblebeed.service
└─1171 /usr/sbin/bumblebeed

Jul 28 15:44:18 LAPTOP-BKIJEPGK bumblebeed[1171]: [ 7303.023667] [ERROR][XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA graphics device!
Jul 28 15:44:18 LAPTOP-BKIJEPGK bumblebeed[1171]: [ 7303.023677] [ERROR][XORG] (EE) NVIDIA(0): Failing initialization of X screen 0
Jul 28 15:44:18 LAPTOP-BKIJEPGK bumblebeed[1171]: [ 7303.023684] [ERROR][XORG] (EE) Screen(s) found, but none have a usable configuration.
Jul 28 15:44:18 LAPTOP-BKIJEPGK bumblebeed[1171]: [ 7303.023691] [ERROR][XORG] (EE)
Jul 28 15:44:18 LAPTOP-BKIJEPGK bumblebeed[1171]: [ 7303.023698] [ERROR][XORG] (EE) no screens found(EE)
Jul 28 15:44:18 LAPTOP-BKIJEPGK bumblebeed[1171]: [ 7303.023704] [ERROR][XORG] (EE)
Jul 28 15:44:18 LAPTOP-BKIJEPGK bumblebeed[1171]: [ 7303.023711] [ERROR][XORG] (EE) Please also check the log file at "/var/log/Xorg.8.log" for additional information.
Jul 28 15:44:18 LAPTOP-BKIJEPGK bumblebeed[1171]: [ 7303.023719] [ERROR][XORG] (EE)
Jul 28 15:44:18 LAPTOP-BKIJEPGK bumblebeed[1171]: [ 7303.023725] [ERROR][XORG] (EE) Server terminated with error (1). Closing log file.
Jul 28 15:44:18 LAPTOP-BKIJEPGK bumblebeed[1171]: [ 7303.024411] [ERROR]X did not start properly

gsgatlin commented 5 years ago

Yeah. My packages don't make a /etc/modprobe.d/bumblebee.conf.

Also, my /etc/bumblebee/bumblebee.conf doesn't set PrimusLibraryPath, so you probably want to remove that line.

gsgatlin commented 5 years ago

I think the forum post is specific to suse. With my packages hosted at ncsu the driver is contained within the bumblebee-nvidia package.

gsgatlin commented 5 years ago

[gsgatlin@t540p ~]$ cat /etc/modprobe.d/bumblebee.conf
blacklist nvidia
blacklist nouveau

gsgatlin commented 5 years ago

Sorry, correction, there is a /etc/modprobe.d/bumblebee.conf with the modules listed in previous comment.
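A quick way to confirm that a modprobe.d file carries the two blacklist entries shown above is sketched below. The function name missing_blacklists is hypothetical, and the check assumes each entry sits on its own line in the form "blacklist MODULE".

```shell
#!/usr/bin/env bash
# Hypothetical check: print each of the two modules (nvidia, nouveau) that is
# NOT blacklisted in the given modprobe.d file. Prints nothing when both
# entries are present.
missing_blacklists() {
    local file=$1 module
    for module in nvidia nouveau; do
        grep -Eq "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*$" "$file" \
            || printf '%s\n' "$module"
    done
}
```

For example, missing_blacklists /etc/modprobe.d/bumblebee.conf should print nothing on a machine whose file matches the one in the previous comment.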

gsgatlin commented 5 years ago

Also, the command to start nvidia settings is:

optirun -b none nvidia-settings -c :8

So try that.

wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your help. I followed the instructions: I removed bumblebee, bbswitch, primus, and virtualgl with yum and then reinstalled them. I ran dracut --regenerate-all --force afterward, and rebooted both after erasing and after yum install. Here are the results:

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun --debug glxgears
[ 85.089686] [DEBUG]optirun version 3.2.1 starting...
[ 85.089771] [DEBUG]Active configuration:
[ 85.089809] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[ 85.089842] [DEBUG] X display: :8
[ 85.089880] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib
[ 85.089920] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 85.089959] [DEBUG] Accel/display bridge: auto
[ 85.089997] [DEBUG] VGL Compression: proxy
[ 85.090053] [DEBUG] VGLrun extra options:
[ 85.090087] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus
[ 85.090366] [DEBUG]Using auto-detected bridge virtualgl
[ 89.977518] [INFO]Response: No - error: [XORG] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied
[ 89.977532] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied
[ 89.977537] [DEBUG]Socket closed.
[ 89.977565] [ERROR]Aborting because fallback start is disabled.
[ 89.977572] [DEBUG]Killing all remaining processes.

[wrthissell@LAPTOP-BKIJEPGK ~]$ bumblebee-nvidia --check

nvidia.ko compiled into in the kernel tree ok. modinfo output for NVIDIA:

filename: /lib/modules/3.10.0-862.9.1.el7.x86_64/kernel/drivers/video/nvidia.ko
alias: char-major-195-
version: 390.48
supported: external
license: NVIDIA
retpoline: Y
rhelversion: 7.5
srcversion: FA33B00C00A6F70EC9CF314
alias: pci:v000010DEd00000E00svsdbc04sc80i00
alias: pci:v000010DEdsvsdbc03sc02i00
alias: pci:v000010DEdsvsdbc03sc00i00
depends: ipmi_msghandler,i2c-core
vermagic: 3.10.0-862.9.1.el7.x86_64 SMP mod_unload modversions
parm: NVreg_Mobile:int
parm: NVreg_ResmanDebugLevel:int
parm: NVreg_RmLogonRC:int
parm: NVreg_ModifyDeviceFiles:int
parm: NVreg_DeviceFileUID:int
parm: NVreg_DeviceFileGID:int
parm: NVreg_DeviceFileMode:int
parm: NVreg_UpdateMemoryTypes:int
parm: NVreg_InitializeSystemMemoryAllocations:int
parm: NVreg_UsePageAttributeTable:int
parm: NVreg_MapRegistersEarly:int
parm: NVreg_RegisterForACPIEvents:int
parm: NVreg_CheckPCIConfigSpace:int
parm: NVreg_EnablePCIeGen3:int
parm: NVreg_EnableMSI:int
parm: NVreg_TCEBypassMode:int
parm: NVreg_UseThreadedInterrupts:int
parm: NVreg_EnableStreamMemOPs:int
parm: NVreg_EnableBacklightHandler:int
parm: NVreg_EnableUserNUMAManagement:int
parm: NVreg_EnableIBMNPURelaxedOrderingMode:int
parm: NVreg_MemoryPoolSize:int
parm: NVreg_IgnoreMMIOCheck:int
parm: NVreg_RegistryDwords:charp
parm: NVreg_RegistryDwordsPerDevice:charp
parm: NVreg_RmMsg:charp
parm: NVreg_AssignGpus:charp

Check bbswitch kernel module...

bbswitch is loaded into the current kernel ok.

All checks completed successfully! NVIDIA driver appears to have compiled ok.

Documentation on bumblebee for RHEL / CentOS / fedora can be found at: https://www.linux.ncsu.edu/bumblebee/

I am attaching the most recent /var/log/messages as a tar.gz file:

20180729 messages.tar.gz

Could the issue be related to my kernel load command? I have added several options in the year and a half that I have been trying to resolve the issue. Here is the latest grub.cfg:

20180729 grub.cfg.txt

Here is the dmesg output:

[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo dmesg | grep -C 10 bbswitch
[sudo] password for wrthissell:
[ 4.467622] SELinux: 2048 avtab hash slots, 106550 rules.
[ 4.488354] SELinux: 8 users, 14 roles, 5009 types, 310 bools, 1 sens, 1024 cats
[ 4.488356] SELinux: 97 classes, 106550 rules
[ 4.491903] SELinux: Completing initialization.
[ 4.491904] SELinux: Setting up existing superblocks.
[ 4.507170] systemd[1]: Successfully loaded SELinux policy in 120.715ms.
[ 4.517681] systemd[1]: RTC configured in localtime, applying delta of -240 minutes to system time.
[ 4.530295] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 4.530333] systemd[1]: Inserted module 'iptables'
[ 4.549660] systemd[1]: Relabelled /dev and /run in 17.453ms.
[ 4.706908] bbswitch: loading out-of-tree module taints kernel.
[ 4.707003] bbswitch: module verification failed: signature and/or required key missing - tainting kernel
[ 4.707285] bbswitch: version 0.8
[ 4.707290] bbswitch: Found integrated VGA device 0000:00:02.0: _SB.PCI0.GFX0
[ 4.707295] bbswitch: Found discrete VGA device 0000:01:00.0: _SB.PCI0.PEG0.PEGP
[ 4.707304] ACPI Warning: _SB_.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 4.707435] bbswitch: detected an Optimus _DSM function
[ 4.707447] pci 0000:01:00.0: enabling device (0000 -> 0003)
[ 4.707492] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on
[ 4.724896] systemd-journald[601]: Received request to flush runtime journal from PID 1
[ 4.852164] random: crng init done
[ 4.943340] resource sanity check: requesting [mem 0xfed10000-0xfed17fff], which spans more than pnp 00:01 [mem 0xfed10000-0xfed13fff]
[ 4.943345] caller ie31200_init_one+0x100/0x580 [ie31200_edac] mapping multiple BARs
[ 4.943616] EDAC MC0: Giving out device to 'ie31200_edac' 'IE31200': DEV 0000:00:00.0
[ 4.944462] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 4.946473] i801_smbus 0000:00:1f.4: enabling device (0000 -> 0003)
[ 4.946621] i801_smbus 0000:00:1f.4: SMBus using PCI interrupt
[ 4.955260] thinkpad_acpi: ThinkPad ACPI Extras v0.25
[ 4.955262] thinkpad_acpi: http://ibm-acpi.sf.net/

[ 68.909113] ACPI Warning: _SB_.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 68.909190] ACPI Warning: _SB_.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 68.909239] ACPI Warning: _SB_.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 73.132680] NVRM: failed to copy vbios to system memory.
[ 73.133200] NVRM: RmInitAdapter failed! (0x30:0xffff:663)
[ 73.133356] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 73.133535] [drm:nv_drm_load [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
[ 73.134185] [drm:nv_drm_probe_devices [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Failed to register device
[ 73.365835] nvidia-modeset: Unloading
[ 73.366208] nvidia-nvlink: Unregistered the Nvlink Core, major device number 238
[ 73.368368] bbswitch: disabling discrete graphics
[ 73.368376] ACPI Warning: _SB_.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 73.379610] pci_raw_set_power_state: 3 callbacks suppressed
[ 73.379614] pci 0000:01:00.0: Refused to change power state, currently in D0
[ 84.727835] bbswitch: enabling discrete graphics
[ 85.026789] nvidia-nvlink: Nvlink Core is being initialized, major device number 238
[ 85.027043] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=none
[ 85.027144] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 390.48 Thu Mar 22 00:42:57 PDT 2018 (using threaded interrupts)
[ 85.055968] nvidia 0000:01:00.0: irq 144 for MSI/MSI-X
[ 85.057871] ACPI Warning: _SB_.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 85.057941] ACPI Warning: _SB_.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 85.057994] ACPI Warning: _SB_.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 85.058048] ACPI Warning: _SB_.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 85.058102] ACPI Warning: _SB_.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 85.058179] ACPI Warning: _SB_.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)

Thank you again for your help.

gsgatlin commented 5 years ago

Does it change anything if you remove

rcutree.rcu_idle_gp_delay=1 nvidia-drm.modeset=1 pcie_port_pm=off acpi=on acpi_rev_override=5

from the kernel boot line? (Edit /etc/default/grub and run either grub2-mkconfig -o /boot/grub2/grub.cfg or grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg, depending on which one you have, or both if both files exist.)
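The edit to /etc/default/grub can be previewed before touching the real file. The sketch below is one possible way to do that, not something shipped with the packages: strip_kernel_args is a hypothetical helper that prints the file with the named arguments dropped from the GRUB_CMDLINE_LINUX line and leaves the file itself unchanged. It assumes the line is a single double-quoted string and that none of the arguments being removed contain spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print FILE with the listed kernel arguments removed
# from its GRUB_CMDLINE_LINUX="..." line. The file is only read, never written.
# Usage: strip_kernel_args FILE ARG...
strip_kernel_args() {
    local file=$1 line value word arg drop
    local -a words kept
    shift
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == GRUB_CMDLINE_LINUX=\"*\" ]]; then
            # Take the text between the surrounding double quotes.
            value=${line#GRUB_CMDLINE_LINUX=\"}
            value=${value%\"}
            read -ra words <<< "$value"
            kept=()
            for word in "${words[@]}"; do
                drop=0
                for arg in "$@"; do
                    # Exact match only, so acpi=on does not remove acpi_rev_override=5.
                    [[ $word == "$arg" ]] && drop=1
                done
                (( drop )) || kept+=("$word")
            done
            printf 'GRUB_CMDLINE_LINUX="%s"\n' "${kept[*]}"
        else
            printf '%s\n' "$line"
        fi
    done < "$file"
}
```

After checking the printed result, the same change can be made in /etc/default/grub by hand, followed by the grub2-mkconfig command for whichever grub.cfg exists on the machine.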

If removing those kernel arguments doesn't help, can you upload /var/log/Xorg.8.log?

Is hybrid graphics enabled in the BIOS/UEFI, rather than the setting that always uses the discrete card? Evidently this can be configured on this particular notebook.

Also, it turns out that "error: [XORG] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied" is bogus and we can just ignore it. This always fails due to problems in libdrm, but does not affect Bumblebee's functionality. See https://github.com/Bumblebee-Project/Bumblebee/issues/652 for more information. I know it's annoying to have useless error messages, but there is little we can do about that...
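Since that message can be ignored, it may be easier to read the X log with it filtered out. A minimal sketch follows; real_xorg_errors is a hypothetical name, and the filter assumes error lines are marked with (EE) as in the output quoted in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical filter: print the (EE) lines of an Xorg log file, skipping the
# harmless "failed to set DRM interface version" message and (EE) lines that
# carry no text.
real_xorg_errors() {
    grep -F '(EE)' "$1" \
        | grep -v 'failed to set DRM interface version' \
        | grep -Ev '\(EE\)[[:space:]]*$'
}
```

For example, real_xorg_errors /var/log/Xorg.8.log should leave only the errors worth investigating.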

wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your assistance in helping me resolve this issue. I removed the above lines from the file you referenced and I executed the grub2-mkconfig command as you recommended above and then rebooted. My BIOS is configured to use hybrid graphics. Here are the results:

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun --debug glxgears
[ 71.865971] [DEBUG]optirun version 3.2.1 starting...
[ 71.865996] [DEBUG]Active configuration:
[ 71.866009] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[ 71.866018] [DEBUG] X display: :8
[ 71.866026] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib
[ 71.866035] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 71.866050] [DEBUG] Accel/display bridge: auto
[ 71.866063] [DEBUG] VGL Compression: proxy
[ 71.866071] [DEBUG] VGLrun extra options:
[ 71.866087] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus
[ 71.866177] [DEBUG]Using auto-detected bridge virtualgl
[ 76.456951] [INFO]Response: No - error: [XORG] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied
[ 76.456967] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied
[ 76.456973] [DEBUG]Socket closed.
[ 76.456990] [ERROR]Aborting because fallback start is disabled.
[ 76.456997] [DEBUG]Killing all remaining processes.

I read https://github.com/Bumblebee-Project/Bumblebee/issues/652 above. One of the recommended changes was disabling the WiFi card and then trying again. Here are the results:

Same results as above.

I read last November that, for another user with this issue, the conflict with libdrm was caused by a libdrm update from 2.56 to 2.58. I tried at that time to update the libdrm rpm to the then-latest version, 2.88. That did not work, so I reverted to the stock libdrm from the CentOS release.
Near the bottom of the issue linked above is a patch:

http://pastebin.com/raw/ea5jKJ15

Is this patch part of your package? If not, may we try it?
Here is a link to pull request 771:

https://github.com/Bumblebee-Project/Bumblebee/pull/771

I noticed that there is a merge into the develop branch. I tried last year to create an rpm from the development branch, and it did not work once installed: I received a segmentation fault. I suspect it is because I do not have sufficient knowledge to build rpms from GitHub development branches for bumblebee.
Thank you again for your assistance.

wrthissell commented 5 years ago

Dear gsgatlin, Below are the log files after I made the kernel load command changes you recommended:

Xorg.8.log.txt Xorg.0.log.txt

gsgatlin commented 5 years ago

The main problem seems to be

"Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please"

and

"Failed to initialize the NVIDIA graphics device"

But I am unsure why this is happening. Perhaps one of the bumblebee developers will have some ideas, assuming some of them are still reading these GitHub issues.

I pushed out a newer version of the driver today. Does "yum update bumblebee-nvidia" followed by a reboot help? You should be updating to bumblebee-nvidia-390.77-1. I doubt it will help, but it's worth a try...

This seems very similar to https://github.com/Bumblebee-Project/Bumblebee/issues/542

Maybe you should add back rcutree.rcu_idle_gp_delay=1 but not the other arguments, just to see if it might help? Or, if you add the arguments back, you could try them one at a time rather than all in one go.

I have access to 4 different Optimus laptops through my job, plus ones I bought with my own money. Three of them require no kernel arguments. The fourth requires only one argument: "acpi_osi=!Windows 2013" with Fedora 25/26/27.

A laptop a student brought to the help desk required acpi_osi=!Windows 2009 which I think was one of those dell xps machines. But I don't remember which one it was.

I can look at adding that patch. I think it just hides the bogus error messages. I can look at it tomorrow. I haven't looked at doing a new release in a long time because there hasn't been a new official version since 3.2.1.

https://github.com/Bumblebee-Project/Bumblebee/commit/fb3d960fd8facbcfa58381701454c6853966a704

If I have time to add that patch I will put the package on the web where you can download it to try it out.

wrthissell commented 5 years ago

Dear gsgatlin, I just tried:

sudo yum update bumblebee-nvidia*

It could not find an update. I then manually checked:

https://linux.itecs.ncsu.edu/redhat/public/bumblebee-nonfree/rhel7/x86_64/

and the update is not there. Is it somewhere else? Thank you again for your assistance.

gsgatlin commented 5 years ago

Sorry. I see what happened. Try it again. There was a bug in my staging script.

yum clean all

yum update bumblebee-nvidia

wrthissell commented 5 years ago

Dear gsgatlin, Thank you again for your assistance. I found and installed the update, rebooted without any of the kernel load arguments below, and then reproduced the error with optirun --debug glxgears. I then tried each of the following arguments in grub.cfg one at a time, replacing the previous one each time:

rcutree.rcu_idle_gp_delay=1
nvidia-drm.modeset=1
pcie_port_pm=off
acpi=on
acpi_rev_override=5
acpi_osi=!Windows 2013
acpi_osi=!Windows 2009

Here are the identical results I received after each reboot:

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun --debug glxgears
[ 98.242558] [DEBUG]optirun version 3.2.1 starting...
[ 98.242588] [DEBUG]Active configuration:
[ 98.242598] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[ 98.242608] [DEBUG] X display: :8
[ 98.242617] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib
[ 98.242634] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 98.242643] [DEBUG] Accel/display bridge: auto
[ 98.242659] [DEBUG] VGL Compression: proxy
[ 98.242673] [DEBUG] VGLrun extra options:
[ 98.242687] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus
[ 98.242807] [DEBUG]Using auto-detected bridge virtualgl
[ 102.744923] [INFO]Response: No - error: [XORG] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied
[ 102.744965] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied
[ 102.744985] [DEBUG]Socket closed.
[ 102.745075] [ERROR]Aborting because fallback start is disabled.
[ 102.745094] [DEBUG]Killing all remaining processes.
[wrthissell@LAPTOP-BKIJEPGK ~]$ bumblebee-nvidia --check

nvidia.ko compiled into in the kernel tree ok. modinfo output for NVIDIA:

filename: /lib/modules/3.10.0-862.9.1.el7.x86_64/kernel/drivers/video/nvidia.ko
alias: char-major-195-
version: 390.77
supported: external
license: NVIDIA
retpoline: Y
rhelversion: 7.5
srcversion: 209B1D0CB123DE466F700AD
alias: pci:v000010DEd00000E00svsdbc04sc80i00
alias: pci:v000010DEdsvsdbc03sc02i00
alias: pci:v000010DEdsvsdbc03sc00i00
depends: ipmi_msghandler,i2c-core
vermagic: 3.10.0-862.9.1.el7.x86_64 SMP mod_unload modversions
parm: NVreg_Mobile:int
parm: NVreg_ResmanDebugLevel:int
parm: NVreg_RmLogonRC:int
parm: NVreg_ModifyDeviceFiles:int
parm: NVreg_DeviceFileUID:int
parm: NVreg_DeviceFileGID:int
parm: NVreg_DeviceFileMode:int
parm: NVreg_UpdateMemoryTypes:int
parm: NVreg_InitializeSystemMemoryAllocations:int
parm: NVreg_UsePageAttributeTable:int
parm: NVreg_MapRegistersEarly:int
parm: NVreg_RegisterForACPIEvents:int
parm: NVreg_CheckPCIConfigSpace:int
parm: NVreg_EnablePCIeGen3:int
parm: NVreg_EnableMSI:int
parm: NVreg_TCEBypassMode:int
parm: NVreg_UseThreadedInterrupts:int
parm: NVreg_EnableStreamMemOPs:int
parm: NVreg_EnableBacklightHandler:int
parm: NVreg_EnableUserNUMAManagement:int
parm: NVreg_EnableIBMNPURelaxedOrderingMode:int
parm: NVreg_MemoryPoolSize:int
parm: NVreg_IgnoreMMIOCheck:int
parm: NVreg_RegistryDwords:charp
parm: NVreg_RegistryDwordsPerDevice:charp
parm: NVreg_RmMsg:charp
parm: NVreg_AssignGpus:charp

Check bbswitch kernel module...

bbswitch is loaded into the current kernel ok.

All checks completed successfully! NVIDIA driver appears to have compiled ok.

Documentation on bumblebee for RHEL / CentOS / fedora can be found at: https://www.linux.ncsu.edu/bumblebee/

Thank you again for your assistance.

wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your assistance. I just referenced this issue at:

https://www.centos.org/forums/viewtopic.php?f=49&t=66178&p=284988#p284988

Someone there may have resolved the issue, but no details are posted.

gsgatlin commented 5 years ago

Hello.

I looked at that patch you linked. The bottom part is already being done (and more) in the bumblebee-modprobefix.patch...

https://paste.fedoraproject.org/paste/aGFXGhCXNoXQgbpi4HHDag

I created a new package in a new branch called experimental1

https://github.com/gsgatlin/optimus-rpms/tree/experimental1
https://github.com/gsgatlin/optimus-rpms/commit/8cf9524efc3e2381767071d26a89b74d179d8965

It appears to remove "failed to set DRM interface version 1.4: Permission denied" from the syslog. It does still appear in my /var/log/Xorg.8.log, and I am unsure whether it would still appear on the command line in your situation. Sadly, I don't think it would fix the underlying problem. (I get that error message, but my bumblebee works fine in CentOS 7 and Fedora 27.)

Still, I compiled it and placed it on the web at this url:

https://linux.itecs.ncsu.edu/redhat/public/bumblebee/experimental1/

Package name is bumblebee-3.2.1-14.el7.x86_64.rpm

I can look into building packages off the develop branch of

https://github.com/bumblebee-Project/Bumblebee/

and

https://github.com/Bumblebee-Project/bbswitch

if there is time tomorrow or Friday. Perhaps I can make the version numbers bumblebee 3.3 and bbswitch 0.8.5, based on the develop branch of both repos. I'm just making up version numbers here.

wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your assistance. I installed the files using yum. Here are the results with the acpi_osi=!Windows 2009 kernel load command:

[wrthissell@LAPTOP-BKIJEPGK ~]$ bumblebee-nvidia --check

nvidia.ko compiled into in the kernel tree ok. modinfo output for NVIDIA:

filename: /lib/modules/3.10.0-862.9.1.el7.x86_64/kernel/drivers/video/nvidia.ko alias: char-major-195- version: 390.77 supported: external license: NVIDIA retpoline: Y rhelversion: 7.5 srcversion: 209B1D0CB123DE466F700AD alias: pci:v000010DEd00000E00svsdbc04sc80i00 alias: pci:v000010DEdsvsdbc03sc02i00 alias: pci:v000010DEdsvsdbc03sc00i00 depends: ipmi_msghandler,i2c-core vermagic: 3.10.0-862.9.1.el7.x86_64 SMP mod_unload modversions parm: NVreg_Mobile:int parm: NVreg_ResmanDebugLevel:int parm: NVreg_RmLogonRC:int parm: NVreg_ModifyDeviceFiles:int parm: NVreg_DeviceFileUID:int parm: NVreg_DeviceFileGID:int parm: NVreg_DeviceFileMode:int parm: NVreg_UpdateMemoryTypes:int parm: NVreg_InitializeSystemMemoryAllocations:int parm: NVreg_UsePageAttributeTable:int parm: NVreg_MapRegistersEarly:int parm: NVreg_RegisterForACPIEvents:int parm: NVreg_CheckPCIConfigSpace:int parm: NVreg_EnablePCIeGen3:int parm: NVreg_EnableMSI:int parm: NVreg_TCEBypassMode:int parm: NVreg_UseThreadedInterrupts:int parm: NVreg_EnableStreamMemOPs:int parm: NVreg_EnableBacklightHandler:int parm: NVreg_EnableUserNUMAManagement:int parm: NVreg_EnableIBMNPURelaxedOrderingMode:int parm: NVreg_MemoryPoolSize:int parm: NVreg_IgnoreMMIOCheck:int parm: NVreg_RegistryDwords:charp parm: NVreg_RegistryDwordsPerDevice:charp parm: NVreg_RmMsg:charp parm: NVreg_AssignGpus:charp

Check bbswitch kernel module...

bbswitch is loaded into the current kernel ok.

All checks completed successfully! NVIDIA driver appears to have compiled ok.

Documentation on bumblebee for RHEL / CentOS / fedora can be found at: https://www.linux.ncsu.edu/bumblebee/

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun --debug glxgears [ 725.209794] [DEBUG]optirun version 3.2.1 starting... [ 725.209821] [DEBUG]Active configuration: [ 725.209833] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf [ 725.209843] [DEBUG] X display: :8 [ 725.209851] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib [ 725.209861] [DEBUG] Socket path: /var/run/bumblebee.socket [ 725.209872] [DEBUG] Accel/display bridge: auto [ 725.209881] [DEBUG] VGL Compression: proxy [ 725.209891] [DEBUG] VGLrun extra options: [ 725.209901] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus [ 725.210016] [DEBUG]Using auto-detected bridge virtualgl [ 729.676402] [INFO]Response: No - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please

[ 729.676462] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please

[ 729.676489] [DEBUG]Socket closed. [ 729.676551] [ERROR]Aborting because fallback start is disabled. [ 729.676577] [DEBUG]Killing all remaining processes.

I shall now try without the acpi_osi=!Windows 2009 kernel load command.

wrthissell commented 5 years ago

Dear gsgatlin, I forgot to post the dmesg command:

[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo dmesg | grep -C 10 bbswitch
[sudo] password for wrthissell:
[ 4.486811] SELinux: 2048 avtab hash slots, 106550 rules.
[ 4.507958] SELinux: 8 users, 14 roles, 5009 types, 310 bools, 1 sens, 1024 cats
[ 4.507960] SELinux: 97 classes, 106550 rules
[ 4.511530] SELinux: Completing initialization.
[ 4.511531] SELinux: Setting up existing superblocks.
[ 4.524539] systemd[1]: Successfully loaded SELinux policy in 111.107ms.
[ 4.533079] systemd[1]: RTC configured in localtime, applying delta of -240 minutes to system time.
[ 4.544290] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 4.544315] systemd[1]: Inserted module 'iptables'
[ 4.560576] systemd[1]: Relabelled /dev and /run in 14.812ms.
[ 4.719312] bbswitch: loading out-of-tree module taints kernel.
[ 4.719387] bbswitch: module verification failed: signature and/or required key missing - tainting kernel
[ 4.719691] bbswitch: version 0.8
[ 4.719696] bbswitch: Found integrated VGA device 0000:00:02.0: _SB.PCI0.GFX0
[ 4.719701] bbswitch: Found discrete VGA device 0000:01:00.0: _SB.PCI0.PEG0.PEGP
[ 4.719710] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 4.719844] bbswitch: detected an Optimus _DSM function
[ 4.719859] pci 0000:01:00.0: enabling device (0000 -> 0003)
[ 4.719918] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on
[ 4.735530] systemd-journald[603]: Received request to flush runtime journal from PID 1
[ 4.865536] random: crng init done
[ 4.963129] resource sanity check: requesting [mem 0xfed10000-0xfed17fff], which spans more than pnp 00:01 [mem 0xfed10000-0xfed13fff]
[ 4.963134] caller ie31200_init_one+0x100/0x580 [ie31200_edac] mapping multiple BARs
[ 4.963364] EDAC MC0: Giving out device to 'ie31200_edac' 'IE31200': DEV 0000:00:00.0
[ 4.964079] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 4.967541] i801_smbus 0000:00:1f.4: enabling device (0000 -> 0003)
[ 4.967689] i801_smbus 0000:00:1f.4: SMBus using PCI interrupt
[ 4.969882] thinkpad_acpi: ThinkPad ACPI Extras v0.25
[ 4.969884] thinkpad_acpi: http://ibm-acpi.sf.net/

[ 6.261541] type=1130 audit(1533166852.982:81): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-tmpfiles-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 6.276031] type=1305 audit(1533166852.996:82): audit_enabled=1 old=1 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1
[ 6.276042] type=1305 audit(1533166852.996:83): audit_pid=1076 old=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1
[ 6.290044] RPC: Registered named UNIX socket transport module.
[ 6.290046] RPC: Registered udp transport module.
[ 6.290047] RPC: Registered tcp transport module.
[ 6.290048] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 6.587969] [drm] [nvidia-drm] [GPU ID 0x00000100] Unloading driver
[ 6.588871] nvidia-modeset: Unloading
[ 6.589491] nvidia-nvlink: Unregistered the Nvlink Core, major device number 239
[ 6.591608] bbswitch: disabling discrete graphics
[ 6.591619] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 6.608262] pci 0000:01:00.0: Refused to change power state, currently in D0
[ 6.725980] psmouse serio2: trackpoint: IBM TrackPoint firmware: 0x0e, buttons: 3/3
[ 6.920695] input: TPPS/2 IBM TrackPoint as /devices/platform/i8042/serio1/serio2/input/input14
[ 6.954090] ip6_tables: (C) 2000-2006 Netfilter Core Team
[ 6.986252] Ebtables v2.0 registered
[ 7.038494] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
[ 7.104495] IPv6: ADDRCONF(NETDEV_UP): enp0s31f6: link is not ready
[ 7.272582] e1000e 0000:00:1f.6: irq 132 for MSI/MSI-X
[ 7.373495] e1000e 0000:00:1f.6: irq 132 for MSI/MSI-X

[ 22.019120] tun: (C) 1999-2004 Max Krasnyansky maxk@qualcomm.com
[ 22.020833] virbr0: port 1(virbr0-nic) entered blocking state
[ 22.020838] virbr0: port 1(virbr0-nic) entered disabled state
[ 22.020905] device virbr0-nic entered promiscuous mode
[ 22.183549] virbr0: port 1(virbr0-nic) entered blocking state
[ 22.183554] virbr0: port 1(virbr0-nic) entered listening state
[ 22.183604] IPv6: ADDRCONF(NETDEV_UP): virbr0: link is not ready
[ 22.282710] virbr0: port 1(virbr0-nic) entered disabled state
[ 35.693634] fuse init (API version 7.22)
[ 39.955306] rfkill: input handler disabled
[ 725.182639] bbswitch: enabling discrete graphics
[ 725.494502] nvidia-nvlink: Nvlink Core is being initialized, major device number 239
[ 725.494690] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=none
[ 725.494756] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 390.77 Tue Jul 10 18:28:52 PDT 2018 (using threaded interrupts)
[ 725.538807] nvidia 0000:01:00.0: irq 144 for MSI/MSI-X
[ 725.557139] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 725.557226] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 725.557276] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 725.557328] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 725.557380] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 725.557464] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)

wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your assistance. Here are the results with the new build without any kernel special load commands:

[wrthissell@LAPTOP-BKIJEPGK ~]$ bumblebee-nvidia --check

nvidia.ko compiled into in the kernel tree ok. modinfo output for NVIDIA:

filename: /lib/modules/3.10.0-862.9.1.el7.x86_64/kernel/drivers/video/nvidia.ko alias: char-major-195- version: 390.77 supported: external license: NVIDIA retpoline: Y rhelversion: 7.5 srcversion: 209B1D0CB123DE466F700AD alias: pci:v000010DEd00000E00svsdbc04sc80i00 alias: pci:v000010DEdsvsdbc03sc02i00 alias: pci:v000010DEdsvsdbc03sc00i00 depends: ipmi_msghandler,i2c-core vermagic: 3.10.0-862.9.1.el7.x86_64 SMP mod_unload modversions parm: NVreg_Mobile:int parm: NVreg_ResmanDebugLevel:int parm: NVreg_RmLogonRC:int parm: NVreg_ModifyDeviceFiles:int parm: NVreg_DeviceFileUID:int parm: NVreg_DeviceFileGID:int parm: NVreg_DeviceFileMode:int parm: NVreg_UpdateMemoryTypes:int parm: NVreg_InitializeSystemMemoryAllocations:int parm: NVreg_UsePageAttributeTable:int parm: NVreg_MapRegistersEarly:int parm: NVreg_RegisterForACPIEvents:int parm: NVreg_CheckPCIConfigSpace:int parm: NVreg_EnablePCIeGen3:int parm: NVreg_EnableMSI:int parm: NVreg_TCEBypassMode:int parm: NVreg_UseThreadedInterrupts:int parm: NVreg_EnableStreamMemOPs:int parm: NVreg_EnableBacklightHandler:int parm: NVreg_EnableUserNUMAManagement:int parm: NVreg_EnableIBMNPURelaxedOrderingMode:int parm: NVreg_MemoryPoolSize:int parm: NVreg_IgnoreMMIOCheck:int parm: NVreg_RegistryDwords:charp parm: NVreg_RegistryDwordsPerDevice:charp parm: NVreg_RmMsg:charp parm: NVreg_AssignGpus:charp

Check bbswitch kernel module...

bbswitch is loaded into the current kernel ok.

All checks completed successfully! NVIDIA driver appears to have compiled ok.

Documentation on bumblebee for RHEL / CentOS / fedora can be found at: https://www.linux.ncsu.edu/bumblebee/

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun --debug glxgears [ 138.941362] [DEBUG]optirun version 3.2.1 starting... [ 138.941394] [DEBUG]Active configuration: [ 138.941414] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf [ 138.941431] [DEBUG] X display: :8 [ 138.941447] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib [ 138.941469] [DEBUG] Socket path: /var/run/bumblebee.socket [ 138.941487] [DEBUG] Accel/display bridge: auto [ 138.941502] [DEBUG] VGL Compression: proxy [ 138.941517] [DEBUG] VGLrun extra options: [ 138.941532] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus [ 138.941643] [DEBUG]Using auto-detected bridge virtualgl [ 143.503654] [INFO]Response: No - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please

[ 143.503669] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please

[ 143.503676] [DEBUG]Socket closed. [ 143.503695] [ERROR]Aborting because fallback start is disabled. [ 143.503702] [DEBUG]Killing all remaining processes. [wrthissell@LAPTOP-BKIJEPGK ~]$ sudo optirun --debug glxgears [ 150.690793] [DEBUG]optirun version 3.2.1 starting... [ 150.690824] [DEBUG]Active configuration: [ 150.690834] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf [ 150.690846] [DEBUG] X display: :8 [ 150.690860] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib [ 150.690873] [DEBUG] Socket path: /var/run/bumblebee.socket [ 150.690885] [DEBUG] Accel/display bridge: auto [ 150.690896] [DEBUG] VGL Compression: proxy [ 150.690905] [DEBUG] VGLrun extra options: [ 150.690917] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus [ 150.690981] [DEBUG]Using auto-detected bridge virtualgl [ 155.028500] [INFO]Response: No - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please

[ 155.028550] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please

[ 155.028577] [DEBUG]Socket closed.
[ 155.028654] [ERROR]Aborting because fallback start is disabled.
[ 155.028679] [DEBUG]Killing all remaining processes.
[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo dmesg | grep -C 10 bbswitch
[ 4.739539] SELinux: 2048 avtab hash slots, 106550 rules.
[ 4.760287] SELinux: 8 users, 14 roles, 5009 types, 310 bools, 1 sens, 1024 cats
[ 4.760290] SELinux: 97 classes, 106550 rules
[ 4.763858] SELinux: Completing initialization.
[ 4.763859] SELinux: Setting up existing superblocks.
[ 4.776790] systemd[1]: Successfully loaded SELinux policy in 98.988ms.
[ 4.785304] systemd[1]: RTC configured in localtime, applying delta of -240 minutes to system time.
[ 4.796376] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 4.796433] systemd[1]: Inserted module 'iptables'
[ 4.813495] systemd[1]: Relabelled /dev and /run in 15.705ms.
[ 4.973328] bbswitch: loading out-of-tree module taints kernel.
[ 4.973410] bbswitch: module verification failed: signature and/or required key missing - tainting kernel
[ 4.973658] bbswitch: version 0.8
[ 4.973663] bbswitch: Found integrated VGA device 0000:00:02.0: _SB.PCI0.GFX0
[ 4.973669] bbswitch: Found discrete VGA device 0000:01:00.0: _SB.PCI0.PEG0.PEGP
[ 4.973677] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 4.973805] bbswitch: detected an Optimus _DSM function
[ 4.973817] pci 0000:01:00.0: enabling device (0000 -> 0003)
[ 4.973869] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on
[ 4.994631] systemd-journald[605]: Received request to flush runtime journal from PID 1
[ 5.098497] random: crng init done
[ 5.223917] resource sanity check: requesting [mem 0xfed10000-0xfed17fff], which spans more than pnp 00:01 [mem 0xfed10000-0xfed13fff]
[ 5.223921] caller ie31200_init_one+0x100/0x580 [ie31200_edac] mapping multiple BARs
[ 5.224176] EDAC MC0: Giving out device to 'ie31200_edac' 'IE31200': DEV 0000:00:00.0
[ 5.225377] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 5.226898] i801_smbus 0000:00:1f.4: enabling device (0000 -> 0003)
[ 5.227035] i801_smbus 0000:00:1f.4: SMBus using PCI interrupt
[ 5.231212] thinkpad_acpi: ThinkPad ACPI Extras v0.25
[ 5.231215] thinkpad_acpi: http://ibm-acpi.sf.net/

[ 5.648697] iwlwifi 0000:04:00.0: base HW address: 34:f3:9a:47:71:4c
[ 5.728653] ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
[ 5.729036] thermal thermal_zone1: failed to read out thermal zone 1
[ 5.757371] RPC: Registered named UNIX socket transport module.
[ 5.757373] RPC: Registered udp transport module.
[ 5.757375] RPC: Registered tcp transport module.
[ 5.757376] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 6.053267] [drm] [nvidia-drm] [GPU ID 0x00000100] Unloading driver
[ 6.074111] nvidia-modeset: Unloading
[ 6.075703] nvidia-nvlink: Unregistered the Nvlink Core, major device number 239
[ 6.082011] bbswitch: disabling discrete graphics
[ 6.082041] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 6.369885] ip6_tables: (C) 2000-2006 Netfilter Core Team
[ 6.399825] Ebtables v2.0 registered
[ 6.452471] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
[ 6.511870] IPv6: ADDRCONF(NETDEV_UP): enp0s31f6: link is not ready
[ 6.582818] psmouse serio2: trackpoint: IBM TrackPoint firmware: 0x0e, buttons: 3/3
[ 6.604800] e1000e 0000:00:1f.6: irq 132 for MSI/MSI-X
[ 6.705572] e1000e 0000:00:1f.6: irq 132 for MSI/MSI-X
[ 6.705745] IPv6: ADDRCONF(NETDEV_UP): enp0s31f6: link is not ready
[ 6.718205] IPv6: ADDRCONF(NETDEV_UP): wlp4s0: link is not ready

[ 16.310070] tun: (C) 1999-2004 Max Krasnyansky maxk@qualcomm.com
[ 16.312966] virbr0: port 1(virbr0-nic) entered blocking state
[ 16.312969] virbr0: port 1(virbr0-nic) entered disabled state
[ 16.313023] device virbr0-nic entered promiscuous mode
[ 16.433707] virbr0: port 1(virbr0-nic) entered blocking state
[ 16.433711] virbr0: port 1(virbr0-nic) entered listening state
[ 16.433772] IPv6: ADDRCONF(NETDEV_UP): virbr0: link is not ready
[ 16.518152] virbr0: port 1(virbr0-nic) entered disabled state
[ 32.025299] fuse init (API version 7.22)
[ 35.902551] rfkill: input handler disabled
[ 139.141483] bbswitch: enabling discrete graphics
[ 139.452242] nvidia-nvlink: Nvlink Core is being initialized, major device number 239
[ 139.452435] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=none
[ 139.452500] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 390.77 Tue Jul 10 18:28:52 PDT 2018 (using threaded interrupts)
[ 139.494217] nvidia 0000:01:00.0: irq 144 for MSI/MSI-X
[ 139.515138] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 139.515217] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 139.515288] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 139.515353] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 139.515442] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
[ 139.515551] ACPI Warning: _SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)

Below are the relevant log files: Xorg.8.log Xorg.0.log

Does anyone on your campus use CentOS 7.5 on a Lenovo ThinkPad P50 with the GPU option, whose machine could be used for experiments? If not, I have installed the debuginfo rpm. Could you provide some guidance on how to properly debug a kernel driver, so that I may try it and report the findings?

Thank you again for your assistance.

gsgatlin commented 5 years ago

I will ask on a mailing list we have whether anyone has a Lenovo ThinkPad P50 I could borrow for a few hours. It seems unlikely, because there is a list of approved computers they usually buy, but you never know until you ask...

Here are some other ideas we can try.

Just out of curiosity, if you run:

cat /proc/acpi/bbswitch

do you get

0000:01:00.0 OFF

?

Just wanted to see whether it's at least powering off the card or keeping it on all the time, or whether the virtual file is even there. We know you can't play 3D video games, since optirun (optirun -b virtualgl) and primus are not working, but does it at least save power and improve battery life? (nouveau would likely also do that on CentOS 7, but not CentOS 6, unless your discrete graphics is too new for Red Hat to have backported that code to the 3.10 kernel yet.)

Does

optirun -b primus glxgears -info

Give similar errors? How about

primusrun glxgears -info

?

lsmod | grep nouveau

should return nothing (no output) if it's been successfully blacklisted on the kernel command line. I think it will be the same, but I just wanted to check. If nouveau shows up in the output from lsmod, that would be why you see the "Failed to initialize the NVIDIA GPU" message.
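The nouveau check described here can be scripted. The following is only a minimal sketch in POSIX shell: the function name `nouveau_loaded` is made up for illustration and is not part of Bumblebee, and it reads `lsmod`-style output on stdin so the logic can be tried without touching the real module list.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bumblebee): succeeds if a module named
# exactly "nouveau" appears in the first column of lsmod-style output
# read from stdin.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Typical use on a live system:
#   lsmod | nouveau_loaded && echo "nouveau is loaded; it should be blacklisted"
```

Matching on the first column rather than using a plain `grep nouveau` avoids false positives from other modules that merely list nouveau in their "used by" column.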

I don't have a CentOS 7 box in front of me at the moment. (I left it at work today; I use Fedora 27 on this box.)

But as root on my thinkpad t540p fedora box:

[root@t540p ~]# modprobe --verbose nvidia
insmod /lib/modules/4.17.9-100.fc27.x86_64/kernel/drivers/video/nvidia.ko
[root@t540p ~]# cat /proc/acpi/bbswitch
0000:01:00.0 ON

[gsgatlin@t540p ~]$ lsmod | grep nvidia
nvidia 14397440 0
ipmi_msghandler 57344 2 ipmi_devintf,nvidia

[root@t540p ~]# modprobe -r nvidia
[root@t540p ~]# systemctl restart bumblebeed
[root@t540p ~]# cat /proc/acpi/bbswitch
0000:01:00.0 OFF

[gsgatlin@t540p ~]$ lsmod | grep nvidia
[gsgatlin@t540p ~]$

I wonder how your machine will behave with similar commands?

At least it's giving you the actual problem now instead of a bogus error: "Response: No - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please"

As for why it's failing to "initialize the NVIDIA GPU", that I'm not sure about. I can't find much about that particular error message. I will probably make bumblebee-3.2.1-14 the default version in my yum repo, since the error messages are clearer. It might save some people (and myself, if I ever get a new laptop) some trouble when troubleshooting these kinds of problems...

I will see if I can build the "develop" branches of bumblebee and/or bbswitch tomorrow if there is time. I will make a second branch in my github repository for that. Maybe those "develop branch" rpms could help?

I think the debuginfo rpms are only useful if it segfaults or similar, which I gather it's not doing on your machine. That is because the binaries in the rpms are stripped.

It may be that my "bumblebee-nvidia --check" section could do other checks on the nvidia module that are not clear to me at the moment, perhaps something that could show a problem on your laptop. For example, maybe the elrepo nvidia driver did not get completely removed somehow. I will give it further thought.

gsgatlin commented 5 years ago

A couple more things I thought of, one of which burned me recently in a desktop setup with a network home directory.

Do you have a ".nvidia-settings-rc" file and/or an nv/ directory in your $HOME directory? If so, you may want to delete those.

Can you pastebin your

/etc/bumblebee/xorg.conf.nvidia

file?

In the Xorg.8.log.txt file you uploaded I didn't see Option "IgnoreABI" "1"

so I wonder if the config file is different than mine...

[147115.312] (**) Option "IgnoreABI" "1"
[147115.312] (**) Option "AutoAddDevices" "false"
[147115.312] (**) Option "AutoAddGPU" "false"
[147115.312] (**) Ignoring ABI Version
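Checking a log for that option can be done with a one-line search. This is a rough sketch only: the function name `log_mentions_ignoreabi` is hypothetical, and the log path in the usage comment assumes the secondary X server writes to /var/log/Xorg.8.log as mentioned earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bumblebee): succeeds if the given Xorg log
# file mentions the IgnoreABI option anywhere.
log_mentions_ignoreabi() {
    grep -q 'IgnoreABI' "$1"
}

# Example, assuming the secondary X server logs to /var/log/Xorg.8.log:
#   log_mentions_ignoreabi /var/log/Xorg.8.log || echo "IgnoreABI not seen in log"
```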

wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your assistance. Here are results of your latest guidance:

[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun -b primus glxgears -info
[40431.612917] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please

[40431.612953] [ERROR]Aborting because fallback start is disabled.

[wrthissell@LAPTOP-BKIJEPGK ~]$ primusrun glxgears -info
primus: fatal: Bumblebee daemon reported: error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please

[wrthissell@LAPTOP-BKIJEPGK ~]$ lsmod | grep nouveau
[wrthissell@LAPTOP-BKIJEPGK ~]$

[wrthissell@LAPTOP-BKIJEPGK ~]$ modprobe --verbose nvidia
[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo modprobe --verbose nvidia
[sudo] password for wrthissell:
[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON
[wrthissell@LAPTOP-BKIJEPGK ~]$ lsmod | grep nvidia
nvidia 14368630 0
ipmi_msghandler 46608 2 ipmi_devintf,nvidia
i2c_core 63151 7 drm,i915,i2c_i801,drm_kms_helper,i2c_algo_bit,nvidia,videodev

I do not have ".nvidia-settings-rc" or nv/ directory in my $HOME directory.

Below is my /etc/bumblebee/xorg.conf.nvidia:

20180802 xorg.conf.nvidia.txt

The date of the above file is yesterday, which is when I executed the yum install for your experimental bumblebee rpm.

Below is the current boot Xorg.8.log file:

20180802 Xorg.8.log.txt

It has the following line:

[ 40555.271] (**) IgnoreABIOption "AutoAddDevices" "false"

I also noticed the following line:

[ 40555.270] (==) Using system config directory "/usr/share/X11/xorg.conf.d"

Below is a zip file of this directory:

20180802 xorg.conf.d.zip

Thank you again for your assistance.

gsgatlin commented 5 years ago

So it appears to me that your bbswitch module is not working. It seems the card always stays on.

Perhaps some kernel command line would help. If so I am utterly clueless about what that might be. It seems like you've tried them all.

What happens if you run:

tee /proc/acpi/bbswitch <<<OFF

as root? Does that change the status?

You can check with

cat /proc/acpi/bbswitch

(Either it will show on or off)
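The write-then-read-back check can be wrapped in a small helper. This is a sketch under stated assumptions, not an official bbswitch tool: the function name `bbswitch_state` is invented here, and it assumes the status line has the two-field form seen in this thread (PCI address, then ON or OFF).

```shell
#!/bin/sh
# Hypothetical helper: print the power state (second field) from a bbswitch
# status line such as "0000:01:00.0 ON", read from stdin.
bbswitch_state() {
    awk '{ print $2; exit }'
}

# Typical use on a live system (the write needs root):
#   echo OFF | sudo tee /proc/acpi/bbswitch
#   [ "$(bbswitch_state < /proc/acpi/bbswitch)" = "OFF" ] || echo "card is still on"
```

Reading the state back matters because the write itself can appear to succeed (tee echoes OFF) even when the card refuses to power down.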

The "develop" branch of bbswitch is not very different from the master branch. We are already patching the differences into that version on Fedora with bbswitch-412.patch. On RHEL / CentOS kernels that patch is not needed, because they use an older kernel with features backported from newer kernels.

There are a couple of branches that were last worked on about 2 years ago. "pm-rework" and "acpi-pr3"

So I tried to install the "pm-rework" branch of https://github.com/Bumblebee-Project/bbswitch/tree/pm-rework but I get this error while building on CentOS 7:

https://paste.fedoraproject.org/paste/F1B2wZN-XvJw0dVCVeXY7w

Using a version number I made up, bbswitch-dkms-0.8.5-1.el7.x86_64.rpm

I also tried "acpi-pr3" branch and that actually compiled and loaded ok. It appears to work on my lenovo ideapad y470.

Using a version number I made up, bbswitch-dkms-0.8.6-1.el7.x86_64.rpm

I'm not sure any of the other branches would be worth building since they are 5 and 7 years old.

In the interest of full disclosure, I have uploaded both sets of packages to:

(not working for me, pm-rework) https://linux.itecs.ncsu.edu/redhat/public/bumblebee/experimental2/pm-rework/

(appears to work for me, acpi-pr3) https://linux.itecs.ncsu.edu/redhat/public/bumblebee/experimental2/acpi-pr3/

So I guess you could give those a try...

If none of this helps, you may wish to open a new issue report on https://github.com/Bumblebee-Project/bbswitch; perhaps the author, @Lekensteyn, might have some ideas. If so, reference this issue so they can see what has been found out so far.

On the bumblebee side there is an "xpra-backend" branch that was updated 23 days ago by @Thulinma, but I don't think it would be useful to you, since it simply adds xpra support and the xpra package is not available for CentOS 7. (It is a first attempt at supporting Vulkan API games.) If the bumblebee developers get that sorted out, I was going to look into how hard it might be to build xpra for CentOS 7, add it to my yum repo, and patch the bumblebee rpm as well.

Also on the bumblebee side there is a "develop" branch which might be useful... ? I think more work would be needed to get this working on CentOS 6 / RHEL 6 so I will not be able to copy those rpms. It could probably be made to work I'm guessing.... Right now it will not even compile.

It worked on CentOS 7, but I had to disable SELinux enforcement. So clearly more work would need to be done on the SELinux policy module in the bumblebee-nvidia package to make it work with SELinux in enforcing mode on RHEL 7 and Fedora. I will try to look into that when I have more time. (For now, set SELINUX=permissive in /etc/sysconfig/selinux.)
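Switching to permissive mode might look roughly like the following. This is a sketch only: the function name `selinux_permissive` is hypothetical, and it merely rewrites config text from stdin to stdout rather than editing any file, so the result can be reviewed before it is copied into place.

```shell
#!/bin/sh
# Hypothetical helper: print SELinux config text with the SELINUX= line set
# to permissive. Reads stdin, writes stdout; it does not modify any file.
selinux_permissive() {
    sed 's/^SELINUX=.*/SELINUX=permissive/'
}

# Preview the change for the config file named in this thread:
#   selinux_permissive < /etc/sysconfig/selinux
#
# For the running system only (reverts at the next reboot):
#   sudo setenforce 0 && getenforce
```

The pattern is anchored on `SELINUX=` so that the separate `SELINUXTYPE=` line is left alone.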

Using a version number I made up, bumblebee-3.3.0-1.el7.x86_64.rpm

https://linux.itecs.ncsu.edu/redhat/public/bumblebee/experimental2/develop/

So I guess you could give that a try...

Perhaps it could help?

Your 20180802.xorg.conf.nvidia.txt file looks fine since I forgot I only patch that on fedora, not RHEL or CentOS.

wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your assistance. Here are the results of your recommended commands with the version you built a couple of days ago:

[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON
[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo tee /proc/acpi/bbswitch <<<OFF
OFF
[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON

It appears to me that I am unable to turn off the GPU. I am installing bbswitch-dkms-0.8.5-1.el7.x86_64.rpm now, and I will report the results in a new comment after it finishes and I reboot.

wrthissell commented 5 years ago

Dear gsgatlin, Here are the results from the above-mentioned bbswitch version install, after setting SELinux to permissive:

[wrthissell@LAPTOP-BKIJEPGK ~]$ bumblebee-nvidia --check

nvidia.ko compiled into in the kernel tree ok. modinfo output for NVIDIA:

filename: /lib/modules/3.10.0-862.9.1.el7.x86_64/kernel/drivers/video/nvidia.ko alias: char-major-195- version: 390.77 supported: external license: NVIDIA retpoline: Y rhelversion: 7.5 srcversion: 209B1D0CB123DE466F700AD alias: pci:v000010DEd00000E00svsdbc04sc80i00 alias: pci:v000010DEdsvsdbc03sc02i00 alias: pci:v000010DEdsvsdbc03sc00i00 depends: ipmi_msghandler,i2c-core vermagic: 3.10.0-862.9.1.el7.x86_64 SMP mod_unload modversions parm: NVreg_Mobile:int parm: NVreg_ResmanDebugLevel:int parm: NVreg_RmLogonRC:int parm: NVreg_ModifyDeviceFiles:int parm: NVreg_DeviceFileUID:int parm: NVreg_DeviceFileGID:int parm: NVreg_DeviceFileMode:int parm: NVreg_UpdateMemoryTypes:int parm: NVreg_InitializeSystemMemoryAllocations:int parm: NVreg_UsePageAttributeTable:int parm: NVreg_MapRegistersEarly:int parm: NVreg_RegisterForACPIEvents:int parm: NVreg_CheckPCIConfigSpace:int parm: NVreg_EnablePCIeGen3:int parm: NVreg_EnableMSI:int parm: NVreg_TCEBypassMode:int parm: NVreg_UseThreadedInterrupts:int parm: NVreg_EnableStreamMemOPs:int parm: NVreg_EnableBacklightHandler:int parm: NVreg_EnableUserNUMAManagement:int parm: NVreg_EnableIBMNPURelaxedOrderingMode:int parm: NVreg_MemoryPoolSize:int parm: NVreg_IgnoreMMIOCheck:int parm: NVreg_RegistryDwords:charp parm: NVreg_RegistryDwordsPerDevice:charp parm: NVreg_RmMsg:charp parm: NVreg_AssignGpus:charp

Check bbswitch kernel module...

bbswitch is loaded into the current kernel ok.

All checks completed successfully! NVIDIA driver appears to have compiled ok.

Documentation on bumblebee for RHEL / CentOS / fedora can be found at: https://www.linux.ncsu.edu/bumblebee/

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun --debug glxgears [ 707.181975] [DEBUG]optirun version 3.2.1 starting... [ 707.182007] [DEBUG]Active configuration: [ 707.182016] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf [ 707.182025] [DEBUG] X display: :8 [ 707.182041] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib [ 707.182050] [DEBUG] Socket path: /var/run/bumblebee.socket [ 707.182062] [DEBUG] Accel/display bridge: auto [ 707.182075] [DEBUG] VGL Compression: proxy [ 707.182084] [DEBUG] VGLrun extra options: [ 707.182092] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus [ 707.182197] [DEBUG]Using auto-detected bridge virtualgl [ 712.319627] [INFO]Response: No - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please

[ 712.319668] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please

[ 712.319688] [DEBUG]Socket closed. [ 712.319744] [ERROR]Aborting because fallback start is disabled. [ 712.319762] [DEBUG]Killing all remaining processes. [wrthissell@LAPTOP-BKIJEPGK ~]$ optirun -b primus glxgears -info [ 738.304601] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please

[ 738.304649] [ERROR]Aborting because fallback start is disabled.
[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON
[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo tee /proc/acpi/bbswitch <<<OFF
OFF
[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON

 Below are the log files for this installation:

20180803 Xorg.8.log.txt 20180803 Xorg.0.log.txt

 I shall now install the bbswitch-dkms-0.8.6-1.el7.x86_64.rpm, reboot, and try again.
 Thank you again for your assistance.
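
The session above shows the write to /proc/acpi/bbswitch being accepted while the card stays ON. A minimal sketch of that check as a script follows; the helper names and the BBSWITCH_FILE override are my own assumptions, not part of bbswitch, and the write needs root.

```shell
#!/bin/sh
# Sketch: request a bbswitch power state and confirm the card really changed.
# BBSWITCH_FILE is a hypothetical override so the parser can be tried on a copy.
BBSWITCH_FILE="${BBSWITCH_FILE:-/proc/acpi/bbswitch}"

# Print the state field (ON or OFF) from a line such as "0000:01:00.0 ON".
bbswitch_state() {
    awk 'NR == 1 { print $2 }' "$BBSWITCH_FILE"
}

# Write the requested state (run as root), then compare it with what the
# kernel reports back. Returns 0 if the switch took effect, non-zero if the
# card stayed in its previous state.
bbswitch_set() {
    wanted="$1"
    printf '%s\n' "$wanted" > "$BBSWITCH_FILE"
    [ "$(bbswitch_state)" = "$wanted" ]
}
```

Run as root, `bbswitch_set OFF || dmesg | grep bbswitch` would show any kernel messages bbswitch logged when the request did not take effect.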
wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your assistance. Here are the results from the above mentioned bbswitch rpm: [wrthissell@LAPTOP-BKIJEPGK ~]$ bumblebee-nvidia --check

nvidia.ko compiled into in the kernel tree ok. modinfo output for NVIDIA:

filename: /lib/modules/3.10.0-862.9.1.el7.x86_64/kernel/drivers/video/nvidia.ko alias: char-major-195- version: 390.77 supported: external license: NVIDIA retpoline: Y rhelversion: 7.5 srcversion: 209B1D0CB123DE466F700AD alias: pci:v000010DEd00000E00svsdbc04sc80i00 alias: pci:v000010DEdsvsdbc03sc02i00 alias: pci:v000010DEdsvsdbc03sc00i00 depends: ipmi_msghandler,i2c-core vermagic: 3.10.0-862.9.1.el7.x86_64 SMP mod_unload modversions parm: NVreg_Mobile:int parm: NVreg_ResmanDebugLevel:int parm: NVreg_RmLogonRC:int parm: NVreg_ModifyDeviceFiles:int parm: NVreg_DeviceFileUID:int parm: NVreg_DeviceFileGID:int parm: NVreg_DeviceFileMode:int parm: NVreg_UpdateMemoryTypes:int parm: NVreg_InitializeSystemMemoryAllocations:int parm: NVreg_UsePageAttributeTable:int parm: NVreg_MapRegistersEarly:int parm: NVreg_RegisterForACPIEvents:int parm: NVreg_CheckPCIConfigSpace:int parm: NVreg_EnablePCIeGen3:int parm: NVreg_EnableMSI:int parm: NVreg_TCEBypassMode:int parm: NVreg_UseThreadedInterrupts:int parm: NVreg_EnableStreamMemOPs:int parm: NVreg_EnableBacklightHandler:int parm: NVreg_EnableUserNUMAManagement:int parm: NVreg_EnableIBMNPURelaxedOrderingMode:int parm: NVreg_MemoryPoolSize:int parm: NVreg_IgnoreMMIOCheck:int parm: NVreg_RegistryDwords:charp parm: NVreg_RegistryDwordsPerDevice:charp parm: NVreg_RmMsg:charp parm: NVreg_AssignGpus:charp

Check bbswitch kernel module...

bbswitch is loaded into the current kernel ok.

All checks completed successfully! NVIDIA driver appears to have compiled ok.

Documentation on bumblebee for RHEL / CentOS / fedora can be found at: https://www.linux.ncsu.edu/bumblebee/

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun --debug glxgears
[ 198.947153] [DEBUG]optirun version 3.2.1 starting...
[ 198.947174] [DEBUG]Active configuration:
[ 198.947179] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[ 198.947183] [DEBUG] X display: :8
[ 198.947190] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib
[ 198.947201] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 198.947206] [DEBUG] Accel/display bridge: auto
[ 198.947210] [DEBUG] VGL Compression: proxy
[ 198.947214] [DEBUG] VGLrun extra options:
[ 198.947218] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus
[ 198.947325] [DEBUG]Using auto-detected bridge virtualgl
[ 203.219371] [INFO]Response: No - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 203.219397] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 203.219408] [DEBUG]Socket closed.
[ 203.219450] [ERROR]Aborting because fallback start is disabled.
[ 203.219460] [DEBUG]Killing all remaining processes.
[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun -b primus glxgears -info
[ 230.774879] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 230.774937] [ERROR]Aborting because fallback start is disabled.
[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON
[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo tee /proc/acpi/bbswitch <<<OFF
OFF
[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON
[wrthissell@LAPTOP-BKIJEPGK ~]$

Below are the relevant log files:

20180803 Xorg.8.log#2.txt 20180803 Xorg.0.log#2.txt

I shall now try to install bumblebee-3.3.0-1.el7.x86_64.rpm, reboot and try again.

  Thank you again for your assistance.
wrthissell commented 5 years ago

Dear gsgatlin, I am executing dracut --regenerate-all --force prior to reboot.

wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your assistance. Here are the results after installing the above mentioned bumblebee rpm: [wrthissell@LAPTOP-BKIJEPGK ~]$ bumblebee-nvidia --check

nvidia.ko compiled into in the kernel tree ok. modinfo output for NVIDIA:

filename: /lib/modules/3.10.0-862.9.1.el7.x86_64/kernel/drivers/video/nvidia.ko alias: char-major-195- version: 390.77 supported: external license: NVIDIA retpoline: Y rhelversion: 7.5 srcversion: 209B1D0CB123DE466F700AD alias: pci:v000010DEd00000E00svsdbc04sc80i00 alias: pci:v000010DEdsvsdbc03sc02i00 alias: pci:v000010DEdsvsdbc03sc00i00 depends: ipmi_msghandler,i2c-core vermagic: 3.10.0-862.9.1.el7.x86_64 SMP mod_unload modversions parm: NVreg_Mobile:int parm: NVreg_ResmanDebugLevel:int parm: NVreg_RmLogonRC:int parm: NVreg_ModifyDeviceFiles:int parm: NVreg_DeviceFileUID:int parm: NVreg_DeviceFileGID:int parm: NVreg_DeviceFileMode:int parm: NVreg_UpdateMemoryTypes:int parm: NVreg_InitializeSystemMemoryAllocations:int parm: NVreg_UsePageAttributeTable:int parm: NVreg_MapRegistersEarly:int parm: NVreg_RegisterForACPIEvents:int parm: NVreg_CheckPCIConfigSpace:int parm: NVreg_EnablePCIeGen3:int parm: NVreg_EnableMSI:int parm: NVreg_TCEBypassMode:int parm: NVreg_UseThreadedInterrupts:int parm: NVreg_EnableStreamMemOPs:int parm: NVreg_EnableBacklightHandler:int parm: NVreg_EnableUserNUMAManagement:int parm: NVreg_EnableIBMNPURelaxedOrderingMode:int parm: NVreg_MemoryPoolSize:int parm: NVreg_IgnoreMMIOCheck:int parm: NVreg_RegistryDwords:charp parm: NVreg_RegistryDwordsPerDevice:charp parm: NVreg_RmMsg:charp parm: NVreg_AssignGpus:charp

Check bbswitch kernel module...

bbswitch is loaded into the current kernel ok.

All checks completed successfully! NVIDIA driver appears to have compiled ok.

Documentation on bumblebee for RHEL / CentOS / fedora can be found at: https://www.linux.ncsu.edu/bumblebee/

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun --debug glxgears
[ 236.239915] [DEBUG]Reading file: /etc/bumblebee/bumblebee.conf
[ 236.240335] [INFO]Configured driver: nvidia
[ 236.241188] [DEBUG]optirun version 3.3.0-:%ci-:%h$ starting...
[ 236.241205] [DEBUG]Active configuration:
[ 236.241221] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[ 236.241240] [DEBUG] X display: :8
[ 236.241252] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib
[ 236.241262] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 236.241274] [DEBUG] Accel/display bridge: auto
[ 236.241286] [DEBUG] VGL Compression: proxy
[ 236.241297] [DEBUG] VGLrun extra options:
[ 236.241308] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus
[ 236.241381] [DEBUG]Using auto-detected bridge primus
[ 240.703228] [INFO]Response: No - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 240.703241] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 240.703246] [DEBUG]Socket closed.
[ 240.703259] [ERROR]Aborting because fallback start is disabled.
[ 240.703263] [DEBUG]Killing all remaining processes.
[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON
[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo tee /proc/acpi/bbswitch <<<OFF
OFF
[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON
[wrthissell@LAPTOP-BKIJEPGK ~]$

Below are the relevant log files:

20180803 Xorg.8.log#3.txt 20180803 Xorg.0.log#3.txt

  I shall now create an issue at https://github.com/Bumblebee-Project/bbswitch/issues and reference this issue.
  Thank you again for your assistance. 
wrthissell commented 5 years ago

Dear gsgatlin, I just read the following bbswitch issues:

https://github.com/Bumblebee-Project/bbswitch/issues/112
https://github.com/Bumblebee-Project/bbswitch/issues/96
https://github.com/Bumblebee-Project/bbswitch/issues/120#issuecomment-180095778
https://github.com/Bumblebee-Project/bbswitch/issues/111

I just added the following text to the kernel load command:

'acpi_osi=!Windows\x202013' acpi_osi=Linux nogpumanager intel_iommu=on

I shall reboot and see if this fixes the issue.
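
After the reboot it is worth confirming that the parameters actually reached the running kernel before judging whether they helped. A minimal sketch, assuming the parameters appear space-separated in /proc/cmdline; both helper names are hypothetical.

```shell
#!/bin/sh
# Sketch: check whether a boot parameter is present on a kernel command line.
# has_kernel_arg is a hypothetical helper; it takes the command line as text
# so it can be tried without rebooting.
has_kernel_arg() {
    cmdline="$1"
    arg="$2"
    # Pad with spaces so only whole, space-delimited parameters match.
    case " $cmdline " in
        *" $arg "*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Report which of the given parameters are active on the running kernel.
report_kernel_args() {
    cmdline="$(cat /proc/cmdline)"
    for arg in "$@"; do
        if has_kernel_arg "$cmdline" "$arg"; then
            echo "present: $arg"
        else
            echo "missing: $arg"
        fi
    done
}
```

For example, `report_kernel_args acpi_osi=Linux nogpumanager intel_iommu=on` prints one line per parameter. The quoted acpi_osi=!Windows entry is left out here because its escaping on the command line may differ from what was typed into the boot configuration.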

wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your assistance. The issue below might also be relevant:

https://github.com/Bumblebee-Project/bbswitch/issues/119

I noticed that doudou has a bbswitch version that addressed some issues:

https://github.com/doudou/bbswitch

   Are any of the changes listed above part of your bbswitch package?
   Thank you again for your assistance.
wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your assistance. The above mentioned kernel load option:

'acpi_osi=!Windows\x202013' acpi_osi=Linux nogpumanager intel_iommu=on

   did not resolve the issue:

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun --debug glxgears
[ 137.068785] [DEBUG]Reading file: /etc/bumblebee/bumblebee.conf
[ 137.070300] [INFO]Configured driver: nvidia
[ 137.071355] [DEBUG]optirun version 3.3.0-:%ci-:%h$ starting...
[ 137.071373] [DEBUG]Active configuration:
[ 137.071383] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[ 137.071393] [DEBUG] X display: :8
[ 137.071408] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib
[ 137.071420] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 137.071434] [DEBUG] Accel/display bridge: auto
[ 137.071446] [DEBUG] VGL Compression: proxy
[ 137.071459] [DEBUG] VGLrun extra options:
[ 137.071471] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus
[ 137.071545] [DEBUG]Using auto-detected bridge primus
[ 141.901752] [INFO]Response: No - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 141.901769] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 141.901775] [DEBUG]Socket closed.
[ 141.901809] [ERROR]Aborting because fallback start is disabled.
[ 141.901815] [DEBUG]Killing all remaining processes.
[wrthissell@LAPTOP-BKIJEPGK ~]$ bumblebee-nvidia --check

nvidia.ko compiled into in the kernel tree ok. modinfo output for NVIDIA:

filename: /lib/modules/3.10.0-862.9.1.el7.x86_64/kernel/drivers/video/nvidia.ko alias: char-major-195- version: 390.77 supported: external license: NVIDIA retpoline: Y rhelversion: 7.5 srcversion: 209B1D0CB123DE466F700AD alias: pci:v000010DEd00000E00svsdbc04sc80i00 alias: pci:v000010DEdsvsdbc03sc02i00 alias: pci:v000010DEdsvsdbc03sc00i00 depends: ipmi_msghandler,i2c-core vermagic: 3.10.0-862.9.1.el7.x86_64 SMP mod_unload modversions parm: NVreg_Mobile:int parm: NVreg_ResmanDebugLevel:int parm: NVreg_RmLogonRC:int parm: NVreg_ModifyDeviceFiles:int parm: NVreg_DeviceFileUID:int parm: NVreg_DeviceFileGID:int parm: NVreg_DeviceFileMode:int parm: NVreg_UpdateMemoryTypes:int parm: NVreg_InitializeSystemMemoryAllocations:int parm: NVreg_UsePageAttributeTable:int parm: NVreg_MapRegistersEarly:int parm: NVreg_RegisterForACPIEvents:int parm: NVreg_CheckPCIConfigSpace:int parm: NVreg_EnablePCIeGen3:int parm: NVreg_EnableMSI:int parm: NVreg_TCEBypassMode:int parm: NVreg_UseThreadedInterrupts:int parm: NVreg_EnableStreamMemOPs:int parm: NVreg_EnableBacklightHandler:int parm: NVreg_EnableUserNUMAManagement:int parm: NVreg_EnableIBMNPURelaxedOrderingMode:int parm: NVreg_MemoryPoolSize:int parm: NVreg_IgnoreMMIOCheck:int parm: NVreg_RegistryDwords:charp parm: NVreg_RegistryDwordsPerDevice:charp parm: NVreg_RmMsg:charp parm: NVreg_AssignGpus:charp

Check bbswitch kernel module...

bbswitch is loaded into the current kernel ok.

All checks completed successfully! NVIDIA driver appears to have compiled ok.

Documentation on bumblebee for RHEL / CentOS / fedora can be found at: https://www.linux.ncsu.edu/bumblebee/

[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON
[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo tee /proc/acpi/bbswitch <<<OFF
OFF
[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON

  Thank you again for your assistance.
wrthissell commented 5 years ago

Dear gsgatlin, Thank you for your assistance. I tried the advice at:

https://www.centos.org/forums/viewtopic.php?t=66178

after removing the kernel load option shown below and executing dracut --regenerate-all --force:

'acpi_osi=!Windows\x202013' acpi_osi=Linux nogpumanager intel_iommu=on

Here are the results (no change):

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun --debug glxgears
[ 3272.811168] [DEBUG]Reading file: /etc/bumblebee/bumblebee.conf
[ 3272.811442] [INFO]Configured driver: nvidia
[ 3272.812289] [DEBUG]optirun version 3.3.0-:%ci-:%h$ starting...
[ 3272.812306] [DEBUG]Active configuration:
[ 3272.812316] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[ 3272.812326] [DEBUG] X display: :8
[ 3272.812342] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib
[ 3272.812354] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 3272.812365] [DEBUG] Accel/display bridge: auto
[ 3272.812375] [DEBUG] VGL Compression: proxy
[ 3272.812388] [DEBUG] VGLrun extra options:
[ 3272.812399] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus
[ 3272.812643] [DEBUG]Using auto-detected bridge primus
[ 3277.503194] [INFO]Response: No - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 3277.503238] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 3277.503261] [DEBUG]Socket closed.
[ 3277.503313] [ERROR]Aborting because fallback start is disabled.
[ 3277.503331] [DEBUG]Killing all remaining processes.
[wrthissell@LAPTOP-BKIJEPGK ~]$ bumblebee-nvidia --check

nvidia.ko compiled into in the kernel tree ok. modinfo output for NVIDIA:

filename: /lib/modules/3.10.0-862.9.1.el7.x86_64/kernel/drivers/video/nvidia.ko alias: char-major-195- version: 390.77 supported: external license: NVIDIA retpoline: Y rhelversion: 7.5 srcversion: 209B1D0CB123DE466F700AD alias: pci:v000010DEd00000E00svsdbc04sc80i00 alias: pci:v000010DEdsvsdbc03sc02i00 alias: pci:v000010DEdsvsdbc03sc00i00 depends: ipmi_msghandler,i2c-core vermagic: 3.10.0-862.9.1.el7.x86_64 SMP mod_unload modversions parm: NVreg_Mobile:int parm: NVreg_ResmanDebugLevel:int parm: NVreg_RmLogonRC:int parm: NVreg_ModifyDeviceFiles:int parm: NVreg_DeviceFileUID:int parm: NVreg_DeviceFileGID:int parm: NVreg_DeviceFileMode:int parm: NVreg_UpdateMemoryTypes:int parm: NVreg_InitializeSystemMemoryAllocations:int parm: NVreg_UsePageAttributeTable:int parm: NVreg_MapRegistersEarly:int parm: NVreg_RegisterForACPIEvents:int parm: NVreg_CheckPCIConfigSpace:int parm: NVreg_EnablePCIeGen3:int parm: NVreg_EnableMSI:int parm: NVreg_TCEBypassMode:int parm: NVreg_UseThreadedInterrupts:int parm: NVreg_EnableStreamMemOPs:int parm: NVreg_EnableBacklightHandler:int parm: NVreg_EnableUserNUMAManagement:int parm: NVreg_EnableIBMNPURelaxedOrderingMode:int parm: NVreg_MemoryPoolSize:int parm: NVreg_IgnoreMMIOCheck:int parm: NVreg_RegistryDwords:charp parm: NVreg_RegistryDwordsPerDevice:charp parm: NVreg_RmMsg:charp parm: NVreg_AssignGpus:charp

Check bbswitch kernel module...

bbswitch is loaded into the current kernel ok.

All checks completed successfully! NVIDIA driver appears to have compiled ok.

Documentation on bumblebee for RHEL / CentOS / fedora can be found at: https://www.linux.ncsu.edu/bumblebee/

[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON
[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo tee /proc/acpi/bbswitch <<<OFF
[sudo] password for wrthissell:
OFF
[wrthissell@LAPTOP-BKIJEPGK ~]$ cat /proc/acpi/bbswitch
0000:01:00.0 ON

Thank you again for your assistance.
gsgatlin commented 5 years ago

I don't want to speak for anyone here but... Since there is silence... I think maybe most of the developers of this project are no longer all that interested. Plus I think a lot of them don't use bumblebee or bbswitch anymore and use nouveau drivers for power saving instead. It's hard to program something in your spare time if you don't use it any more. These are just my observations after making some bumblebee rpm packages for myself.

Maybe you could try irc.freenode.net channel bumblebee but... I'm not sure that would be any better. It's up to you.

Your best bet at this point might be to try to figure out how to make drivers at elrepo.org or rpmfusion.org or negativo17.org work if you are interested in 3d opengl gaming on Linux. I think this is where you have to reboot or at least log out and then log back in to switch drivers between intel and nvidia. I'm not really sure how it works in red hat.

Then if it doesn't work you can make an account at https://devtalk.nvidia.com/default/board/98/linux/ and post about your problem and an actual nvidia employee might be able to help. Or someone in the community. And then maybe that might help later with bumblebee but I'm not sure about that last part.

Like here are Simone Caronni's docs: (A.K.A. negativo17)

https://negativo17.org/nvidia-driver/
https://negativo17.org/category/nvidia/

I've never actually tested this yet but a lot of people like what he is doing. Look at the section called "Optimus laptops"

I'm pretty sure he supports RHEL 7 and CentOS 7 and fedora.

You might also want to check and see if some other distros have the same problem. Like you could plug in an external usb drive, boot from that, and do an install onto it to see if ubuntu or fedora or whatever distro you pick gives you better luck with bumblebee or nvidia. It would not destroy any data on your centos 7 hard drive partitions if you take care not to select the main laptop hard disk in anaconda or the ubuntu installer or whatever installer you use. Back up all data first just to be extra safe though.

Oh and also might want to upload your laptop info here?

DSDT.dsl and SSDT tables...

https://launchpad.net/~hybrid-graphics-linux
https://bugs.launchpad.net/lpbugreporter/+bug/752542

maybe no one cares about that nowadays? But I guess it would cover all bases.

Red Hat can't put any kind of nvidia drivers in any of their distros (RHEL, CentOS, fedora) because it violates their rules about no closed source software being allowed in the distro ever. So because of that, third party yum/dnf repos are your only chance to get nvidia working and it's very hit or miss with that. Some other distros like Arch or Ubuntu have no such rules so the drivers are better integrated into the distro maybe. (although ubuntu has a ppa for nvidia also so go figure...)

If all you care about is power saving maybe nouveau could work.

https://nouveau.freedesktop.org/wiki/Optimus/

Then you don't have to install anything "out of tree" and theoretically everything should just work right out of the box. Performance is not that great with nouveau PRIME but some people don't care about that and they never even use their DGPU.

Fundamentally IMHO the whole problem is caused by:

https://github.com/torvalds/linux/blob/master/Documentation/process/stable-api-nonsense.rst

Because bbswitch and nvidia modules are out-of-tree kernel modules. Like if nvidia would just follow the rules about GPL there would not be a need for bbswitch to even exist. And nvidia would just be a part of the kernel. And it would have power saving built in just like nouveau and radeon do.

I'm sorry I was not able to help figure out what was wrong here. I've never been a kernel hacker at all. I tried to help around here a little bit and give back some since they helped me so much back in the day but I am a person of limited troubleshooting ability sadly.

These are just my opinions and observations and are not a statement on behalf of bumblebee project...

wrthissell commented 5 years ago

Dear gsgatlin, Thank you very much for your assistance.
I rebuilt your bbswitch-dkms build below by adding doudou's thinkpad patch:

https://linux.itecs.ncsu.edu/redhat/public/bumblebee/experimental2/acpi-pr3/
https://github.com/doudou/bbswitch

    This particular build did not resolve the issue; however, I suspect I know why:

[wrthissell@LAPTOP-BKIJEPGK ~]$ bumblebee-nvidia --check

nvidia.ko compiled into in the kernel tree ok. modinfo output for NVIDIA:

filename: /lib/modules/3.10.0-862.9.1.el7.x86_64/kernel/drivers/video/nvidia.ko alias: char-major-195- version: 390.77 supported: external license: NVIDIA retpoline: Y rhelversion: 7.5 srcversion: 209B1D0CB123DE466F700AD alias: pci:v000010DEd00000E00svsdbc04sc80i00 alias: pci:v000010DEdsvsdbc03sc02i00 alias: pci:v000010DEdsvsdbc03sc00i00 depends: ipmi_msghandler,i2c-core vermagic: 3.10.0-862.9.1.el7.x86_64 SMP mod_unload modversions parm: NVreg_Mobile:int parm: NVreg_ResmanDebugLevel:int parm: NVreg_RmLogonRC:int parm: NVreg_ModifyDeviceFiles:int parm: NVreg_DeviceFileUID:int parm: NVreg_DeviceFileGID:int parm: NVreg_DeviceFileMode:int parm: NVreg_UpdateMemoryTypes:int parm: NVreg_InitializeSystemMemoryAllocations:int parm: NVreg_UsePageAttributeTable:int parm: NVreg_MapRegistersEarly:int parm: NVreg_RegisterForACPIEvents:int parm: NVreg_CheckPCIConfigSpace:int parm: NVreg_EnablePCIeGen3:int parm: NVreg_EnableMSI:int parm: NVreg_TCEBypassMode:int parm: NVreg_UseThreadedInterrupts:int parm: NVreg_EnableStreamMemOPs:int parm: NVreg_EnableBacklightHandler:int parm: NVreg_EnableUserNUMAManagement:int parm: NVreg_EnableIBMNPURelaxedOrderingMode:int parm: NVreg_MemoryPoolSize:int parm: NVreg_IgnoreMMIOCheck:int parm: NVreg_RegistryDwords:charp parm: NVreg_RegistryDwordsPerDevice:charp parm: NVreg_RmMsg:charp parm: NVreg_AssignGpus:charp

Check bbswitch kernel module...

bbswitch is loaded into the current kernel ok.

All checks completed successfully! NVIDIA driver appears to have compiled ok.

Documentation on bumblebee for RHEL / CentOS / fedora can be found at: https://www.linux.ncsu.edu/bumblebee/

[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun -b none nvidia-settings -c :8
[ 227.075797] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 227.075836] [ERROR]Aborting because fallback start is disabled.
[wrthissell@LAPTOP-BKIJEPGK ~]$ optirun --debug glxgears
[ 249.312869] [DEBUG]Reading file: /etc/bumblebee/bumblebee.conf
[ 249.313335] [INFO]Configured driver: nvidia
[ 249.313803] [DEBUG]optirun version 3.3.0-:%ci-:%h$ starting...
[ 249.313815] [DEBUG]Active configuration:
[ 249.313825] [DEBUG] bumblebeed config file: /etc/bumblebee/bumblebee.conf
[ 249.313834] [DEBUG] X display: :8
[ 249.313843] [DEBUG] LD_LIBRARY_PATH: /usr/lib64/nvidia-bumblebee:/usr/lib/nvidia-bumblebee:/usr/lib64:/usr/lib
[ 249.313853] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 249.313864] [DEBUG] Accel/display bridge: auto
[ 249.313874] [DEBUG] VGL Compression: proxy
[ 249.313884] [DEBUG] VGLrun extra options:
[ 249.313893] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib64/primus
[ 249.320323] [DEBUG]Using auto-detected bridge primus
[ 253.424054] [INFO]Response: No - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 253.424066] [ERROR]Cannot access secondary GPU - error: [XORG] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0. Please
[ 253.424071] [DEBUG]Socket closed.
[ 253.424087] [ERROR]Aborting because fallback start is disabled.
[ 253.424091] [DEBUG]Killing all remaining processes.
[wrthissell@LAPTOP-BKIJEPGK ~]$ yum list installed bbswitch*
Loaded plugins: fastestmirror, langpacks, nvidia
Loading mirror speeds from cached hostfile

20180812 bbswitch-0.8.7-1.tar.gz

   The archive above contains a bbswitch-thinkpad.patch file that combines the pr3 patch and doudou's patch. From the results shown above, I suspect that the two patches conflict. I shall now try adding doudou's patch to the previous bbswitch-dkms version, without the pr3 patch, and try again.
   I uploaded the required information in May 2018 (initial report in November 2017) at:

https://bugs.launchpad.net/lpbugreporter/+bug/752542

   I accomplished the required registration security process for that site this weekend.
   I am required to use Centos 7.x for this work, and it is for testing specific applications that use CUDA, hence I shall continue the debugging and searching for answers.
   Thank you again for your assistance.
wrthissell commented 5 years ago

Dear gsgatlin, Here are the current results. I started with bbswitch-dkms-0.8.0-3.el7.centos.src.rpm, added the bbswitch-thinkpad.patch attached below (in the srpm), and rebuilt the rpm. Attached is a .gz file with the source and installable rpm, the relevant log files, and the terminal test results. Note that after I installed bumblebee-3.3.0-1.el7.x86_64.rpm, the output of optirun --debug glxgears changes from "Using auto-detected bridge virtualgl" to "Using auto-detected bridge primus".
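
To compare runs before and after a package change, the bridge name can be pulled out of the debug output directly. A minimal sketch, assuming the log line keeps the wording seen in this thread; detected_bridge is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: extract the auto-detected bridge name from optirun --debug output.
# detected_bridge is a hypothetical helper that reads the log text on stdin
# and prints the first bridge name it finds (for example virtualgl or primus).
detected_bridge() {
    sed -n 's/.*Using auto-detected bridge \([A-Za-z0-9_-]*\).*/\1/p' | head -n 1
}
```

Usage would look like `optirun --debug glxgears 2>&1 | detected_bridge`, where the 2>&1 is there in case the debug messages go to standard error.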

The file below is before I installed xpra. Here is how to enable the xpra repository for Centos:

https://www.xpra.org/trac/wiki/Download#Linux

sudo yum install xpra* installs the latest version after placing the repo file where it belongs.

20180814 bbswitch-thinkpad.tar.gz

I shall now execute dracut --regenerate-all --force, reboot, and try again with xpra installed.

gsgatlin commented 5 years ago

Hmm. It seems bbswitch is still unable to turn off your discrete card even with the thinkpad patch.

wrthissell commented 5 years ago

Dear gsgatlin, Prior to reboot, I just noticed that I had some SELinux errors related to the:

bumblebee-3.3.0-1.el7.x86_64.rpm

See below for my attempt at a remedy:

[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo ausearch -c 'bumblebeed' --raw | audit2allow -M my-bumblebeed
[sudo] password for wrthissell:
Nothing to do
[wrthissell@LAPTOP-BKIJEPGK ~]$ ausearch -c 'bumblebeed' --raw | audit2allow -M my-bumblebeed
Error opening config file (Permission denied)
NOTE - using built-in logs: /var/log/audit/audit.log
Error opening /var/log/audit/audit.log (Permission denied)
Nothing to do
[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo # ausearch -c 'bumblebeed' --raw | audit2allow -M my-bumblebeed
usage: sudo -h | -K | -k | -V
usage: sudo -v [-AknS] [-g group] [-h host] [-p prompt] [-u user]
usage: sudo -l [-AknS] [-g group] [-h host] [-p prompt] [-U user] [-u user] [command]
usage: sudo [-AbEHknPS] [-r role] [-t type] [-C num] [-g group] [-h host] [-p prompt] [-u user] [VAR=value] [-i|-s] []
usage: sudo -e [-AknS] [-r role] [-t type] [-C num] [-g group] [-h host] [-p prompt] [-u user] file ...
[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo semodule -i my-bumblebeed.pp
[wrthissell@LAPTOP-BKIJEPGK ~]$ ausearch -c 'certwatch' --raw | audit2allow -M my-certwatch
Error opening config file (Permission denied)
NOTE - using built-in logs: /var/log/audit/audit.log
Error opening /var/log/audit/audit.log (Permission denied)
Nothing to do
[wrthissell@LAPTOP-BKIJEPGK ~]$ sudo semodule -i my-certwatch.pp

I reported these to the Centos Bug Tracker.  I also reset SELinux from permissive to disabled to support the debugging process.

Now I shall execute dracut --regenerate-all --force, reboot, and try again with xpra installed.
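
In the session above, "Nothing to do" means audit2allow received no denial records, so no .pp module was written for semodule to load. A minimal sketch that builds and installs a module only when denials were actually logged; count_avc and build_module_if_denied are hypothetical helper names, and reading the audit log is assumed to need sudo.

```shell
#!/bin/sh
# Sketch: build an SELinux policy module only when AVC denials were logged.
# count_avc and build_module_if_denied are hypothetical helper names.

# Count AVC denial records in raw audit text read from stdin.
# grep -c exits non-zero on zero matches, so "|| true" keeps the status clean.
count_avc() {
    grep -c 'type=AVC' || true
}

# $1 = command name as recorded in the audit log, e.g. bumblebeed
build_module_if_denied() {
    comm="$1"
    raw="$(sudo ausearch -m avc -c "$comm" --raw 2>/dev/null)"
    if [ "$(printf '%s\n' "$raw" | count_avc)" -eq 0 ]; then
        echo "no AVC denials logged for $comm; nothing to build"
        return 1
    fi
    # audit2allow -M writes my-<comm>.te and my-<comm>.pp in the current directory.
    printf '%s\n' "$raw" | audit2allow -M "my-$comm" &&
        sudo semodule -i "my-$comm.pp"
}
```

Calling `build_module_if_denied bumblebeed` avoids running semodule against a file that was never generated.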

wrthissell commented 5 years ago

Dear gsgatlin, I have made some new rpms per the attached tar.gz file. I have posted an NVidia bug report at: https://devtalk.nvidia.com/default/topic/1039122/linux/lenovo-thinkpad-p50-bumblebee-nvidia-driver-load-issue/?offset=7#5279897 I am unable to upload the relevant tar.gz files to that webpage, hence I am uploading them here and posting a link to this page for NVidia Support.
20180905 bbswitch-0.9_bumblebee-3.3.0-4_bumblebee-nvidia-390.87_build_results.tar.gz

 I am unable to upload the 20180905 Bumblebee relevant rpms.tar.gz file because it is too large. It includes new rpms for the bumblebee-nvidia 390.87 build. How may I send these rpms to you?

20180905 Bumblebee some relevant rpms.tar.gz

wrthissell commented 5 years ago

Dear gsgatlin, I installed the kernel-ml 4.18.6-1 from elrepo, then erased and installed bumblebee, bumblebee-nvidia and bbswitch-dkms, then executed sudo dracut --regenerate-all --force, and then rebooted. This has resolved the issue. I am now able to run optirun glxgears and use the nvidia driver.

Attached is the tar.gz file with the documentation of this result and the bumblebee-nvidia.spec file for building the rpms for the 390.87 version of your driver, since the file size limitations prevent one from attaching the src.rpm and rpm files for these builds. I recommend that you review and revise this spec file and the other src.rpm's spec files to conform with your standards for posting in your repository for others to use.

We now have a fix for getting bumblebee and the nvidia drivers to work with Centos 7.5 on a Lenovo Thinkpad P50. Thank you very much for your assistance.

20180909 bbswitch-0.9_bumblebee-3.3.0-4_bumblebee-nvidia-390.87_kernel-4.18.6.tar.gz
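
For anyone repeating this, the sequence described above can be sketched as a script. This is only an outline under assumptions: the elrepo-kernel repository is taken to be configured already, the package names are the ones mentioned in this comment, and the run helper and DRY_RUN variable are hypothetical additions so the steps can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch of the sequence described above. With DRY_RUN=1 (the default here)
# each step is only printed; set DRY_RUN=0 to execute it. The run helper and
# DRY_RUN are hypothetical, and the elrepo-kernel repo is assumed to exist.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it as given.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

switch_to_kernel_ml() {
    run sudo yum -y --enablerepo=elrepo-kernel install kernel-ml
    run sudo yum -y erase bumblebee bumblebee-nvidia bbswitch-dkms
    run sudo yum -y install bumblebee bumblebee-nvidia bbswitch-dkms
    run sudo dracut --regenerate-all --force
    echo "reboot into the kernel-ml entry, then try: optirun glxgears"
}
```

Running `switch_to_kernel_ml` with the default settings only lists the four commands; nothing is installed or removed until DRY_RUN=0 is set.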

gsgatlin commented 5 years ago

Glad you got it working.