outpaddling / desktop-installer

Quickly configure a FreeBSD or NetBSD desktop system
BSD 2-Clause "Simplified" License

X now coming up but hanging on login #5

Closed pkgset closed 3 years ago

pkgset commented 3 years ago

The initial message from X is "no screen"; more details below. OK, so I may have an unusual setup, but it should not be too odd. It is an X10DRH-C/i Supermicro workstation with an on-board VGA that is disabled in the BIOS, booting EFI GPT. Fresh install of 12.2-RELEASE-p4 on ZFS, nvidia (GTX 1050..) 400-100 driver, no src, no ports. I ran freebsd-update fetch/install, then pkg install desktop-installer, and executed it. (I've tried it several times in case I'm doing something wrong, but nothing has produced a different result.) Xorg.0.log: open /dev/dri/card0: no such file or directory

Indeed there's no /dev/dri dir.

To use the nvidia card: /usr/local/etc/X11/xorg.conf.d/nvidia.conf:

Section "Device"
    Identifier "NVIDIA Card"
    VendorName "NVIDIA Corporation"
    Driver     "nvidia"
    BusID      "130:0:0"
EndSection
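For readers who need to work out the BusID for their own card, here is a minimal sketch. It assumes the selector format printed by `pciconf -l` on FreeBSD (`name@pciDOMAIN:BUS:SLOT:FUNCTION:`) and PCI domain 0. The function name `selector_to_busid` is made up for illustration and is not part of desktop-installer. It prints the documented `PCI:bus:device:function` form rather than the bare form used in the fragment above.

```shell
#!/bin/sh
# Hypothetical helper: turn a pciconf(8) selector such as
# "vgapci0@pci0:130:0:0:" into an Xorg BusID string.
# Assumes PCI domain 0; the domain field is parsed but not printed.
selector_to_busid() {
    # Drop everything up to and including "@pci", then the trailing colon.
    sel=${1#*@pci}
    sel=${sel%:}
    # Split the remaining "domain:bus:slot:function" on colons.
    old_ifs=$IFS
    IFS=:
    set -- $sel
    IFS=$old_ifs
    printf 'PCI:%s:%s:%s\n' "$2" "$3" "$4"
}

# Example (run by hand on the affected machine):
#   pciconf -l | grep '^vgapci'
#   selector_to_busid 'vgapci0@pci0:130:0:0:'
```

On a machine with more than one display adapter, each `vgapci` line would give a different selector, and the one belonging to the nvidia card is the one to put in the Device section.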

I had everything working great in plasma under 12.1-RELEASE. After the update to 12.2 I noticed it would not log out from the plasma menu; I had to issue a reboot/halt in a terminal to stop it. So I looked around, found your desktop-installer script, and went for it. That was two days ago... :) Today I decided to take you up on your link to let you know.

outpaddling commented 3 years ago

Couple of questions:

  1. Are you using quarterly packages or latest?
  2. Did you run the GPU setup from desktop installer and separately verify that you have the correct nVidia driver installed for your chipset? The driver should log a message in /var/log/messages such as the following:
Feb 28 09:29:09 tarpon kernel: nvidia0: <GeForce 9400> on vgapci0
Feb 28 09:29:09 tarpon kernel: vgapci0: child nvidia0 requested pci_enable_io

It should also tell you if you're using the wrong version of the driver. You can also go straight to the GPU setup outside desktop-installer by running auto-gpu-setup. I'd move the xorg.conf fragment out of the way for basic testing and only restore it if necessary.
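The log check described above can be sketched as a small shell function. This is only an illustration: the name `nvidia_attach_lines` is invented here, and the pattern is based solely on the two sample log lines quoted in this comment, so other driver versions may word the message differently.

```shell
#!/bin/sh
# Hypothetical helper: print kernel log lines in which an nvidia device
# attaches, e.g. "kernel: nvidia0: <GeForce 9400> on vgapci0".
# $1 is the log file to search (normally /var/log/messages).
nvidia_attach_lines() {
    grep -E 'kernel: nvidia[0-9]+: <' "$1"
}

# Example (run by hand):
#   nvidia_attach_lines /var/log/messages
```

If this prints nothing, the kernel module probably did not attach to the card, and rerunning auto-gpu-setup with the xorg.conf.d fragment moved out of the way is the next thing to try.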

pkgset commented 3 years ago

  1. Quarterly.
  2. Yes, it was all working before using the same driver.
  3. I don't use xorg.conf, simply a standalone nvidia.conf which points to the BusID of the video card (I have one on the motherboard).

I got further today, and got a blank screen with the (rather large) mouse pointer on it. Judging by the zero outstanding issues you have, I highly suspect myself to be the real culprit. After many attempts, the exact sequence of my earlier tests is suspect, but there is one thing I likely did differently before: when I got the blank screen during the install, I did not simply use Ctrl-Alt-F1 (as I did today), which returned me to the installer. It continued along a logical path of prompts until done. I did not accept the early reboot offer after installing pkgs but let it continue. At the end I did reboot, however (probably from within the script). When it came back up it switched to X with a blank screen and the mouse pointer. /var/log/messages reveals that sddm-greeter exited on signal 6; Xorg.0.log reveals an error on open /dev/dri/card0: no such file or directory. So that's the same error I got above.

pkgset commented 3 years ago

OK, I moved two files that were created by the script, which are for swcursor and tap, from /etc/X11 to the /usr/local/... xorg.conf.d dir and numbered them to load after nvidia, and it came up. I had not added the BusID yet, so that was part of the change. I'm going to test by moving the two other files back to /etc and see which change made it....

OK, so the load order made the difference. Testing if BusID ultimately made a difference...

Yes, my built-in video ("disabled" in BIOS) is of course found so BusID is mandatory.
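To see the order in which fragments in one xorg.conf.d directory sort, something like the following can help. It is a sketch under two assumptions: that Xorg reads the `.conf` files in a directory in plain byte (ASCII) order, and that numeric prefixes are used to control that order. The function name and the file names in the example are hypothetical; the thread does not give the real names of the swcursor and tap files.

```shell
#!/bin/sh
# Hypothetical helper: list the *.conf fragments of one xorg.conf.d
# directory in byte-sorted order, one per line.
# $1 is the directory, e.g. /usr/local/etc/X11/xorg.conf.d
list_xorg_fragments() {
    # Subshell so the caller's working directory is untouched;
    # LC_ALL=C makes ls sort by byte value rather than locale rules.
    ( cd "$1" 2>/dev/null && LC_ALL=C ls -1 -- *.conf 2>/dev/null )
}

# Example (run by hand; file names are made up):
#   list_xorg_fragments /usr/local/etc/X11/xorg.conf.d
#   10-nvidia.conf
#   20-swcursor.conf
#   30-tap.conf
```

This only covers ordering within a single directory. How fragments in /etc/X11 and /usr/local/etc/X11/xorg.conf.d interleave is not something the thread establishes.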

One thing I would recommend is to highlight (in color?) the note about using Ctrl-Alt-F1 to return to the script, as I really doubt I was still reading everything at that point. I can easily see myself missing that note and just rebooting out of X. It is obviously a very thorough script, but it has a lot of notes. That is of course no excuse for not reading them all, but in the name of getting the highest success rate it might be a good idea. Once I'm caught up with my work backlog I'll create a version and submit it here. If I can figure out how, I'll do a test for dual cards, or at least write up notes on my system that you could use to beef it up.

Otherwise thanks for what must have been quite the chore to put together, and being something so needed it's especially valuable! :)

pkgset commented 3 years ago

I sang too soon... I got the login screen but it does not go past it. /var/log/messages reveals a warning that $hald is not set properly, after that there are 18 invalid path statements where QDBusConnection says:

invalid path /org/freedesktop/UDisk2/block_devices/pool_snap_ROOT_12.0-CURRENT-201610220228
invalid path /org/freedesktop/UDisk2/drives/pool_snap_ROOT_12.0-CURRENT-201610220228
Could not emit signal org.freedesktop.DBus.ObjectManager.InterfacesAdded: Marshalling failed: Invalid object path passes in argument

These are repeated 6 times, ending with:

devd: notify clients: send() failed; dropping unresponsive client

Xorg.0.log reveals no errors and only a warning:

VGA arbiter: cannot open kernel arbiter, no multi-card support
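A quick way to confirm "no errors, one warning" in an Xorg log is to count the (EE) and (WW) markers. The sketch below assumes the usual log layout, where each message line starts with an optional bracketed timestamp followed by the marker. Anchoring at the start of the line keeps the marker legend at the top of the log from being counted. The function name is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count error (EE) and warning (WW) lines in an
# Xorg log. $1 is the log file, e.g. /var/log/Xorg.0.log.
xorg_log_summary() {
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    errors=$(grep -Ec '^(\[[^]]*\] )?\(EE\) ' "$1" || true)
    warnings=$(grep -Ec '^(\[[^]]*\] )?\(WW\) ' "$1" || true)
    printf 'errors=%s warnings=%s\n' "$errors" "$warnings"
}

# Example (run by hand):
#   xorg_log_summary /var/log/Xorg.0.log
```

A count of zero errors only says that the X server itself started cleanly; a greeter or desktop session can still fail afterwards, which is what /var/log/messages showed here.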

Meanwhile it sees all three monitors and does present the login screen, which of course is how far it gets.

Looks like UDisk2 is a system bus and could not reasonably fail. I'll do another cold start... no change.

Snap is actually my ZFS location, so I've removed the pool (my home is on a separate pair of mirrored drives). But it did not help. I also added hald, which should not be needed, to rc.conf. That made no difference except one less notice.

The snap error, I believe, is actually ZFS related and should not be relevant. The only dataset with active canmount is /usr/home on the home pool.

outpaddling commented 3 years ago

The hald message seems suspicious to me, assuming you're still using plasma. I was involved in a thread with the KDE team last year about deprecating HAL.

Have you tried another desktop env like Lumina or XFCE? If that works, you can rule out driver problems.

pkgset commented 3 years ago

Yes I have. And as the final (edited) notes above show, it made little difference. A driver problem does not make sense since all three monitors come up, at least as far as nvidia goes. (When I say login I'm referring to the plasma login screen.)

I will try another DE to see if anything changes. (BTW using default DEs (MATE) works under nomadbsd and ghostbsd.)

After each login attempt devd says it is dropping an unresponsive client.

EDIT: I'm suspecting the client it talks about is a monitor. I have three:

  1. 1080p
  2. 4K
  3. 1080p

The 4K dies "easily" when I change to vt.

Now I installed MATE but it would not come up probably because I answered No to the xinit question. Testing Cinnamon w GDM now, we'll see if that's any better. Not really. But at least it throws up a window saying Oh no! Something has gone wrong. messages says:

console-kit-daemon WARNING: kvm_getenvv failed
console-kit-daemon WARNING: Error waiting for native console 1 activation: Inappropriate ioctl for device

and then the same for console 9.

Ah, I'm using the Linux lib from CentOS 7; I've not tried 6. I'm going to do a fresh FBSD install to ensure it's clean and then choose some other DE, which seems more likely to reveal something different.

pkgset commented 3 years ago

Mate came up but with default resolution 1024x768 on my 4K monitor :) I'm wondering if these have Wayland? I just saw a notice about not mixing Wayland w nVidia, but no details. I'm going to install a Linux distro and see what happens. Things are not very consistent as far as video goes. I've swapped video ports and cables, don't think I have a card to swap with so I'll just install Garuda Linux and we'll see what comes out... If that works I'll be chasing Wayland to see what the deal is. (I think I saw it mentioned during install.)

pkgset commented 3 years ago

OK, one of the interesting points with Garuda is that it lets you choose to install with nVidia's proprietary drivers or the OpenSource one simply by selecting one of two lines. Makes for a simple way to test. And it's fast! Their current version has turned Plasma into a work of art that feels very Mac-like.

Anyway, back to getting desktop-installer to work. At this point it's quite clear it's been working all along. My video card now doesn't like the proprietary driver that is all it's ever known. Its failure mode is new to me; I'd expected it to simply die, not this kind of on-and-off behavior. At this point it does not matter which port or monitor I use, it simply will not load their own drivers. Having now tested the drivers from different sources, it's clear the card has buggered out the accelerated part. Sorry for having wasted people's time, maybe it ends up useful for someone else... :)

Note, because this kind of thing will be asked even though it's already noted above: this card, a Gigabyte GTX 1050Ti, has always and only worked with nVidia's drivers. Not that it would not work with the OpenSource one, this card simply has never had the honor. And we are talking about the current 400.100 version. I did try the previous version just to have done it, though I already knew better.

pkgset commented 3 years ago

Solved, card is failing. Which is a bad time for it, with the 4x pricing on cards right now.

outpaddling commented 3 years ago

Never hurts to have a discussion and leave some clues for posterity. I suspect you have, but not to assume, have you tried older versions of nvidia-driver? If you're sure the drivers are the issue, you might want to open a FreeBSD PR regarding the nVidia driver. In my experience, they do tend to get solved.

pkgset commented 3 years ago

:) Yes, I described it above... :) The card has gone bad. :( The accelerated part is gone.