Questions about the setup

1m-N00b commented 3 years ago

I have tried the install script in VirtualBox, but it does not work. After I decrypt sda (the password and username are both "v" for easy testing), the login prompt shows your name, like "v(pw)login:maximbaz(automatic login)", and I can't log in.

A quick search suggested the cause might be that the user was not reflected in /etc/shadow, but I'm not sure.

Also, I was reading install.sh, but I can't find anything about boot, such as grub-install or bootctl. What bootloader do you use? Also, if you are using encrypted disks, I think you need to set the uuid, where do you do that? I'm sorry for asking questions that are not directly related to the error.

maximbaz commented 3 years ago

I admit install.sh is the part most likely to break: although I'm trying to keep it up-to-date, I really only use it once every few years when I install on a new laptop 😅

Happy to fix some issues though and clarify the uncertainties! Please always feel free to ask more questions 😉

Let's start from easy ones:

I was reading install.sh, but I can't find anything about boot, such as grub-install or bootctl. What bootloader do you use?

None! That's the beauty of modern UEFI, I'm using direct UEFI boot + secure boot.

Specifically, I keep that code in arch-secure-boot and reference it in install.sh here:

https://github.com/maximbaz/dotfiles/blob/11884025e296f02875f310213816a6a6713acc44/install.sh#L212

I like that a minimal amount of code is used to boot my system; less code (probably thousands of lines less!) means fewer bugs and security issues 😁

Have a look for inspiration, there are also more feature-rich projects such as https://github.com/gdamjan/secure-boot or especially https://github.com/andreyv/sbupdate, and also do check out this recent project: https://github.com/Foxboron/sbctl

Also, if you are using encrypted disks, I think you need to set the uuid, where do you do that?

Let me know if I misunderstand the question, here's what I think you are looking for:

cmdline references the encrypted disk by PARTLABEL and the disk with LUKS header by label:

https://github.com/maximbaz/dotfiles/blob/11884025e296f02875f310213816a6a6713acc44/install.sh#L196

PARTLABEL is set here:

https://github.com/maximbaz/dotfiles/blob/11884025e296f02875f310213816a6a6713acc44/install.sh#L115-L116

and luks LABEL is set here:

https://github.com/maximbaz/dotfiles/blob/11884025e296f02875f310213816a6a6713acc44/install.sh#L130
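To make the two kinds of label concrete, here is a rough sketch of how a kernel command line of this shape could be assembled. The helper name build_cmdline and its parameters are made up for illustration, and the label values are only examples; install.sh may well build the string differently.

```shell
#!/bin/sh
# Hypothetical helper: assemble a kernel command line that refers to the
# encrypted partition by PARTLABEL and to the LUKS header by LABEL.
# $1 = partition label, $2 = LUKS label, $3 = header offset, $4 = header size in bytes
build_cmdline() {
    printf 'cryptdevice=PARTLABEL=%s:luks:allow-discards cryptheader=LABEL=%s:%s:%s root=LABEL=btrfs rw rootflags=subvol=root\n' \
        "$1" "$2" "$3" "$4"
}

# Example values only.
cmdline=$(build_cmdline primary luks 0 16777216)
echo "$cmdline"
```

The point is only that no UUID appears anywhere: the initramfs finds both the partition and the header through labels that were assigned at partitioning and luksFormat time.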

When I decrypt sda, it says your name as v(pw)login:maximbaz(automatic login) like that and I can't login.

Just to clarify, disk decryption works fine, and the first time you are getting an issue is when the system boots and tries to autologin, correct?

I think the issue is that although the install.sh does not really hardcode my name specifically (it asks for username, etc), in the end it does execute my dotfiles:

https://github.com/maximbaz/dotfiles/blob/11884025e296f02875f310213816a6a6713acc44/install.sh#L236-L241

Which in turn unfortunately have my name hardcoded in a few places here and there. Specifically, the autologin is handled by this service override, which as you can see has my username...

https://github.com/maximbaz/dotfiles/blob/11884025e296f02875f310213816a6a6713acc44/etc/systemd/system/getty%40tty1.service.d/override.conf#L3

I don't know, maybe the easiest way for you to try this out in virtualbox would either be to use maximbaz as username, or clone my dotfiles and replace all occurrences of maximbaz with your username, and adjust install.sh to clone your fork instead.
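A minimal sketch of the "replace all occurrences" idea, run against a scratch directory rather than a real clone. The override.conf content below is an assumption modelled on a typical getty autologin override, the new username "v" is just an example, and sed -i is used in its GNU form.

```shell
#!/bin/sh
# Sketch: replace a hardcoded username in a copy of the dotfiles.
# The directory layout and file content are stand-ins for the real repo.
old_user="maximbaz"
new_user="v"   # hypothetical username

workdir=$(mktemp -d)
mkdir -p "$workdir/etc/systemd/system/getty@tty1.service.d"
override="$workdir/etc/systemd/system/getty@tty1.service.d/override.conf"
printf '[Service]\nExecStart=\nExecStart=-/sbin/agetty --autologin %s --noclear %%I $TERM\n' \
    "$old_user" > "$override"

# List every file that mentions the old name, then rewrite it in place.
grep -rl -- "$old_user" "$workdir" | while IFS= read -r file; do
    sed -i "s/$old_user/$new_user/g" "$file"
done

remaining=$(grep -rl -- "$old_user" "$workdir" | wc -l)
autologin_line=$(grep -- '--autologin' "$override")
echo "files still mentioning $old_user: $remaining"
echo "$autologin_line"
rm -rf "$workdir"
```

A blind substitution like this also rewrites URLs that contain the name, so in a real fork it is worth reviewing the grep output before running sed.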

1m-N00b commented 3 years ago

Thank you for answering my questions one by one. I tried it with maximbaz, but after autologin I got a black screen and could not proceed.

1m-N00b commented 3 years ago

When I pressed winkey+enter on the black screen, the terminal opened and I was able to type waybar to start waybar. P.S. I read the waybar config and noticed that it was assigned to $hyper.

However, my knowledge is limited and I don't understand much about it. Is autologin an alternative to a display manager like lightdm?

I use a shell script called launch.sh in i3 to automatically launch polybar (exec_always ...), but don't you start it automatically in your environment? Also, using pacman -Syu does not work because of the custom repo (idk)? What should I do about this? When installing with laptop, should I do setup-systems.sh and setup-user.sh afterwards and skip the custom repo settings? However, I am also concerned that I may be missing some packages that I need.

Please forgive me for asking so many questions again.

maximbaz commented 3 years ago

When I pressed winkey+enter on the black screen, the terminal opened and I was able to type waybar to start waybar.

Awesome, you are further in the journey! 🙂

Is autologin an alternative to a display manager like lightdm?

Much simpler, autologin is nothing more than an automatic login to a TTY 🙂 Try pressing Ctrl+Alt+F5, for example: you will go to tty5 and see a prompt for username and password. Typing the username and password and pressing Enter is all that autologin does!

Once you are logged in on a tty, a default shell is started - zsh in our case:

https://github.com/maximbaz/dotfiles/blob/c4ca621c59b8bb1a45bc4d7f516e75f89bec50ed/install.sh#L224

When zsh starts, it in turn runs .zprofile script automatically, which has the following block:

https://github.com/maximbaz/dotfiles/blob/c4ca621c59b8bb1a45bc4d7f516e75f89bec50ed/.zprofile#L11-L12

This is basically saying "if you are on tty1, start the sway process and log its output to the systemd journal". The second line can be replaced with simply sway for simplicity.

So that's why you have sway started, and that's why if you switch to tty5 sway will not be started.
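As a rough illustration of that kind of check (not a copy of the linked .zprofile lines), the decision can be reduced to a comparison on the terminal name. The helper should_start_sway is hypothetical, and the snippet only prints what it would do, so it is safe to run anywhere.

```shell
#!/bin/sh
# Sketch of the kind of check a login profile can make.
# should_start_sway is a hypothetical helper; $1 is the current terminal name.
should_start_sway() {
    [ "$1" = "/dev/tty1" ]
}

# In a real profile this would be roughly:
#   if should_start_sway "$(tty)"; then exec sway; fi
# (piping through systemd-cat is one way to send the output to the journal;
# whether the real profile does exactly that is an assumption here).
for term in /dev/tty1 /dev/tty5; do
    if should_start_sway "$term"; then
        echo "$term: would start sway"
    else
        echo "$term: plain shell, no sway"
    fi
done
```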

I use a shell script called launch.sh in i3 to automatically launch polybar (exec_always ...), but don't you start it automatically in your environment?

I do, but it is managed by systemd. Let me show you 🙂

When sway finishes loading configs, the last step it does is it triggers systemd activation:

https://github.com/maximbaz/dotfiles/blob/c4ca621c59b8bb1a45bc4d7f516e75f89bec50ed/.config/sway/config#L300-L302

sway-session.target is a special file that tells systemd that a graphical session has been started. This can signal all apps that only work in graphical session (anything that has a UI basically, like waybar) that they need to start up as well - here's how waybar marks itself as an app that should be auto started together with graphical session: link

You will see that I do this in other graphical apps, e.g. here:

https://github.com/maximbaz/dotfiles/blob/c4ca621c59b8bb1a45bc4d7f516e75f89bec50ed/.config/systemd/user/safeeyes.service#L3-L4

Of course you must enable the service for it to be autostarted by systemd, like so:

https://github.com/maximbaz/dotfiles/blob/c4ca621c59b8bb1a45bc4d7f516e75f89bec50ed/setup-user.sh#L140

So that's the flow: autologin starts zsh, zsh runs .zprofile, .zprofile starts sway, sway signals to systemd that a graphical session has been started, and systemd autostarts all the UI apps. Hope that helps you understand the flow and gives you some pointers to debug why waybar is not autostarting for you. A good idea would be to check systemctl --user status waybar, for example, for logs.
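For readers who have not written a user unit before, here is a sketch of what a service tied to the graphical session might look like. Everything in it is illustrative: "example-bar" is a made-up service and binary, and the exact directives in the linked unit files may differ. The snippet writes the file into a scratch directory instead of ~/.config so that it can be run without side effects.

```shell
#!/bin/sh
# Sketch: write an example user unit into a scratch directory (not ~/.config).
# "example-bar" and /usr/bin/example-bar are made-up names.
unit_dir=$(mktemp -d)
cat > "$unit_dir/example-bar.service" <<'EOF'
[Unit]
Description=Example status bar
PartOf=graphical-session.target
After=graphical-session.target

[Service]
ExecStart=/usr/bin/example-bar

[Install]
WantedBy=sway-session.target
EOF

unit_text=$(cat "$unit_dir/example-bar.service")
echo "$unit_text"
rm -rf "$unit_dir"
# On a real system the unit would live in ~/.config/systemd/user/ and be
# enabled with: systemctl --user enable example-bar.service
```

The [Install] section is what "enable" acts on: enabling creates the link that makes the target pull the service in when the session starts.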

Also, using pacman -Syu does not work because of the custom repo (idk)? What should I do about this?

Just run repo-add -s /mnt/var/cache/pacman/maximbaz-local/maximbaz-local.db.tar to create a signature. Hmmm we run it without -s in install.sh because at that time gpg is not configured yet, but it's a nice finding, maybe I should do this as part of setup-* scripts, thanks.

When installing with laptop, should I do setup-systems.sh and setup-user.sh afterwards and skip the custom repo settings?

No no, I think what you did is fine, hopefully after you add a signature it will start working.

1m-N00b commented 3 years ago

Thanks to the detailed explanation, I found the cause. After installation, waybar was inactive, probably because I did not run setup-system.sh with sudo.

because at that time gpg is not configured yet

Does this mean that I have to sign myself into /mnt/var/cache/pacman/maximbaz-local/maximbaz-local.db.sig with gpg?

maximbaz commented 3 years ago

Does this mean that I have to sign myself into /mnt/var/cache/pacman/maximbaz-local/maximbaz-local.db.sig with gpg?

To clarify, this repo is only needed if you are planning to use aurutils. If not, just open /etc/pacman.conf and remove [maximbaz-local] block altogether 🙂

If you do want to keep it, you can change in /etc/pacman.conf SigLevel = Required to e.g. Optional TrustAll and I guess that will also fix the error, without having to sign anything.

Or finally, if you want to explore signature, you must configure gpg and then execute repo-add -s /var/cache/pacman/maximbaz-local/maximbaz-local.db.tar - this should simply sign the file with gpg.
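The SigLevel option can be sketched as a one-line sed edit. This runs against a temporary copy with made-up content (the Server line in particular is an assumption), so nothing under /etc is touched; sed -i is used in its GNU form.

```shell
#!/bin/sh
# Sketch: relax SigLevel for one repo section in a COPY of pacman.conf.
# The sample file content below is an assumption, not the real config.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[core]
Include = /etc/pacman.d/mirrorlist

[maximbaz-local]
SigLevel = Required
Server = file:///var/cache/pacman/maximbaz-local
EOF

# Only touch SigLevel lines between the [maximbaz-local] header and the
# next section header (or the end of the file).
sed -i '/^\[maximbaz-local\]/,/^\[/ s/^SigLevel = .*/SigLevel = Optional TrustAll/' "$conf"

sig_line=$(grep '^SigLevel' "$conf")
echo "$sig_line"
rm -f "$conf"
```

Restricting the substitution to the one section keeps signature checking unchanged for the official repositories.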

1m-N00b commented 3 years ago

I see. I'm using something simple like yay or paru, not something for experts like aurutils, so I'm sorry for asking such a question with little understanding of the background.

Thank you for answering so many questions so carefully. My questions have been answered, so you may close this issue.

maximbaz commented 3 years ago

Cool, if you find some more questions, do not hesitate to write again, in this thread or in a new one 👍

1m-N00b commented 3 years ago

Hello. It worked fine in virtualbox, but when I tried it on my laptop I'm getting an error in the detached luks header like this

I thought it was something to do with the luks header as shown in the output. I checked the install.sh and Archwiki(JP&EN) and arch-secure-boot's initial setup, but I could not find the cause.

I also tried to explore install.sh and the dotfiles more. I thought it might be this part, but I also checked luksHeaderBackup etc. in the cryptsetup man page, and since it only produces a backup of the header, I thought there was no problem. Also, I found an unfamiliar hook (encrypt-dh) in install.sh, which seems to be related to the detached luks header, but maybe it's your original tool? I could find hardly any information about it. Since the error is displayed by the "else" in the relevant part, I think it's probably caused by encrypt-dh.

I'm sorry I'm too much of a noob to solve this myself.

maximbaz commented 3 years ago

I found an unfamiliar hook (encrypt-dh) in install.sh, which seems to be related to detached luks header, but maybe it's your original tool?

It's my AUR package created after this bug.

Installation script most probably installed it, because it installs maximbaz metapackage and it in turn pulls this as dependency:

https://github.com/maximbaz/dotfiles/blob/1f7c9cc86ab98248a1c2c41f7510d9a49b9db7ae/packages/PKGBUILD#L44

During the installation, did you select the same disk to store LUKS header, or did you actually select a different disk, i.e. a detached header approach? Was it the same selection you did in virtualbox?

This error you see seems to indicate that it cannot find a disk where LUKS header is written 🤔

1m-N00b commented 3 years ago

Yes, it's all the same. I also set the name to maximbaz, planning to change it after installation.

Also, when I check with cfdisk from archiso, nvme0n1p1 and nvme0n1p2 have been created on nvme0n1, which I selected as the installation destination.

maximbaz commented 3 years ago

Hmm 🤔 Could you try to show me the output of this: $ lsblk -o name,mountpoint,label,size,uuid

In particular I'm wondering where the btrfs label is.

Also show the contents of etc/kernel/cmdline generated here, I wonder if the size is generated correctly?

1m-N00b commented 3 years ago

I couldn't look back at the output on the terminal on laptop, so I installed it again from my desktop via ssh and checked it.

$ lsblk -o name,mountpoint,label,size,uuid output:

NAME        MOUNTPOINT                LABEL         SIZE UUID
loop0       /run/archiso/sfs/airootfs             602.6M
sda                                                14.8G
└─sda1      /run/archiso/bootmnt      ARCH_202103  14.8G E2C0-7A22
nvme0n1                                           476.9G
├─nvme0n1p1                           luks        476.4G ca2df0bd-3a0b-4e98-8586-114739fd51d9
│ └─luks                              btrfs       476.4G 584c4449-9126-4ed3-bf2d-26fb00179f3c
└─nvme0n1p2                           EFI         550.3M 0E9F-1B68

$ vi etc/kernel/cmdline output:

cryptdevice=PARTLABEL=primary:luks:allow-discards cryptheader=LABEL=luks:0:16777216 root=LABEL=btrfs rw rootflags=subvol=root quiet mem_sleep_default=deep

maximbaz commented 3 years ago

At first glance, everything looks quite good 🤔

I suspect, a simple workaround is to remove the entire cryptheader=LABEL=luks:0:16777216 block from your cmdline and then it should boot just fine (if you do this not by modifying install.sh, but after the installation, make sure to regenerate efi by calling arch-secure-boot generate-efi).

However, the real problem seems to be somewhere in the patch, so if you want we can try to debug this further.

Initially, cryptheader argument gets parsed like so in three components split by : sign:

https://github.com/maximbaz/pkgbuilds/blob/549af20a7a17bc5b9f75826e62e802d6a2466e92/mkinitcpio-encrypt-detached-header/support-detached-header.patch#L12

Then the first component gets resolved to a device:

https://github.com/maximbaz/pkgbuilds/blob/549af20a7a17bc5b9f75826e62e802d6a2466e92/mkinitcpio-encrypt-detached-header/support-detached-header.patch#L18

In your case, LABEL=luks should resolve to /dev/nvme0n1p1.

Then, because the second component is a number, execution goes here and using dd we basically copy the leading bytes of the partition into a file (so from position 0 to position 16777216 in your case, i.e. the first 16 MiB):

https://github.com/maximbaz/pkgbuilds/blob/549af20a7a17bc5b9f75826e62e802d6a2466e92/mkinitcpio-encrypt-detached-header/support-detached-header.patch#L31

And then if that succeeded, we would continue, but your execution goes here instead:

https://github.com/maximbaz/pkgbuilds/blob/549af20a7a17bc5b9f75826e62e802d6a2466e92/mkinitcpio-encrypt-detached-header/support-detached-header.patch#L39
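The parsing step can be reproduced in isolation. This is a sketch, not a copy of the patch: the variable names chdev, charg1 and charg2 are borrowed from the hook, while the "mode" variable and the numeric test are illustrative.

```shell
#!/bin/sh
# Sketch: split a cryptheader value into three fields on ":".
cryptheader="LABEL=luks:0:16777216"

IFS=: read -r chdev charg1 charg2 <<EOF
$cryptheader
EOF

# A purely numeric second field means "offset into a raw block device";
# anything else would be treated as a path to a header file.
case "$charg1" in
    ''|*[!0-9]*) mode=file ;;
    *)           mode=raw ;;
esac

echo "dev=$chdev arg1=$charg1 arg2=$charg2 mode=$mode"
```

With the value from this thread, the device part is LABEL=luks, the offset is 0 and the length is 16777216, which is why the dd branch is taken.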

What I would suggest to test is the following: does this command (also used in install.sh):

cryptsetup luksHeaderBackup "/dev/nvme0n1p1" --header-backup-file /tmp/header1

produce exactly the same file as this command:

dd if="/dev/nvme0n1p1" of="/tmp/header2" bs=1 skip="0" count="16777216"

I expect the answer to be "yes", but if you are getting two different files, then that is the cause of the issue, and we can then debug why it's not the case...
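One way to compare the two dumps byte for byte is cmp, sketched here on small scratch files that stand in for /tmp/header1 and /tmp/header2. The helper name same_file is made up for the example.

```shell
#!/bin/sh
# Sketch: check whether two files are byte-for-byte identical.
dir=$(mktemp -d)
printf 'example header bytes' > "$dir/header1"
printf 'example header bytes' > "$dir/header2"
printf 'different bytes'      > "$dir/header3"

# cmp -s prints nothing and reports the result through its exit status.
same_file() {
    cmp -s "$1" "$2"
}

if same_file "$dir/header1" "$dir/header2"; then first_pair=identical; else first_pair=different; fi
if same_file "$dir/header1" "$dir/header3"; then second_pair=identical; else second_pair=different; fi

echo "header1 vs header2: $first_pair"
echo "header1 vs header3: $second_pair"
rm -rf "$dir"
```

Comparing sizes alone is not enough, since two files of the same length can still differ in content; cmp or a checksum settles it.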

1m-N00b commented 3 years ago

I tried to run arch-secure-boot generate-efi again, but it did not work. I also generated header1 and header2 and checked their sizes, and they are the same.

I thought it might be the BIOS (UEFI) settings again, so I took a look. I found "secureboot" and "expert key management". In case you're wondering, the laptop I'm using is a dell xps13 (9310).

maximbaz commented 3 years ago

Just to be absolutely sure, could you please run sha256sum /tmp/header* on them?

I don't think it's secure boot, because that would have blocked the boot much earlier. I think it really is some kind of problem with this detached luks header in the special case when it's not actually detached 🙃

1m-N00b commented 3 years ago

It was still the same.

It's already 2:00am (midnight) in Japan, so I'm going to bed. I apologize for the inconvenience.

PS)I will try it again tomorrow (actually today) under a virtual environment and compare the logs.

maximbaz commented 3 years ago

One suggestion I have is to try to add some debugging info to the hook, so that we can see exactly how you get to the "detached header could not be open":

diff --git a/encrypt-dh b/encrypt-dh-debug
index 35e6040..27e5494 100644
--- a/encrypt-dh
+++ b/encrypt-dh-debug
@@ -40,6 +40,7 @@ EOF
 $cryptheader
 EOF

+        echo "Found header param: dev=$chdev arg1=$charg1 arg2=$charg2"
         if [ "$chdev" = "rootfs" ]; then
             cheaderfile=$charg1
         elif resolved=$(resolve_device "${chdev}" ${rootdelay}); then
@@ -53,13 +54,18 @@ EOF
                     umount /cheader
                     ;;
                 *)
+                    echo "Reading header using dd"
                     # Read raw data from the block device
                     # charg1 is numeric: charg1=offset, charg2=length
                     dd if="$resolved" of="$cheaderfile" bs=1 skip="$charg1" count="$charg2" >/dev/null 2>&1
+                    echo "dd exit code: $?"
                     ;;
             esac
         fi

+        ls -al "$cheaderfile"
+        sha256sum "$cheaderfile"
+        stat "$cheaderfile"
         if [ -f ${cheaderfile} ]; then
             cryptargs="${cryptargs} --header ${cheaderfile}"
         else

Remember to run mkinitcpio -p linux (maybe inside arch-chroot if you are doing this during installation) after modifying /usr/lib/initcpio/hooks/encrypt-dh for this to take effect.

Then you would hopefully see some extra info about the error...

However, after looking more closely at the code of encrypt-dh, what really concerns me is that this error should not be critical: if you are getting "detached header could not be open", it simply falls through to the code path of the regular encrypt hook and tries to decrypt LUKS as if the header is not detached, which is exactly your case! So it should have worked! But since it doesn't, I fear you are somehow getting a corrupted LUKS, maybe.... 🤔

This is really really strange, and so I just want to say that I appreciate your help debugging this issue.

1m-N00b commented 3 years ago

I changed /usr/lib/initcpio/hooks/encrypt-dh and tried mkinitcpio -p linux under arch-chroot, but nothing was output. After rebooting, the same thing happened. Here is the result of running the program after the change.

I also tried to install in virtualbox again and compared the output log, but it was the same as on the laptop and there was no particular difference (error).

PS1) I'm not very familiar with low-level details, so I'm not sure if this is the right way to do it, but I'll initialize the drive with nvme format /dev/nvme0 -s 1 -n 1 and try again.

PS2) I tried to reinstall using the above method, but it did not work.

maximbaz commented 3 years ago

Very strange! Just out of curiosity, if you have a spare USB stick or sd card that you don't mind formatting, can you plug it in during installation and select it as a storage for LUKS header? Just wondering if the problem is specifically with that code path when you don't detach header, or it's a bigger problem...

1m-N00b commented 3 years ago

I'd like to try it, but the laptop only comes with one USB type C to 3 conversion adapter, and I'm using that for archiso, so I can't currently do it. There is an SD port, but I have no SD card available, so I don't think I can try it right now. Sorry.

maximbaz commented 3 years ago

aah the joy of xps 13, mine at least has two usb-c ports 😅

no problem - I'll also try to find other ways to reproduce this, it's so weird that it doesn't happen in virtual machine, and I don't have a spare laptop to attempt installation...

1m-N00b commented 3 years ago

For now, I'm going to use your install.sh as a reference and just skip the detached luks header part.

Also, the destination of the luks header for /dev/nvme0n1p1(part_root) is /dev/nvme0n1p2(part_boot), right?

maximbaz commented 3 years ago

Good idea - no, it's p1. "Detached header" means the header is on a different drive/partition (so if you use an sd card, the computer will not boot without it), and regular luks means that the header is located on the exact same partition that you are encrypting, which is part_root in your case.

so just skip the --header flag altogether, as in the "else" code branch...

1m-N00b commented 3 years ago

I see. So it's not a detached luks header implementation, but a regular luks one.

And one last question. I've been looking at install.sh, and I don't see encrypt2 in the hook, does that mean that encrypt-dh is used instead? I often write encrypt2 in my hooks, so...

maximbaz commented 3 years ago

Correct, a regular luks one!

encrypt-dh is a patched version of encrypt hook, can't quickly find what encrypt2 hook is, but usually yes people use encrypt or sd-encrypt as per wiki

One last idea, before you move on to reinstalling the system once again: when you are dropped in the emergency shell like so, you can actually explore in that shell and try to debug the hook. For example, I imagine you can edit hooks/encrypt-dh with vi from there and maybe even just execute that script directly from the shell. I'm trying that out in parallel, seems like it could help figure that out.

It seems you cannot just call the resolve_device function from the shell, but you can run ls -al /dev/disk/by-label and ls -al /dev/disk/by-partlabel to see if the disks are present. You should expect that:

by-label contains luks -> nvme0n1p1, and by-partlabel contains primary -> nvme0n1p1.

If that's the case, then try to execute the dd command by hand, i.e. dd if="/dev/nvme0n1p1" of="/tmp/header" bs=1 skip="0" count="16777216", and then check the file /tmp/header. If even that succeeds and /tmp/header looks alright, then try to decrypt luks manually by specifying --header /tmp/header.
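The by-label check amounts to following a symlink. A small sketch of that lookup, using a scratch directory that imitates the /dev/disk/by-label layout instead of the real /dev; the helper name label_device is made up.

```shell
#!/bin/sh
# Sketch: resolve a label symlink to the device it points at.
# A scratch tree stands in for /dev so this can run without real disks.
root=$(mktemp -d)
mkdir -p "$root/dev/disk/by-label"
: > "$root/dev/nvme0n1p1"
ln -s ../../nvme0n1p1 "$root/dev/disk/by-label/luks"

# $1 = by-label directory, $2 = label to look up
label_device() {
    if [ -e "$1/$2" ]; then
        basename "$(readlink -f "$1/$2")"
    else
        echo "missing"
    fi
}

found=$(label_device "$root/dev/disk/by-label" luks)
absent=$(label_device "$root/dev/disk/by-label" btrfs)
echo "luks  -> $found"
echo "btrfs -> $absent"
rm -rf "$root"
```

In the emergency shell, a "missing" result for luks would mean the hook had nothing to resolve, which matches the "detached header could not be open" path.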

1m-N00b commented 3 years ago

I'm sorry, I got it mixed up with lvm2.

And here's the good news. I checked it as you said and it was not recognized. I was wondering why, so I unplugged the usb(archiso) and turned on secureboot in bios, and it worked.

Thank you very much for all your advice.

maximbaz commented 3 years ago

Wow, so the issue was caused by the plugged-in USB stick, and now the issue is all solved? Happy to hear this!

1m-N00b commented 3 years ago

I think so, probably. But I'm sorry I can't tell you the exact cause from debugging. Yes, my issue and questions are solved. Thank you once again. I'll close it.

1m-N00b commented 2 years ago

hi maximbaz :) I was so fascinated by your dotfiles that I rewrote them for myself and use them to this day. I've been following the updates to your dotfiles, and if necessary, I've been understanding the changes and applying them to my own dotfiles, but I have a question that I couldn't resolve on my own because of the change.

Questions: 1. Why does it open in a cgroup?

Here are some conclusions I've come up with in my research. When neomutt is opened in kitty, the process name of the kitty that opened neomutt will be "kitty", so it cannot be distinguished from the terminal where you are coding. Therefore, cgroups are used to distinguish between them. (neomutt is an example of a CUI app.)

2. I can't understand the details of the last line of cglaunch when I look at the man page (cgkill as well).

I read the systemd-run man page, but due to my limited knowledge of systemd, I could not understand the options. --user and --unit are somewhat understandable. I'm also guessing that this line is probably what starts the app separately in a cgroup.

maximbaz commented 2 years ago

Hello 🙂 Glad you find this useful!

To distinguish neomutt from any other kitty, we actually use kitty --class neomutt neomutt, kitty knows to use the value of --class as wayland's app_id, which can then be matched to apply modifications such as icon, make floating, resize, etc - like this:

https://github.com/maximbaz/dotfiles/blob/d5b49d631774e875619dfe0b20b8a64be01a2996/.config/sway/config#L54-L57

cgroups have a different purpose - it's to group processes (😜). Have you ever noticed that when you launch chromium, many processes are started, and then if you look in top it's not even clear which process is running because of what, who launched it and why... Furthermore, suppose chromium got stuck and you want to kill it fully, you can kill some of its processes and the window will disappear, but are you sure that you killed all of it, or is there still some useless leftover running?

It was this which got me interested, when things are neatly organized in cgroups, you can use systemd-cgls to look into the process tree and clearly see which process belongs to what, and also you can use cgkill to just kill say chromium, fully.

With terminal in particular, there is an "extra" feature - suppose you open terminal, and launch some things in background, when you close terminal window you want all child processes to be closed too - this is what cglaunch --term achieves, cgroup will be closed when the main process (terminal) is closed.

In practice, cgroups have many more features which I didn't explore myself yet, you can for example limit resources, to prioritize one app and deprioritize another, or prevent an app from using too much RAM, or things like this...

The last line of cglaunch indeed starts a new cgroup, as you correctly guessed. --slice is parent, run systemd-cgls and it will become clear I think. --quiet just removes some unnecessary output, --no-block starts the group a bit faster as we don't wait for confirmations, and --unit gives a name to the group, which makes it easier to search for, and also to sort by time (so you can find the group which was launched most recently).
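To see how those options fit together, here is a dry-run sketch that only prints a systemd-run command line instead of executing it. It is not the actual cglaunch script: the function name, the "launch-<name>-<timestamp>" unit naming and the choice of app.slice are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: print (not run) a systemd-run invocation built from the options
# discussed above. Naming scheme and slice are assumptions.
cglaunch_dry_run() {
    name=$(basename "$1")
    unit="launch-$name-$(date +%s)"   # timestamp makes units sortable by launch time
    echo systemd-run --user --slice=app.slice --unit="$unit" --quiet --no-block -- "$@"
}

preview=$(cglaunch_dry_run /usr/bin/chromium --incognito)
echo "$preview"
```

Running the printed command for real and then looking at systemd-cgls should show the new unit as a child of the chosen slice.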

wldash, a new app launcher that replaces wofi for me, can be taught to use cglaunch, which means that all apps are launched in cgroups, very neat 🙂

Hope that clarifies some things? Ask if I missed something!

By the way, cglaunch and cgkill are based on WhyNotHugo's dotfiles: https://git.sr.ht/~whynothugo/dotfiles/tree/main/item/home/.local/bin

1m-N00b commented 2 years ago

Thank you for the quick reply and for the concrete example and explanation.

Furthermore, suppose chromium got stuck and you want to kill it fully, you can kill some of its processes and window will disappear, but are you sure that you killed all of it, or there is still some useless leftover running?

Does this mean that kill or killall may not be able to completely erase the process?

With terminal in particular, there is an "extra" feature - suppose you open terminal, and launch some things in background, when you close terminal window you want all child processes to be closed too - this is what cglaunch --term achieves, cgroup will be closed when the main process (terminal) is closed.

Does this mean that if I run waybar in a terminal(kitty) started with cglaunch, it will continue to run behind the scenes even if I close the terminal with sway? Or does it mean that closing nvim will terminate all plugin processes running in the background?(Maybe nvim will close regardless of cgroup...)

maximbaz commented 2 years ago

Apologies, I forgot to reply on time!

Does this mean that kill or killall may not be able to completely erase the process?

kill and killall do kill the process by ID and name respectively, but not necessarily the children processes. I am not 100% certain on the theory behind, but I can give you at least some ideas for experiments you can perform to see what's going on...

Try to open kitty, and inside open htop. Now kill that kitty (either from another kitty using kill with its PID, or just using sway's shortcut for closing apps, Win+Q for me). Now you don't see htop, but try pgrep htop - and you'll see that it might be still running!

Another observable experiment is to play a video from a terminal (mpv file.mpv) and then close that terminal, to see whether the video continues to play or gets closed.

Also try using xdg-open <something>, and see what happens with the newly opened app after you close the terminal.

In some cases (I can't recall right away the precise example) I know that app was supposed to shutdown children processes (and usually did so correctly), but when that app got stuck and had to be killed manually, then all the children processes kept running.

So cgroups is like a guard in this case, you can kill the group, and all processes within the group will be killed. Try to repeat the experiment with htop above, but launching kitty using cglaunch --term - now htop will in fact be closed, when kitty was closed using Win+Q.
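The underlying effect can be reproduced without a terminal emulator. In this sketch an inner sh plays the role of the parent and "sleep 60" plays the role of htop; it does not model the hangup signal a real terminal sends when its window closes, so it only shows that killing a parent does not by itself kill its children.

```shell
#!/bin/sh
# Sketch: kill a parent process and check whether its background child survives.
pidfile=$(mktemp)

# Parent: start a child in the background, record the child's PID, then wait.
sh -c 'sleep 60 >/dev/null 2>&1 & echo $! > "$1"; wait' parent "$pidfile" &
parent=$!

# Wait until the child's PID has been written.
while [ ! -s "$pidfile" ]; do sleep 1; done
child=$(cat "$pidfile")

kill -KILL "$parent"                 # kill only the parent
wait "$parent" 2>/dev/null || true   # reap it; a non-zero status is expected

if kill -0 "$child" 2>/dev/null; then
    child_survived=yes
else
    child_survived=no
fi
echo "child survived parent: $child_survived"

kill "$child" 2>/dev/null || true    # clean up the leftover child
rm -f "$pidfile"
```

Killing a whole cgroup avoids this manual cleanup step, because every process in the group receives the signal regardless of who its parent is.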

Another useful thing to note: all systemd services do in fact create their own cgroup. Assuming you run waybar as a systemd service like me, run statusu waybar (i.e. systemctl --user status waybar) and observe the output:

     CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/waybar.service
             ├─   2161 /usr/bin/waybar
             ├─   2181 /bin/sh /home/maximbaz/.local/bin//waybar-usbguard
             ├─   2185 /bin/sh /home/maximbaz/.local/bin//waybar-eyes
.....

In the past I didn't run waybar as systemd, and so if for some reason waybar was closed (killed, or if I exited sway), then those children processes would still be running. At some point I implemented some custom bash code to ensure that they would be properly closed. But with systemd / cgroups all of that is not necessary, as soon as systemd service is stopped, the entire cgroup is killed, and so all those children processes are cleaned up.

Does this mean that if I run waybar in a terminal(kitty) started with cglaunch, it will continue to run behind the scenes even if I close the terminal with sway?

Now you can see it's the opposite, if you open kitty with cglaunch --term and launch anything there, as soon as kitty is closed, everything that you launched will be closed too.

Granted, if you want to achieve the opposite (process that would survive closing the terminal), I guess that's not currently possible 😅

Or does it mean that closing nvim will terminate all plugin processes running in the background?(Maybe nvim will close regardless of cgroup...)

Let's assume you started cglaunch --term and then ran nvim, and then exited nvim. I don't know if nvim cleans up its own plugins, so maybe nvim closes them by itself. But lets assume they are still running. Then when you finally close the terminal, all of them will be killed.

Hope that makes sense? And happy holidays 😉

1m-N00b commented 2 years ago

I'm sorry for the delay in replying. I followed your example, and when I closed it with sway, the process remained. However, with kill and killall, htop was closed and kitty remained. Should I have killed the parent process, kitty, instead?

I could generally understand what I wanted to achieve and the benefit of the advantages. Thank you for the explanation, which is easy to understand even for a noob like me.

maximbaz commented 2 years ago

Yes, I think you tried "killall htop" but should have tried "killall kitty" and then check if htop is still running.

All in all, yes just try to observe the behavior in daily usage and see if you like it, at this point this script was added mostly for fun and to experiment with cgroups, there is little "real" advantage I would say...

1m-N00b commented 2 years ago

I see. One last question came to mind. Does killing with sway erase the cglaunched app? Or should I run cgkill with kitty when the app gets heavy, as in the example above?

The explanation may not be well translated into English. Why isn't sway's kill a cgkill, even though launching is all done through cglaunch?

maximbaz commented 2 years ago

Good question - don't think of them as the opposites, if everything goes well, I don't think we will ever have to resort to cgkill. At least that's my current understanding... sway's kill command sends a "graceful shutdown" signal to the visible window, which is likely the main process, which is likely to properly close all child processes, if any. cgkill sends the signal to all processes at once - who knows, maybe this will disrupt some state synchronization or saving mechanism or something...

So to answer "Does killing with sway erase the cglaunched app?" is this:

- With well-behaved GUI programs, I suspect the answer is yes, e.g. chromium knows to close its child processes.
- waybar for some reason didn't close plugins (our custom shell script). I don't know if that was a bug, but in any case we solved that by running waybar in a systemd service.
- I don't know of any other app that leaves some processes hanging after being closed.
- Terminal is a special case, that's why we have cglaunch --term, to enforce the answer "yes" to your question.
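To make the "signal every process at once" idea concrete, here is a rough sketch using plain systemd commands. These helpers are hypothetical and are not the actual cglaunch/cgkill scripts; they only assume that each app is started in its own transient systemd scope (one cgroup per app):

```sh
# Hypothetical helpers, NOT the real cglaunch/cgkill scripts from the dotfiles.

# Start a command inside its own transient user scope named <unit>.scope.
launch_in_scope() {
    unit="$1"; shift
    systemd-run --user --scope --unit="$unit" -- "$@"
}

# Send SIGTERM (the systemctl default) to every process in that scope at once.
kill_scope() {
    systemctl --user kill "$1.scope"
}

# Show the scope and the processes that are still inside it.
scope_status() {
    systemctl --user status "$1.scope"
}

# Usage:
#   launch_in_scope demo-term kitty
#   scope_status demo-term
#   kill_scope demo-term
```

Closing the window normally asks only the main process to exit, while `kill_scope` reaches every process in the cgroup, including children that would otherwise be left behind.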

I'm sure that as time goes on we will learn more and the answer will be adjusted 😁

1m-N00b commented 2 years ago

Well, that generally cleared up my doubts. I'm going to try to introduce it myself and understand it systematically. Thank you very much 👍

1m-N00b commented 2 years ago

Sorry for the repeated questions. I'm inexperienced and it takes me a long time to solve my own problems, so I always try things in a virtual environment (virtualbox) before trying them in a real environment.

I've installed it based on what I learned earlier and in this issue. I'm getting a segfault error at getty. At this point, I can't get past getty, so I press win+enter and get an error message.

So I ran journalctl on getty and sway to check the cause of the error, and it seems that sway is segfaulting. I also tried to capture output using the troubleshooting steps recommended by sway, but it crashed and I couldn't export it.

I have a feeling that it's probably not dotfiles causing the problem, but the sway side. I would appreciate any advice you can give me. I will also report to sway if the advice does not solve the problem.

maximbaz commented 2 years ago

Hmm that is a bit difficult to guide... Could you clarify, it doesn't happen in virtualbox, or the opposite, it did happen in virtualbox?

Maybe remove ~/.zshenv and ~/.zshrc and relogin in a fresh TTY and just manually try to execute $ sway, to eliminate as much "magic" as possible, this will start sway in a clean zsh environment...
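A small sketch of that step. As a variation on "remove", it moves the files aside so they can be restored later; the function name `stash_zsh_config` and the `.bak` suffix are made up for illustration:

```sh
# Hypothetical helper: move the zsh startup files aside instead of deleting them.
stash_zsh_config() {
    for f in "$HOME/.zshenv" "$HOME/.zshrc"; do
        if [ -e "$f" ]; then
            mv "$f" "$f.bak"
        fi
    done
}

# Usage:
#   stash_zsh_config
# Then log in on a fresh TTY and start sway with debug logging:
#   sway -d 2> ~/sway.log
```

The `-d` flag enables sway's debug logging, so the log file may still contain useful lines even if the compositor crashes right away.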

If it still crashes, try coredumpctl tool to see if you can extract any relevant data...

If it is happening in virtualbox, I think there is a whole different level of complexity, maybe with some graphics virtualized hardware or drivers or whatnot...

By the way (maybe as another exercise), I recommend you looking into KVM and specifically virt-manager, I also used virtualbox, but I found virt-manager is so much more pleasant!

1m-N00b commented 2 years ago

I'm sorry I didn't communicate in English correctly. Thank you for the advice.

Hmm that is a bit difficult to guide... Could you clarify, it doesn't happen in virtualbox, or the opposite, it did happen in virtualbox?

Yes, it happened on virtualbox. My laptop (installed back when I first asked the question) has no problem.

Maybe remove ~/.zshenv and ~/.zshrc and relogin in a fresh TTY and just manually try to execute $ sway, to eliminate as much "magic" as possible, this will start sway in a clean zsh environment...

When I tried it, it crashed (froze) after running % sway.

If it still crashes, try coredumpctl tool to see if you can extract any relevant data...

imgur:error

If it is happening in virtualbox, I think there is a whole different level of complexity, maybe with some graphics virtualized hardware or drivers or whatnot... By the way (maybe as another exercise), I recommend you looking into KVM and specifically virt-manager, I also used virtualbox, but I found virt-manager is so much more pleasant!

I'll try it later

maximbaz commented 2 years ago

There are subcommands to explore the dump (coredumpctl info or coredumpctl dump), but in any case I place my bet on the issue being specific to your VM: bad config, not enough GPU memory, missing drivers or something like that... Try a VM other than virtualbox just out of curiosity, but otherwise I'm not sure what to do, except to try on real hardware 😅
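For reference, a sketch of how those subcommands could be combined. The wrapper function `inspect_last_crash` and the output file name are made up for illustration; the `coredumpctl` subcommands and options themselves are standard:

```sh
# Hypothetical wrapper around coredumpctl for a crashed program (default: sway).
inspect_last_crash() {
    name="${1:-sway}"
    # List all recorded core dumps for this program name.
    coredumpctl list "$name"
    # Show details of the most recent one (includes a stack trace when available).
    coredumpctl -1 info "$name"
    # Write the most recent core file to disk for later analysis.
    coredumpctl -1 dump "$name" --output="$name.core"
}

# Usage:
#   inspect_last_crash sway
```

The stack trace in the `info` output is usually the most useful part to attach to a bug report.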

1m-N00b commented 2 years ago

Thank you for your advice. I'll look into it and try another vm.

1m-N00b commented 2 years ago

Related to #48. Excuse my repeated questions. I don't understand the details of zsh (z4h) very well, so I apologize if my question is off the mark.

I can't find the functions term-title and direnv in the fn folder of z4h. Are they set-term-title and direnv-hook or -init, which have similar names? There was a brief mention of term-title in note.md, but I couldn't understand it because there was no explanation.

Q1: Are these functions defined in private-zshrc instead of z4h?
Q2: What do they do?

maximbaz commented 2 years ago

Private zshrc is essentially empty for me, these functions come from z4h - I haven't studied the z4h code closely enough to locate them, but I would suppose you are right: it's fn/-z4h-direnv-hook and -z4h-set-term-title-preexec in fn/-z4h-init-zle.

term-title sets the title of the terminal to the $PWD or the currently running command.
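The general mechanism can be sketched as follows. This is illustrative only and is not the z4h implementation; the function names are made up. It relies on the standard xterm escape sequence for setting the window title and on zsh's `precmd`/`preexec` hooks:

```sh
# Illustrative only; this is not how z4h implements term-title.

# Emit the xterm "set window title" escape sequence (OSC 0 ... BEL).
set_term_title() {
    printf '\033]0;%s\007' "$1"
}

# Before each prompt: show the current directory.
title_precmd() { set_term_title "$PWD"; }

# Before each command runs: zsh passes the typed command line as $1.
title_preexec() { set_term_title "$1"; }

# Register the hooks only when running under zsh.
if [ -n "${ZSH_VERSION:-}" ]; then
    autoload -Uz add-zsh-hook
    add-zsh-hook precmd title_precmd
    add-zsh-hook preexec title_preexec
fi
```

With something like this in place, the title shows the directory while the prompt is idle and switches to the command (for example htop) while it runs.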

direnv is an integration with https://github.com/direnv/direnv/, which basically allows you to have a .zshrc for a specific subfolder that is loaded and unloaded when you enter and leave the folder - useful for configuring individual projects via env vars.
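As a small illustration, a project folder could contain an `.envrc` file like the one below. The variable names and values are made up. Note that direnv exports environment variables only (not aliases or shell functions), and each `.envrc` has to be approved once with `direnv allow` before it is loaded:

```sh
# .envrc (hypothetical example, placed in the project folder)
export APP_ENV=development
export API_URL=http://localhost:8080
```

After running `direnv allow` in that folder, `echo $APP_ENV` should print `development` inside the folder and nothing after leaving it. Outside of z4h's integration, the hook is normally enabled with `eval "$(direnv hook zsh)"` in the zsh config.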

1m-N00b commented 2 years ago

Thank you for your kind response to my repeated questions. I asked because when I asked romkatv before, I was told that there is no such function. I will ask romkatv again!

1m-N00b commented 2 years ago

It seems that the libinput-gestures dependency cannot be resolved from the maximbaz repo. I think the solution is to download the PKGBUILD from AUR, build it yourself first and repo-add it. Is that correct? I also posted this in #48, but that was an inappropriate place and you may have missed it, so I'm sharing the information here as well. error

maximbaz commented 2 years ago

Sorry for the trouble. The reason is that I removed libinput-gestures locally but didn't properly clean up the repo; a background cleanup task then executed a partial cleanup and upload, so you got the repo in an inconsistent state. I'll get it properly deleted as soon as I get a chance, so you can build it from AUR yourself!
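Building from AUR and adding the result to a local repo would look roughly like this. It is a sketch under assumptions: the function name and the repo database path are hypothetical, and it assumes makepkg's default `.pkg.tar.zst` package extension:

```sh
# Hypothetical helper: build an AUR package and add it to a local pacman repo.
build_and_add() {
    pkg="$1"       # AUR package name, e.g. libinput-gestures
    repo_db="$2"   # path to your own repo database (hypothetical)
    # Fetch the PKGBUILD, build the package, then register it in the repo.
    git clone "https://aur.archlinux.org/${pkg}.git" &&
        ( cd "$pkg" && makepkg --syncdeps --clean ) &&
        repo-add "$repo_db" "$pkg"/*.pkg.tar.zst
}

# Usage (the database path is made up):
#   build_and_add libinput-gestures /path/to/myrepo.db.tar.gz
```

It is worth reading the PKGBUILD before running makepkg, since AUR packages are user-submitted.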

1m-N00b commented 2 years ago

Sorry to put you to the trouble of doing this. Good luck.

maximbaz commented 2 years ago

no no, totally my bad, I know I should start breaking my repo less 🙈

maximbaz / dotfiles

Questions about the setup #39