MichaIng / DietPi

Lightweight justice for your single-board computer!
https://dietpi.com/
GNU General Public License v2.0
4.89k stars 498 forks

VM's | getcwd errors, due to /tmp/$G_PROGRAM_NAME removals #2237

Closed Kreeblah closed 5 years ago

Kreeblah commented 6 years ago

Creating a bug report/issue:

Required Information:

Additional Information (if applicable):

Steps to reproduce:

  1. Install a fresh copy of DietPi from the VMware image, let it upgrade to the current release, and then reboot.
  2. Set a static IP address in the DietPi network settings
  3. Install Pi-hole from the DietPi optimized software list

Expected behaviour:

Actual behaviour:

Extra details:

Fourdee commented 6 years ago

@Kreeblah

Many thanks for the report 👍

There were a ton of getcwd() and file not found errors during the installation

Interesting, ok. Appears to be a failed installation at basic file level.

I'll try to replicate

Testing:



Fourdee commented 6 years ago

Beta: image

root@DietPi:~# G_PROGRAM_NAME=test G_INIT
root@DietPi:/tmp/test# 
Fourdee commented 6 years ago
shell-init: error retrieving current directory: getcwd:

Using a fresh image, that wasn't used for previous dev testing: 🈯️ Fresh VB image with 2.1GB RAM 🈴 failed on 2nd test

dietpi-software install 93
#fail

🈴 Fresh VB image with 1GB RAM

tmpfs /tmp tmpfs defaults,size=1023M,noatime,nodev,nosuid,mode=1777 0 0
root@DietPi:~# free -m
              total        used        free      shared  buff/cache   available
Mem:            996          23         915           5          57         871
Swap:          1051           0        1051
root@DietPi:~# ls -lha /tmp
total 4.0K
drwxrwxrwt  7 root root  140 Nov 12 17:53 .
drwxr-xr-x 22 root root 4.0K Sep 20 13:08 ..
drwxrwxrwt  2 root root   40 Nov 12 17:52 .font-unix
drwxrwxrwt  2 root root   40 Nov 12 17:52 .ICE-unix
drwxrwxrwt  2 root root   40 Nov 12 17:52 .Test-unix
drwxrwxrwt  2 root root   40 Nov 12 17:52 .X11-unix
drwxrwxrwt  2 root root   40 Nov 12 17:52 .XIM-unix
root@DietPi:~# umount /tmp
root@DietPi:~# ls -lha /tmp
total 40K
drwxrwxrwt  9 root root 4.0K Sep 20 13:04 .
drwxr-xr-x 22 root root 4.0K Sep 20 13:08 ..
drwxr-xr-x  2 root root 4.0K Aug 16 15:32 DietPi-Drive_Manager
drwxr-xr-x  2 root root 4.0K Aug 16 15:31 DietPi-PREP
drwxrwxrwt  2 root root 4.0K Aug 16 15:27 .font-unix
-rw-r--r--  1 root root  579 Aug 16 15:32 G_ERROR_HANDLER_COMMAND
drwxrwxrwt  2 root root 4.0K Aug 16 15:27 .ICE-unix
drwxrwxrwt  2 root root 4.0K Aug 16 15:27 .Test-unix
drwxrwxrwt  2 root root 4.0K Aug 16 15:27 .X11-unix
drwxrwxrwt  2 root root 4.0K Aug 16 15:27 .XIM-unix
root@DietPi:~# G_DEBUG=1 dietpi-software install 93
[  OK  ] DietPi-Software | Root access verified.
[  OK  ] DietPi-Software | RootFS R/W access verified.

 DietPi-Software
─────────────────────────────────────────────────────
 Mode: Running G_INIT()

[ INFO ] DietPi-Software | Entered scripts working directory: /tmp/DietPi-Software
[  OK  ] DietPi-Software | Initialized database
[ .... ] DietPi-Software | Reading database, please wait.../tmp/DietPi-Software
total 8.0K
drwxr-xr-x 2 root root 4.0K Nov 12 18:18 .
drwxrwxrwt 8 root root 4.0K Nov 12 18:18 ..
[ INFO ] DietPi-Software | Navigated to /tmp
[ INFO ] DietPi-Software | Removed scripts working directory: /tmp/DietPi-Software
[  OK  ] DietPi-Software | Reading database completed

 DietPi-Software
─────────────────────────────────────────────────────
 Mode: Automated install

[  OK  ] DietPi-Software | Installing Pi-hole: block adverts for any device on your network
[  OK  ] DietPi-Software | Free space check: path=/ | available=6899 MB | required=500 MB
[  OK  ] DietPi-Software | DietPi-Userdata validation: /mnt/dietpi_userdata
[  OK  ] DietPi-Software | Connection test: https://deb.debian.org/debian/
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
[ SUB1 ] DietPi-Run_ntpd > Running G_INIT()
chdir: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
[ INFO ] DietPi-Run_ntpd | Entered scripts working directory: /tmp/DietPi-Run_ntpd
[  OK  ] NTPD: time sync | Completed
/tmp/DietPi-Run_ntpd
total 8.0K
drwxr-xr-x 2 root root 4.0K Nov 12 18:18 .
drwxrwxrwt 8 root root 4.0K Nov 12 18:18 ..
[ INFO ] DietPi-Run_ntpd | Navigated to /tmp
[ INFO ] DietPi-Run_ntpd | Removed scripts working directory: /tmp/DietPi-Run_ntpd
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
[ SUB1 ] DietPi-Services > Running G_INIT()
chdir: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
[ INFO ] DietPi-Services | Entered scripts working directory: /tmp/DietPi-Services
[ SUB1 ] DietPi-Services > unmask
[  OK  ] DietPi-Services | unmask all: cron
/tmp/DietPi-Services
total 8.0K
drwxr-xr-x 2 root root 4.0K Nov 12 18:18 .
drwxrwxrwt 8 root root 4.0K Nov 12 18:18 ..
[ INFO ] DietPi-Services | Navigated to /tmp
[ INFO ] DietPi-Services | Removed scripts working directory: /tmp/DietPi-Services
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
[ SUB1 ] DietPi-Services > Running G_INIT()
chdir: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
[ INFO ] DietPi-Services | Entered scripts working directory: /tmp/DietPi-Services
[ SUB1 ] DietPi-Services > stop
[  OK  ] DietPi-Services | stop : cron
/tmp/DietPi-Services
total 8.0K
drwxr-xr-x 2 root root 4.0K Nov 12 18:18 .
drwxrwxrwt 8 root root 4.0K Nov 12 18:18 ..
[ INFO ] DietPi-Services | Navigated to /tmp
[ INFO ] DietPi-Services | Removed scripts working directory: /tmp/DietPi-Services
/DietPi/dietpi/dietpi-software: line 14354: cd: /tmp/DietPi-Software: No such file or directory

 DietPi-Software
─────────────────────────────────────────────────────
 Mode: Update & upgrade APT

^C
Fourdee commented 6 years ago

Bug with current VB image. Will cause update to also fail. 🈯️

[ .... ] DietPi-Software | Reading database, please wait.../tmp/DietPi-Software
total 8.0K
drwxr-xr-x 2 root root 4.0K Nov 12 18:18 .
drwxrwxrwt 8 root root 4.0K Nov 12 18:18 ..
[ INFO ] DietPi-Software | Navigated to /tmp
[ INFO ] DietPi-Software | Removed scripts working directory: /tmp/DietPi-Software

####
G_INIT_ALLOW_CONCURRENT=1 dietpi-software install 93

Image needs redoing.

Will also check VMware image:

Fourdee commented 6 years ago
[ INFO ] DietPi-Software | Entered scripts working directory: /tmp/DietPi-Software
[  OK  ] DietPi-Software | Initialized database
[ .... ] DietPi-Software | Reading database, please wait... <<<<< the cause of G_EXIT trigger

G_EXIT for DietPi-Software
PWD=/tmp/DietPi-Software
[  OK  ] DietPi-Software | Reading database completed
[ INFO ] DietPi-Software | Navigated to /tmp

Triggers G_EXIT:

G_DIETPI-NOTIFY -2 'Reading database'
G_DIETPI-NOTIFY 0 'Reading database' # << triggers the G_EXIT

🈯️ Setting sync at start of G_DIETPI-NOTIFY resolves issue...


root@DietPi:~# G_DEBUG=1 dietpi-software install 93 | grep EXIT
/tmp/dietpi-process.pid PID KILL (EXIT)
/tmp/dietpi-process.pid PID KILL (EXIT)
DietPi-Software Running G_EXIT()
/tmp/dietpi-process.pid PID KILL (EXIT)
DietPi-Software Running G_EXIT()

🈯️ sync before checking for /tmp/dietpi-process.pid

root@DietPi:~# G_DEBUG=1 dietpi-software install 93 | grep EXIT
/tmp/dietpi-process.pid PID KILL (EXIT)
[.     ] /tmp/dietpi-process.pid PID KILL (EXIT)
DietPi-Software Running G_EXIT()

🈯️ sync

                set -o noclobber
                if { > /tmp/dietpi-process.pid; } &> /dev/null; then

                    set +o noclobber
                    { Print_Process_Animation & echo $! > /tmp/dietpi-process.pid; disown; } 2> /dev/null
                    echo -e "$G_PROGRAM_NAME | $! > /tmp/dietpi-process.pid (EXIT)"
                    sync
Fourdee commented 6 years ago

Left = before fix Right = after fix


@MichaIng Reason why this only occurs on VM? Unsure, cache/disk/delay issue with VM's under noclobber modes?

Another solution is to precreate the blank file, beforehand, then wait for value to be valid:

                set -o noclobber
                > /tmp/dietpi-process.pid
                if { > /tmp/dietpi-process.pid; } &> /dev/null; then

                    set +o noclobber
                    { Print_Process_Animation & echo $! > /tmp/dietpi-process.pid; disown; } 2> /dev/null
                    #sync

                else

                    rm /tmp/dietpi-process.pid
                    set +o noclobber

                fi
        Clean_Process_Animation(){

            while [[ -f /tmp/dietpi-process.pid ]]
            do

                local pid=$(</tmp/dietpi-process.pid)

                if [[ -t 0 ]] && [[ $pid ]]; then

                    kill $pid &> /dev/null
                    rm /tmp/dietpi-process.pid &> /dev/null
                    # In case, the output took more than one line, clean from cursor (animation position) until end of terminal.
                    tput ed
                    break

                fi

                echo -e "G_EXIT sleeping, file but no value"
                sleep 1

            done

            output_string+='\r\e[K'

        }
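
The `noclobber` idea used in the snippets above can be tried in isolation. The following is only a minimal sketch, not DietPi's actual code: `acquire_pid_file` and the temporary path are hypothetical names, and the subshell is there so that `noclobber` does not leak into the calling shell.

```shell
#!/bin/bash
# Hypothetical stand-in for /tmp/dietpi-process.pid, placed in a private temp dir
lock_dir=$(mktemp -d)
pid_file="$lock_dir/demo-process.pid"

# Hypothetical helper: create the PID file only if it does not exist yet.
# Under noclobber, '>' refuses to overwrite an existing regular file,
# so the redirection itself answers "is an animation already registered?".
acquire_pid_file() {
    ( set -o noclobber; echo "$1" > "$pid_file" ) 2> /dev/null
}

if acquire_pid_file 1111; then first=acquired; else first=refused; fi
if acquire_pid_file 2222; then second=acquired; else second=refused; fi

stored=$(<"$pid_file")
echo "first attempt: $first, second attempt: $second, stored PID: $stored"

rm -r "$lock_dir"
```

The second attempt is refused because the file already exists, which is the same test the `if { > /tmp/dietpi-process.pid; }` line relies on.
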
Fourdee commented 6 years ago

Needs more testing. sync may resolve it, but so does sleep 0.1. sync may simply create enough time for bash/PIDs/the system to catch up.

MichaIng commented 6 years ago

@Fourdee Strange, I don't understand the reason yet.

G_INIT_ALLOW_CONCURRENT=1 dietpi-software install 93

This was just a test, right? The option is very dangerous and should never be used in production or within release code. If ever, then better implement an option that skips /tmp/$G_PROGRAM_NAME creation as well: G_INIT_NO_TMPDIR or something like this, for scripts that do not need to create any tmp files? This could also be realized via an argument, e.g. G_INIT 0 (no /tmp/dir). But this breaks the concurrency check, which in my opinion should always be done. Unpredictable issues might occur, depending on the script, when it is run twice concurrently...

root@DietPi:~# umount /tmp
root@DietPi:~# ls -lha /tmp

We should clean /tmp before mounting tmpfs there during PREP or whenever done within our scripts 😉.

And just to be sure, this issue does not only occur on first run install then, right?

About file system sync:

But now, if somehow it's true and /tmp/dietpi-process.pid is not created in file system but kept in cache:

The problem with using a PID variable is that it is only available in current shell (and afterwards opened sub shells, if exported), but no chance to have it available (or change it) in parent script/shell. So the only chance to kill the process animation would be to actively kill it from the very same shell/script.
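
The limitation described above can be shown with a small sketch. It is illustrative only: the temporary file stands in for /tmp/dietpi-process.pid, `anim_pid` is a hypothetical variable name, and `sleep 5` plays the role of the animation.

```shell
#!/bin/bash
unset anim_pid
pid_file=$(mktemp)   # hypothetical stand-in for /tmp/dietpi-process.pid

# A child shell starts a background job and records its PID twice:
# once in a variable, once in the file passed as $1.
bash -c 'sleep 5 > /dev/null 2>&1 & anim_pid=$!; echo "$anim_pid" > "$1"' _ "$pid_file"

# The variable never reaches the parent shell...
echo "variable in parent: ${anim_pid:-<unset>}"

# ...but the file does, so the parent can still terminate the job.
file_pid=$(<"$pid_file")
if kill "$file_pid" 2> /dev/null; then killed=yes; else killed=no; fi
echo "PID from file: $file_pid, terminated from parent: $killed"

rm -f "$pid_file"
```
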

Also, if async writes really happen on tmpfs as well, we should probably mount it with the sync option to override? Although (see above) I just found it has 4k block size. Hmm, the defaults should have some reason, so better not mess with this.

Fourdee commented 6 years ago

@MichaIng

This was just a test, right?

Yep, all above was debug testing, trying to find cause of the unexpected G_EXIT call.

We should clean /tmp before mounting tmpfs there during PREP or whenever done within our scripts

yep 👍

And just to be sure, this issue does not only occur on first run install then, right?

Occurs after 1st run installation, and during boot (briefly on VM): image

Add program_name to G_INIT image


~Following disables errors, checking that G_PROGRAM_NAME exists.~ Worked once...

if [[ $G_PROGRAM_NAME && -d /tmp/$G_PROGRAM_NAME ]]; then
Fourdee commented 6 years ago

@MichaIng

🈯️ Fixed the boot issue, use the binary instead of variable $PWD to get current directory:

image


So either:

Fourdee commented 6 years ago

Regarding traps:

http://redsymbol.net/articles/bash-exit-traps/ If some error causes the script to exit prematurely, ~the scratch directory and its contents don't get deleted.~ This is a resource leak?
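
For reference, the scratch-directory pattern from that article looks roughly like the sketch below. This is an assumption-laden illustration rather than DietPi code: `scratch` is a hypothetical name, and the "script" runs in a subshell so its EXIT trap fires when the subshell ends.

```shell
#!/bin/bash
scratch=$(mktemp -d)   # hypothetical scratch directory

status=0
(
    # The trap runs on any exit of this subshell, including premature ones
    trap 'rm -rf "$scratch"' EXIT
    touch "$scratch/partial-download"
    false || exit 1    # simulated failure: the subshell exits early
    echo "never reached"
) || status=$?

if [[ -d $scratch ]]; then leaked=yes; else leaked=no; fi
echo "exit status: $status, scratch dir leaked: $leaked"
```

With the trap in place the directory is removed even though the body exits early; without it, the directory would stay behind.
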

Hmm, unsure, but dietpi-software issues. Disabling this resolves.

G_DIETPI-NOTIFY -2 'Reading database'
Fourdee commented 6 years ago

@MichaIng

🈯️ Think i've fixed it:

Would indicate the process is not terminated fast enough using SIGTERM


SIGTERM: But all this does is trigger the while loop and sleep every time, allowing time for it to complete.

                while kill -15 $(</tmp/dietpi-process.pid) &> /dev/null
                do

                    echo -e "waiting for process to terminate"
                    sleep 0.1

                done
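
When the animation is a child of the same shell that terminates it, an alternative to polling with `kill` in a loop is to send the signal once and then `wait` for that PID, which blocks until the process has really ended. This is only a sketch under that assumption: `sleep 30` stands in for `Print_Process_Animation`, and the approach does not apply when the PID file was written by a different shell, since `wait` only works on the caller's own children.

```shell
#!/bin/bash
# Stand-in for the background animation (hypothetical; the real one is Print_Process_Animation)
sleep 30 &
anim_pid=$!

kill -15 "$anim_pid"    # request termination (SIGTERM)

# Block until the child has actually exited; for a signalled child,
# 'wait' reports 128 + the signal number
status=0
wait "$anim_pid" 2> /dev/null || status=$?

if kill -0 "$anim_pid" 2> /dev/null; then still_running=yes; else still_running=no; fi
echo "wait status: $status, still running: $still_running"
```
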
MichaIng commented 6 years ago

This is definitely the cause of the dietpi-software issues. Disabling this resolves.

I don't get the connection yet. The AMI instance in the guides example is our animation process? If the dietpi-software exit trap does not terminate it, it will run forever. Jep makes sense, for this reason currently the PS1 prompt command terminates the animation as last resort. However it makes sense to do this within the exit trap as well. PS1 prompt command will then only do this, if some G_* command from terminal calls animation and is cancelled or fails.

About $PWD:

Perhaps it is easiest/fastest to simply always cd /tmp before removing working dir and skip the test completely?

But now about the actual debug log:

 DietPi-Software
─────────────────────────────────────────────────────
 Mode: Running G_INIT()

[ INFO ] DietPi-Software | Entered scripts working directory: /tmp/DietPi-Software
[  OK  ] DietPi-Software | Initialized database
[ .... ] DietPi-Software | Reading database, please wait.../tmp/DietPi-Software
total 8.0K
drwxr-xr-x 2 root root 4.0K Nov 12 18:18 .
drwxrwxrwt 8 root root 4.0K Nov 12 18:18 ..
[ INFO ] DietPi-Software | Navigated to /tmp
[ INFO ] DietPi-Software | Removed scripts working directory: /tmp/DietPi-Software
[  OK  ] DietPi-Software | Reading database completed

Thinking now about this, I believe it is due to: /DietPi/dietpi/func/dietpi-set_dphys-swapfile remounting /tmp, which leads to content and cwd loss. This is called during preboot 1st run setup, after which the error appeared and during Pi-hole install, after which error was reported. 🈯️ Makes totally sense!

But how does pwd now fix the issue? Perhaps it also corrects the current dir, if non-existent, re-creates it or sets it to parent dir or $HOME or such?

Last question is still why the dietpi-software EXIT trap was actually called during database reading above. /tmp remount and wrong $PWD cannot be the issue since on 1st run setup, dietpi-software is definitely called long after preboot resets /tmp mount and database reading is done before Pi-hole install code remounts it. Since the script goes on, no EXIT call was done, obviously.

Puhh looks to me that all of these issues are not related to each other. ~The real issue, which is the cause for the TO and your replicated 1st boot/install error is definitely swapfile creation remounting /tmp, leading to content being cleared.~

Fourdee commented 6 years ago

@MichaIng

Thinking now about this, I believe it is due to: /DietPi/dietpi/func/dietpi-set_dphys-swapfile remounting /tmp, which leads to content and cwd loss. This is called during preboot 1st run setup, after which the error appeared and during Pi-hole install, after which error was reported. 🈯️ Makes totally sense!

The boot issue occurs at all times https://github.com/Fourdee/DietPi/issues/2237#issuecomment-438303603, even after 1st run setup. At that point, /tmp is mounted before all dietpi scripts.

            73ms dietpi-ramlog.service
            72ms dropbear.service
            63ms systemd-tmpfiles-setup-dev.service
            60ms systemd-remount-fs.service
            49ms systemd-timesyncd.service
            47ms systemd-sysctl.service
            45ms systemd-journal-flush.service
            40ms systemd-tmpfiles-setup.service
            38ms systemd-modules-load.service
            38ms dev-hugepages.mount
            37ms console-setup.service
            37ms systemd-update-utmp.service
            36ms dev-mqueue.mount
            35ms tmp.mount
            35ms sys-kernel-debug.mount
            32ms systemd-tmpfiles-clean.service
            28ms systemd-random-seed.service
            25ms ssh.service
            20ms var-log.mount
            20ms kmod-static-nodes.service
            12ms DietPi.mount
            12ms systemd-update-utmp-runlevel.service
            11ms systemd-user-sessions.service

systemd_boot

MichaIng commented 6 years ago

@Fourdee Jep, I also found that swap file creation, including mount -o remount,size=753M tmpfs /tmp, does not clear the /tmp dir. The current $PWD and even contained files are preserved. Seems the content backup is somehow done automatically 👍. So no issue with this!


$PWD and pwd both do not check/recognize that the directory does not exist:

root@VM-Stretch:/tmp/testdir# rm -R /tmp/testdir
root@VM-Stretch:/tmp/testdir# echo $PWD
/tmp/testdir
root@VM-Stretch:/tmp/testdir# pwd
/tmp/testdir
root@VM-Stretch:/tmp/testdir# l
total 0
root@VM-Stretch:/tmp/testdir# cd ..
root@VM-Stretch:/tmp# l
total 0
drwxrwxrwt 2 root root 40 Nov 13 18:35 .font-unix
drwxrwxrwt 2 root root 40 Nov 13 18:35 .ICE-unix
drwxrwxrwt 2 root root 40 Nov 13 18:35 .Test-unix
drwxrwxrwt 2 root root 40 Nov 13 18:35 .X11-unix
drwxrwxrwt 2 root root 40 Nov 13 18:35 .XIM-unix
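
Because `$PWD` and the `pwd` builtin only report the remembered path, a script that wants to notice a vanished working directory has to test the path itself. A minimal sketch, assuming a hypothetical helper `cwd_exists` and a throwaway directory from `mktemp`:

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the remembered working directory still exists
cwd_exists() { [[ -d $PWD ]]; }

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
rmdir "$demo_dir"          # remove the directory we are standing in

remembered=$PWD            # still the old path, as in the transcript above
if cwd_exists; then vanished=no; else vanished=yes; fi

cd /tmp                    # move to a directory that is known to exist
if cwd_exists; then recovered=yes; else recovered=no; fi

echo "remembered: $remembered, vanished: $vanished, recovered: $recovered"
```
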
Fourdee commented 6 years ago

@MichaIng

Tested exit traps up and down with/without disown, immediate and after a while terminating, from inside and outside the background job, within/outside { ... } &> ...

[ INFO ] DietPi-Software | Entered scripts working directory: /tmp/DietPi-Software
[.     ] /DietPi/dietpi/func/dietpi-globals: line 270:  2455 Terminated              Print_Process_Animation
[  OK  ] DietPi-Software | Initialized database
[ INFO ] DietPi-Software | Navigated to /tmp
[  OK  ] DietPi-Software | Reading database completed
[ INFO ] DietPi-Software | Removed scripts working directory: /tmp/DietPi-Software
/DietPi/dietpi/dietpi-software: line 1:  2461 Terminated              Print_Process_Animation
MichaIng commented 6 years ago

@Fourdee

Would indicate the process is not terminated fast enough using SIGTERM

But why is this an issue? Even if the kill command does its job in the background while the script goes on, this would at worst lead to a parallel animation until kill has finished. But this should not lead to any folder deletion or affect cwd?


Quick validation about tmpfs remount: http://man7.org/linux/man-pages/man5/tmpfs.5.html

   *  During a remount operation (mount -o remount), the filesystem size
      can be changed (without losing the existing contents of the
      filesystem).

👍

Fourdee commented 6 years ago

@MichaIng

But why is this an issue? Even if the kill command does its job in the background while the script goes on, this would at worst lead to a parallel animation until kill has finished. But this should not lead to any folder deletion or affect cwd?

Unsure at moment. Maybe SIGTERM is allowing a process/memory leak to occur in Print_Process_Animation when terminated.

Example is:

Quick validation about tmpfs remount:

Nice 👍

Fourdee commented 6 years ago

Interesting, this works: tput ed on exit

[[ -w /tmp/dietpi-process.pid ]] && echo -ne "\r$bracket_l${aprocess_string[i]}$bracket_r " || tput ed && return

Then simply rm /tmp/dietpi-process.pid in process clean.

So issue is leakage when terminating Print_Process_Animation using SIGTERM?


Breaks animation lol

putting tput ed back after Clean_Process_Animation resolved.

Simply not killing the process is the fix I believe. Preventing leakage and allowing graceful exit.

And this should be exit as we don't want to return any value/info?

[[ -w /tmp/dietpi-process.pid ]] && echo -ne "\r$bracket_l${aprocess_string[i]}$bracket_r " || exit 0

$PWD issue still occurs with the above.

Removing the code and simply using the following also causes the error, so cd /tmp is what is failing here?

                if cd /tmp; then

                    [[ $G_DEBUG == 1 ]] && G_DIETPI-NOTIFY 2 'Navigated to /tmp'

                else

                    [[ $G_DEBUG == 1 ]] && G_DIETPI-NOTIFY 2 "Failed to navigate out of /tmp/$G_PROGRAM_NAME"

                fi

                if (( ! $G_INIT_ALLOW_CONCURRENT )); then

                    if rm -R /tmp/$G_PROGRAM_NAME; then

                        [[ $G_DEBUG == 1 ]] && G_DIETPI-NOTIFY 2 "Removed scripts working directory: /tmp/$G_PROGRAM_NAME"

                    else

                        [[ $G_DEBUG == 1 ]] && G_DIETPI-NOTIFY 2 "Failed to remove scripts working directory: /tmp/$G_PROGRAM_NAME"

                    fi

                fi
Fourdee commented 6 years ago

@MichaIng

If you need VNC access for the boot issue, let me know i'll set it up?

MichaIng commented 6 years ago

@Fourdee But the above is already live code? Only tput ed added, but what is the influence of this? Ah, not required in Clean_Process_Animation then. But what I don't like about it:

I still don't get it 🤔. I did more research and more testing and am still unable to find any way to make a child termination call the parent exit trap... And even if something goes wrong with the PID, e.g. somehow the parent PID is saved to the PID file, then the parent script would exit as well, which it does not...

So even though we have some solutions with cd /tmp and $(pwd) that seem to solve the issue, I would like to understand how the dietpi-software exit trap can be called when the child background process is terminated, without dietpi-software actually exiting. So it receives an EXIT signal but goes on working?? I have a headache now 🤣.

Edit: Jep, since I can't replicate the issue at all on my Stretch VM (reset to use $PWD, re-enable fs resize service, echo -1 > .install_stage, reboot), VNC access would be good, so I can play around myself. Which client do you use/recommend for a Windows system? My first find was RealVNC viewer. The Windows internal remote desktop client does not work, right?

Fourdee commented 6 years ago

@MichaIng

TightVNC works well, only install the viewer. Setting it up now.

Fourdee commented 6 years ago

@MichaIng

82.7.94.230 same pw as webserver.

MichaIng commented 6 years ago

Okay long testing session:

Fourdee commented 6 years ago

@MichaIng

Thanks Micha, indeed a very long debugging session 😄 Thanks for your help 👍

Some notes my end:

dietpi-software install 93 still needs investigation. However process animation will be killed via kill -9 (SIGKILL vs SIGTERM), just in case and it's faster.

Removal of kill command and allowing bg process to terminate on its own when .pid file is removed, also worked.

Any Pipe with preboot script and multiple + threads in the script, was triggering the getcwd errors during boot.

MichaIng commented 6 years ago

Removal of kill command and allowing bg process to terminate on its own when .pid file is removed, also worked.

The only issue with this is the max 0.15 seconds delay between removal and termination:

This only works if the animation process checks the content of the PID file, verifying that it still holds its own PID: [[ $(</tmp/dietpi-process.pid) == $BASHPID ]]. But this decreases performance 🤔. In my opinion, active process termination before removing the PID file (thus allowing new animations) is the better deal then.
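
The self-check idea can be sketched as follows. It is a hedged illustration, not the DietPi implementation: `animation` and the temporary file are hypothetical, and in this demo the child writes its own `$BASHPID` to avoid a startup race, whereas in the snippets earlier in the thread the parent writes `$!`.

```shell
#!/bin/bash
pid_file=$(mktemp)   # hypothetical stand-in for /tmp/dietpi-process.pid (created empty)

# Hypothetical animation loop: keeps running only while the PID file names this very process
animation() {
    echo "$BASHPID" > "$pid_file"
    while [[ -f $pid_file && $(< "$pid_file") == "$BASHPID" ]]; do
        sleep 0.05   # one animation frame would be drawn here
    done 2> /dev/null
}

animation &
anim_pid=$!

# Wait until the child has registered itself, then read back what it wrote
until [[ -s $pid_file ]]; do sleep 0.05; done
recorded_pid=$(< "$pid_file")

rm "$pid_file"       # no kill: removing the file is the stop request
status=0
wait "$anim_pid" || status=$?
echo "child $anim_pid recorded $recorded_pid and exited with status $status"
```

The delay mentioned above shows up here as the `sleep 0.05` inside the loop: the child only notices the removal at its next check.
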

Any Pipe with preboot script and multiple + threads in the script, was triggering the getcwd errors during boot.

Still not sure, where exactly the getcwd error came from, but they did not appear after forced cd /tmp any more, right? At least the multiple bg processes were not related.

MichaIng commented 6 years ago

I made a start: https://github.com/Fourdee/DietPi/pull/2248

Fourdee commented 6 years ago

@MichaIng

Still not sure, where exactly the getcwd error came from, but they did not appear after forced cd /tmp any more, right? At least the multiple bg processes were not related.

Still occurring on my tests.

Believe you were right with a /tmp mount issue as following does not create log:

ExecStart=/bin/bash -c '/DietPi/dietpi/preboot &>> /tmp/dietpi-preboot.log'

🈯️ However, this does:

ExecStart=/bin/bash -c '/DietPi/dietpi/preboot &>> /root/dietpi-preboot.log'

~Interesting:~

~Something is playing with /tmp~


&>> /tmp/dietpi-preboot.log

root@DietPi:~# cat /var/tmp/dietpi/logs/dietpi-preboot.log
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
[  OK  ] Root access verified.
[  OK  ] Root access verified.
chdir: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
chdir: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory

Notice: CPU Governors are not available for VM.

[ SUB1 ] DietPi-LED_Control > Applying LED triggers
[  OK  ] DietPi-LED_Control | input0::capslock: kbd-capslock
[  OK  ] DietPi-LED_Control | input0::numlock: kbd-numlock
[  OK  ] DietPi-LED_Control | input0::scrolllock: kbd-scrolllock

Disable StandardOutput=tty and all threads in preboot, blob? image

Fourdee commented 6 years ago

@MichaIng

Ok, still unsure of cause. Although:

So for now, I believe we should roll out a workaround fix to disable threading, until more time is available to debug this further?


Also tried: G_THREAD_START, instead of &, same errors in output.

Fourdee commented 6 years ago

Redo images: Using dev branch, switch dietpi.txt and .version afterwards.

G_CONFIG_INJECT 'DEV_GITBRANCH=' 'DEV_GITBRANCH=master' /boot/dietpi.txt
G_CONFIG_INJECT 'G_GITBRANCH=' 'G_GITBRANCH=master' /boot/dietpi/.version
G_CONFIG_INJECT 'G_DIETPI_VERSION_RC=' 'G_DIETPI_VERSION_RC=20' /boot/dietpi/.version


~Stretch images done, uploaded to testing folder, however, updates to master branch (with the current issues) during 1st run due to RC lower version.~ Redone.

MichaIng commented 6 years ago

@Fourdee

shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
[  OK  ] Root access verified.
[  OK  ] Root access verified.
chdir: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
chdir: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory

Does it solve to actively do cd /tmp before initiating background jobs?

Fourdee commented 6 years ago

@MichaIng

Does it solve to actively do cd /tmp before initiating background jobs?

Nope, tried cd /tmp; command &, still occurs

MichaIng commented 6 years ago

Best would be to let the trap wait for those to finish: https://stackoverflow.com/a/356154

So test schedule:

Fourdee commented 6 years ago

Lowering priority level as both Stretch VM images have been updated with the fix for getcwd issues:

@Kreeblah

Please re-download required VM image in above link. Resolves the issue you experienced.

MichaIng commented 5 years ago

Okay went on with testing:

As mentioned above, after the bg jobs are initiated, preboot does not wait for them to finish before running the EXIT trap: cd /tmp and rm -R /tmp/DietPi-PreBoot. Most probably the bg job init (+ cd /tmp/$G_PROGRAM_NAME) and the EXIT trap steps are too close together, sometimes breaking each other.

Another idea to reduce boot time is to scan for actually existing config files before calling the set_cpu/led functions (and by this skip all globals/INIT tasks). At best, it should then also be possible to reset settings to system defaults and remove the config files. E.g. on headless systems, you don't care about the LEDs and on VMs, no CPU handling is available anyway.

Fourdee commented 5 years ago

@MichaIng

Excellent debugging + fix 👍

Maybe we could try G_THREAD_START on these and G_THREAD_WAIT before exit, i'll run some tests.

Fourdee commented 5 years ago

@MichaIng

🈯️ G_THREAD_* image

MichaIng commented 5 years ago

@Fourdee Ah lol jep, totally forgot that we already have the function set for this 👍.

Jep should totally work with this. However thread internal output is missing then and still the question is, if there is really a boot time benefit with this. Needs to be tested on non-VM, I guess, where set_cpu/led really apply changes.

Btw. to have further boot output, remove quiet from the /etc/default/grub boot line + update-grub 😉. Just to make sure that the rm -R /tmp/DietPi-PreBoot is really done after set_cpu/led has finished.

Fourdee commented 5 years ago

Testing:

Fourdee commented 5 years ago

Marking as completed. Issue is now resolved by waiting for background threads to finish, before script exit.
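
The resolution can be illustrated with plain bash job control. This is a minimal sketch under stated assumptions, not the DietPi code: it uses `&` and `wait` instead of the G_THREAD_* functions, and `work_dir`, `result_file` and `bg_task` are hypothetical names.

```shell
#!/bin/bash
work_dir=$(mktemp -d)     # hypothetical stand-in for /tmp/$G_PROGRAM_NAME
result_file=$(mktemp)     # kept outside work_dir so it survives the cleanup

# Hypothetical background task: checks at its end whether its cwd still exists
bg_task() {
    sleep 0.2
    if [[ -d $PWD ]]; then echo "$1 ok"; else echo "$1 lost cwd"; fi >> "$result_file"
}

(
    # Exit trap: leave the working directory first, then remove it
    trap 'cd /tmp && rm -R "$work_dir"' EXIT
    cd "$work_dir" || exit 1

    bg_task cpu &
    bg_task led &

    wait    # without this, the trap could remove work_dir while the tasks still run in it
)

ok_count=$(grep -c ' ok$' "$result_file")
echo "tasks that kept their cwd: $ok_count"
rm -f "$result_file"
```

Dropping the `wait` line lets the subshell exit immediately, so the trap removes the directory while both tasks are still sleeping inside it, which is the situation that produced the getcwd errors in this thread.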