Open Redicious opened 6 months ago
Hi and thank you for your feedback. No VI and VBR is strange. Do you still have something in /etc/cron.d/? And are your entries still enabled for collection in the credential store?
Thx for your response! Files for vi and vbr are present in /etc/cron.d. And they are enabled in the credential store - yesterday I disabled and re-enabled a few, which is reflected in the creation date of the files in /etc/cron.d. Good to know how this works then. I totally forgot about cron.d and relied on crontab -l.
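Since crontab -l only lists the calling user's own crontab, jobs dropped into /etc/cron.d never show up there. A minimal sketch for searching the system-wide locations instead; find_cron_jobs is a hypothetical helper name, and the search pattern is an assumption based on the script names mentioned in this thread:

```shell
#!/bin/bash
# Sketch: search cron locations that `crontab -l` does not show.
# find_cron_jobs is a hypothetical helper; pass a pattern and the paths to search.
find_cron_jobs() {
    local pattern="$1"
    shift
    # -r: recurse into directories, -s: stay quiet about missing paths,
    # -H: always print the file name in front of each match
    grep -rsH -- "$pattern" "$@"
}

# Example: look for the pull jobs in the usual Debian cron locations.
find_cron_jobs 'PullStatistics' /etc/crontab /etc/cron.d || echo "no matching cron entries found"
```

Each match is printed as file:line, so the output shows which drop-in file schedules which script.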
However I can see the crons in syslog, like this one:
May 14 14:30:01 sexigraf CRON[219558]: (root) CMD ( /usr/bin/pwsh -NonInteractive -NoProfile -f /opt/sexigraf/ViPullStatistics.ps1 -credstore /mnt/wfs/inventory/vipscredentials.xml -server
-sessionfile /tmp/vmw_ .key >/dev/null 2>&1)
So I took the command and ran it manually. Starting the script the same way cron does results in
Fatal error. Internal CLR error. (0x80131506)
Aborted
Even if I strip it down to the bare minimum:
root@sexigraf:/opt/sexigraf# /usr/bin/pwsh -f "/opt/sexigraf/ViPullStatistics.ps1"
Fatal error. Internal CLR error. (0x80131506)
Aborted
However, if I start pwsh and then run the script from within pwsh, it works.
PS /opt/sexigraf> /opt/sexigraf/ViPullStatistics.ps1 -credstore /mnt/wfs/inventory/vipscredentials.xml -server host -sessionfile /tmp/vmw_host.key
Transcript started, output file is /var/log/sexigraf/ViPullStatistics.
.log
2024-05-14T15:05:44.7208885+00:00 [INFO] ViPullStatistics v0.9.1037 ....
Same for a simple script:
root@sexigraf:/opt/sexigraf# echo 'return "hello world!"' >> helloWorld.ps1
root@sexigraf:/opt/sexigraf# chmod 755 helloWorld.ps1
root@sexigraf:/opt/sexigraf# /usr/bin/pwsh -f "/opt/sexigraf/helloWorld.ps1"
Fatal error. Internal CLR error. (0x80131506)
Aborted
root@sexigraf:/opt/sexigraf# pwsh
PowerShell 7.2.17
Copyright (c) Microsoft Corporation.

https://aka.ms/powershell
Type 'help' to get help.

PS /opt/sexigraf> ./helloWorld.ps1
hello world!
Looks like there is something wrong with .NET and not sexigraf. I'll keep you posted....
edit: formatting
Crazy stuff! Did you update the appliance at some point?
I hadn't updated it before now - the apt history and ssh log also show that no one touched it.
I couldn't figure out what caused the issue exactly. strace looked ok'ish - it just aborts. I spent hours googling and chatgpt'ing (is that the right word?) and grepping through logs... So I gave up on finding out what happened and just wanted it to be fixed. I made a snapshot, reinstalled pwsh, and now it works - it is now 7.4.1, was 7.2.17 - although I doubt it is related to the update. I think it was fixed by reinstalling, since it broke without any intentional/logged changes.
apt remove powershell-lts
wget https://raw.githubusercontent.com/PowerShell/PowerShell/master/tools/install-powershell.sh
wget https://raw.githubusercontent.com/PowerShell/PowerShell/master/tools/installpsh-debian.sh
bash install-powershell.sh
I assume pwsh itself must have kicked the bucket. Today's bofh-excuse card says: "global warming". That must be it.
Thanks for your help!
Thanks a lot for your feedback. I also spent some time googling (didn't think about chatgpting it), but by the looks of it, it does sound related to pwsh indeed. FYI, I always use the latest LTS version as long as everything works fine. Really, really strange issue; I hope it won't affect your SexiGraf experience overall :D Cheers
Hi,
just wanted to let you know: the issue more or less came back, but with a "segmentation fault" error instead. I think it's just a different flavor due to the upgrade, since the conditions leading to it are the same.
It can be fixed (maybe only temporarily) with
rm ~/.cache/powershell/StartupProfileData-NonInteractive
I found this in the issue linked below, which describes a case where running pwsh -c or pwsh -f leads to the CLR dying due to some optimization data being stored in the above file. Details can be found here: https://github.com/PowerShell/PowerShell/issues/18998
So I came up with this:
#!/bin/bash
# Watchdog: check that pwsh still runs a script non-interactively;
# if it does not, delete the cached startup profile data.

# Define the log file path
log_file="/var/log/fixpwsh.log"

# Run the PowerShell test script and capture its output
result=$(/usr/bin/pwsh -f /opt/sexigraf/helloWorld.ps1)
ok_string="hello world!"

# Get the current timestamp
timestamp=$(date +"%Y-%m-%d %H:%M:%S")

# Check whether the result matches the expected string
if [[ "$result" == "$ok_string" ]]; then
    echo "[$timestamp] Script returned '$ok_string', quitting." | tee -a "$log_file"
else
    # Remove the profile data file (-f: no error if it is already gone)
    rm -f ~/.cache/powershell/StartupProfileData-NonInteractive
    echo "[$timestamp] Script returned something other than '$ok_string', removed file: StartupProfileData-NonInteractive" | tee -a "$log_file"
fi
And now I run it as cron every hour...
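For reference, an hourly schedule like this could be set up as a cron.d drop-in. This is only a sketch: install_hourly_job is a hypothetical helper, and both the script path and the drop-in file name are assumptions, since the thread does not say where the watchdog script lives.

```shell
#!/bin/bash
# Sketch: write a cron.d drop-in that runs a script at minute 0 of every hour.
# install_hourly_job and the paths in the usage line are hypothetical.
install_hourly_job() {
    local script="$1" cron_file="$2"
    # Unlike a user crontab, /etc/cron.d entries carry a user field (root here)
    # between the schedule and the command.
    printf '0 * * * * root %s >/dev/null 2>&1\n' "$script" > "$cron_file"
}

# Usage (as root):
# install_hourly_job /usr/local/bin/fixpwsh.sh /etc/cron.d/fixpwsh
```

On Debian, cron skips files in /etc/cron.d whose names contain a dot, which is why the drop-in in the usage line has no extension.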
what kind of CPU are you running?
2x Intel Xeon Silver 4210
can you try to upgrade the vHardware on the sexigraf vm just to test?
also, did you install any security tool in the appliance?
also does it have access to internet?
any updates?
Hey,
my Sexigraf instance stopped pulling data on 2 May at 4:00 CEST from all VBR and unmanaged ESXi hosts (there is nothing else in the inventory). (I've been on vacation, and sexigraf is not used yet, still in trial - that's why I'm late to the party.)
So grafana only shows its own metrics.
In /var/log/sexigraf/ there are no new logs for VbrPullStatistics etc.
The only logs that still get updated are
There is also no change in the patterns for when it stopped pulling data - so no error message hinting at what could go wrong.
I also checked the standard stuff... There is enough disk space, inodes left, ram is good, cpu is good, etc.
If I run ViPullStatistics.ps1 manually it also works, and I have a set of data points for the ESXi it is run for.
Since ViPullStatistics.ps1 basically works, and there are no transcripts in /var/log/sexigraf: what invokes it? Where should I look next? There is nothing in crontab...
Cheers Red