Open polkadotbabe opened 3 years ago
hard to say what's going on, monitor RAM usage and make sure it's not running out, also try 256 buckets
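For the RAM monitoring, a minimal sketch that logs the kernel's `MemAvailable` figure while the plotter runs. The function names and the optional file argument are made up for illustration; only `/proc/meminfo` itself is standard.

```shell
# Print available memory in MiB, read from a meminfo-style file.
# The optional path argument only exists so the function can be tried on a sample file.
mem_available_mib() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Print a timestamped reading every INTERVAL seconds until interrupted.
watch_mem() {
    interval="${1:-5}"
    while :; do
        printf '%s MemAvailable=%s MiB\n' "$(date '+%F %T')" "$(mem_available_mib)"
        sleep "$interval"
    done
}
```

Something like `watch_mem 5 >> mem.log &` (log name is arbitrary) started before the plotter leaves a record that survives a crashed process, though not necessarily a hard lock.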
[Jun11 17:10] process '/chia_plot' started with executable stack
[Jun11 17:16] phase1/eval/6[1967]: segfault at 1 ip 00007efc55ec3b00 sp 00007efc55ec2e38 error 6
[ +0.000035] Code: 00 00 00 bb 47 a1 3c 8f 0b fc 00 00 000 00 47 fc 55 fc 7e 00 00 00 00 00 00 00 00 00 00 c3 3d 0d 5f fc 7e 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Thanks for responding so quickly, much appreciated :-) that's from dmesg -wH output
That time it crashed chia_plot but the system stayed up.
Restarted chia_plot with -b 256 and it hard-locked, rebooting to see the dump... ...nothing in /var/crash... hmmm....
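For reading the `error 6` field in the segfault line: on x86 it is the page-fault error code, where bit 0 means protection violation (otherwise page not present), bit 1 means write (otherwise read), and bit 2 means user mode. A small sketch that decodes those three bits; the function name is made up, and the decoding alone does not say whether hardware or software is at fault.

```shell
# Decode the low three bits of an x86 page-fault error code,
# as printed after "error" in a kernel segfault line.
decode_pf_error() {
    code="$1"
    if [ $(( $code & 1 )) -ne 0 ]; then p="protection violation"; else p="page not present"; fi
    if [ $(( $code & 2 )) -ne 0 ]; then w="write"; else w="read"; fi
    if [ $(( $code & 4 )) -ne 0 ]; then m="user mode"; else m="kernel mode"; fi
    printf '%s, %s, %s\n' "$p" "$w" "$m"
}
```

`decode_pf_error 6` reports a user-mode write to a page that was not mapped, which fits the `segfault at 1` address in the log.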
Try these build instructions and let me know. Validated on Cent7
https://gist.github.com/jollyjollyjolly/d8904efda4d5997a2f0e9caf31cff1c3
Stream doesn't ship gmp-static anymore, so the build still fails at the link step with:
/bin/ar: /usr/lib64/libgmp.so: File format not recognized
But your link is basically how I built on 7 to create the binary...
The BIOS is incorrectly identifying the memory as DDR4-2133 CL15. Here's what it actually is: Memory Speed (MHz) DDR4-3600, PC4-28800, CAS Latency 16, Memory Latency Timings 16-19-19-39
@madMAx43v3r should I set the BIOS to the exact chip specs, or leave it on "auto"?
Nothing in /var/crash as it wasn't rebooted with crashkernel=auto...
Crashed again. /var/crash didn't capture anything. kdump is installed and crashkernel=auto is in grub, but kdump won't load and throws a no memory error: kdumpctl[18715]: kdump: No memory reserved for crash kernel
Updated the bios, asrock z590 phantom gaming 4, intel i9-10850, g.skill ddr4-3600, we'll see if that fixes it.
Trying to figure out why kdump isn't working on stream, was always easy on Cent7/RHEL...
Definitely open to ideas if anyone sees this! :-)
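One thing that can be checked for the kdump error is whether the running kernel actually booted with `crashkernel=` and reserved anything. A rough sketch follows; the function names are made up, while `/proc/cmdline` and `/sys/kernel/kexec_crash_size` are standard kernel interfaces. A possible cause, which is an assumption and not confirmed in this thread, is that the edited grub defaults were never applied to the boot entry, or that the booted kernel does not understand the `auto` value.

```shell
# Succeeds if a kernel command line string contains a crashkernel= parameter.
has_crashkernel_arg() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report what the running kernel was booted with and how much memory it reserved.
kdump_reservation_report() {
    if has_crashkernel_arg "$(cat /proc/cmdline)"; then
        echo "crashkernel= is on the running kernel's command line"
    else
        echo "crashkernel= is missing from the running kernel's command line"
    fi
    # A value of 0 here matches kdumpctl's "No memory reserved for crash kernel".
    echo "reserved bytes: $(cat /sys/kernel/kexec_crash_size)"
}
```

Running `kdump_reservation_report` after a reboot shows whether the parameter reached the kernel at all, separately from what the grub file says.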
Can always recommend you return to rhel7 for time being. Stream is a mess and rocky isnt primetime (yet)
"Can always recommend you return to rhel7 for time being. Stream is a mess and rocky isnt primetime (yet)"
Nicely (and politely!) put ;-) Seems like a consumer-grade chip/board error that won't be solved downgrading, although kdump would work and chia_plot would compile without all this drama, so...
FWIW, after the ASRock bios update, it's stopped crashing.
I set the BIOS to the exact memory specs of the chips, instead of the much slower rate it had detected. Not sure why there was a disparity, but the BIOS update also mentioned some memory handling upgrades, so... dunno. @madMAx43v3r Seems related to the bleeding-edge, consumer-grade memory controller, with the heavy usage of chia_plot exposing it.
@polkadotbabe Hello.
Specs:
OS: Ubuntu 20.04
Motherboard: B460M Aorus Elite with the latest firmware
CPU: Intel i5-10400 (6c/12t)
Memory: 32 GB
Tmp drive: 2x 1 TB NVMe WD Black 750 (RAID 0)
Dst drive: 10 TB Seagate HDD
The MadMax plotter hangs with no output in the log. I have done a lot of tests. Can you help me with the memory settings? It is the last test I would like to try. Thank you
First make sure you're on the latest: 1) sudo apt-get update && sudo apt-get -y dist-upgrade
If there was a kernel upgrade, reboot into it (just reboot).
Once running leave:
2) dmesg -wH
running in a terminal, it should capture a crash.
3) Are you certain the motherboard's bios is the latest? Double check.
4) Reset to default UEFI BIOS settings.
5) Turn off anything not needed. Sound, serial port, etc.
6) Don't overclock anything, but you can set the CPU to "turbo performance" to leave the turbo scaling at max.
7) IMPORTANT FOR ME: Take a picture of your RAM's stickers. Compare that to the purchase receipt. In the bios, set the speed and CAS latency settings to the EXACT parameters of your actual RAM. That solved it for me.
Also stay current on the builds, in the chia-plotter dir:
8) git pull origin master
9) ./make_devel.sh
10) cp build/chia_plot /to/wherever/you/use/it
Good luck!
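For step 7, the speed the firmware is actually running the DIMMs at can also be read from a running system with `dmidecode`. A sketch that prints rated versus configured speed per DIMM. The function name is made up, the field names differ between dmidecode versions ("Configured Memory Speed" on newer ones, "Configured Clock Speed" on older ones), and what gets reported depends on the firmware, so the sticker stays the reference.

```shell
# Read `dmidecode --type memory` output on stdin and print
# rated vs configured speed for each memory device.
dimm_speeds() {
    awk -F': *' '
        /^[ \t]*Locator:/ { loc = $2 }
        /^[ \t]*Speed:/   { rated = $2 }
        /^[ \t]*Configured (Memory|Clock) Speed:/ {
            flag = (rated != $2) ? "  <-- mismatch" : ""
            printf "%s: rated %s, configured %s%s\n", loc, rated, $2, flag
        }
    '
}
```

Usage would be `sudo dmidecode --type memory | dimm_speeds`; a line flagged as a mismatch suggests the BIOS is not using the module's rated speed.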
Thank you for your fast response. I will test your steps and report the results. Thank you so much!
@polkadotbabe, hello! I found the problem, and it has no relation to the memory. When I ran the MadMax chia_plot, I added an ampersand to run it as a background process, like this:
./build/chia_plot -p $$$ -f $$$ -n -1 -r 12 -t /mnt/ssdtemp/ -d /media/dtanel/hdd1plot/ >> $(date '+%Y-%m-%d%H%M%S').log &
With that, the process dies at random. I don't know why, but that's the problem. I appreciate your help. Thank you.
forking to the background is unrelated, likely you're missing some startup flags (-b 256 for one), chia_plot will tell you, so just run it with everything before the >> and see what it says....
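Note that `>>` on its own only captures stdout, so anything the plotter writes to stderr never reaches the log. A small wrapper sketch that appends both streams to the log and records the exit status; `run_logged` is a made-up helper name, not part of chia_plot.

```shell
# Run a command with stdout and stderr both appended to a log file,
# then record its exit status in the same log.
run_logged() {
    log="$1"; shift
    "$@" >> "$log" 2>&1
    status=$?
    printf 'exit status: %s\n' "$status" >> "$log"
    return "$status"
}
```

It can be backgrounded the same way as before, e.g. `run_logged plot.log ./build/chia_plot <your usual flags> &`. An exit status above 128 usually means the process was killed by a signal (139 corresponds to SIGSEGV).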
Installed and enabled kdump so we'll see.
Locked with both the standard Cent 4.18 kernel and 5.12.4 from elrepo. The standard chia plotter with 12 parallel processes was fine.
Turned SELinux off...
GRUB_CMDLINE_LINUX="nomodeset net.ifnames=0 biosdevname=0 mitigations=off crashkernel=auto scsi_mod.use_blk_mq=1 pcie_aspm=off rhgb