MoneroOcean / xmrig

Monero (rx/0, rx/wow, rx/loki, defyx, rx/arq, rx/sfx, rx/keva, cn/0, cn/1, cn/2, cn/r, cn/fast, cn/half, cn/xao, cn/rto, cn/rwz, cn/zls, cn/double, cn/gpu, cn-lite/0, cn-lite/1, cn-heavy/0, cn-heavy/tube, cn-heavy/xhv, cn-pico, cn-pico/tlo, argon2/chukwa, argon2/wrkz, astrobwt) CPU/GPU miner
https://moneroocean.stream
GNU General Public License v3.0

Segmentation faults on 2 ARM devices on Algo switch #49

Closed Koesters closed 3 years ago

Koesters commented 3 years ago

I have 4 ARM systems: 1 Rock64, 2 NanoPC-T4, and 1 Odroid N2+.

On the Odroid and one NanoPC-T4, algo switching seems unstable. The other two work fine. They ran for weeks on vanilla xmrig, and I have been testing them with vanilla again since yesterday with 0 problems. Temperatures are OK. The N2+ is at 30 degrees, actively cooled, on the performance governor. The failing Nano is at 70, passively cooled, on the interactive governor.

I did not edit the config besides User and Rig-ID, then the test etc.

It looks to me like this combination fails: gcc/9.3.0 LIBS libuv/1.34.2 OpenSSL/1.1.1f hwloc/2.5.0a1-git

As it's ARM, I need to compile it myself. The 2 that do not work were compiled later.

I also noticed they failed within the same second, only milliseconds apart. Given that they are different hardware, it seems unlikely that the problem is internal. 3 of the ARM systems are on the same switch, and 1 of those, the Rock64, has no problems.

None of my Intel systems (including an Intel Mac) or AMD systems have issues. I have 13 workers.

NanoPC-T4

ODROID

[2021-04-15 03:32:37.638] cpu use argon2 implementation default
[2021-04-15 03:32:37.749] cpu stopped (111 ms)
[2021-04-15 03:32:37.750] randomx init dataset algo panthera (6 threads) seed 7d60fe93dad74a5e...
[2021-04-15 03:32:38.026] randomx allocated 2336 MB (2080+256) huge pages 100% 1168/1168 +JIT (277 ms)
[2021-04-15 03:32:38.837] randomx dataset ready (810 ms)
[2021-04-15 03:32:38.837] cpu use profile panthera (6 threads) scratchpad 256 KB
[2021-04-15 03:32:38.838] cpu READY threads 6/6 (6) huge pages 100% 6/6 memory 1536 KB (2 ms)
[2021-04-15 03:32:48.403] miner speed 10s/60s/15m n/a n/a n/a H/s max n/a H/s
[2021-04-15 03:32:50.442] cpu accepted (514/0) diff 25084 (1596 ms)
[2021-04-15 03:33:48.450] miner speed 10s/60s/15m 460.2 459.7 n/a H/s max 460.2 H/s
[2021-04-15 03:34:48.483] miner speed 10s/60s/15m 459.4 459.5 n/a H/s max 460.7 H/s
[2021-04-15 03:34:58.378] cpu accepted (515/0) diff 25084 (1026 ms)
[2021-04-15 03:35:00.970] cpu accepted (516/0) diff 25084 (1026 ms)
[2021-04-15 03:35:12.699] cpu accepted (517/0) diff 25084 (1027 ms)
[2021-04-15 03:35:47.644] net new job from gulf.moneroocean.stream:10008 diff 208029 algo rx/arq height 668914
[2021-04-15 03:35:47.660] cpu stopped (16 ms)
[2021-04-15 03:35:47.660] randomx init dataset algo rx/arq (6 threads) seed 128d925cda4fcdaf...
Segmentation fault
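
To compare crashes across several rigs, it can help to pull the last timestamped entry before the segfault out of each saved log. The following is only a rough sketch in Python, not part of xmrig: the function names are made up for illustration, and it assumes the "[YYYY-MM-DD HH:MM:SS.mmm]" timestamp layout seen in the log above.

```python
import re
from datetime import datetime

# Matches "[YYYY-MM-DD HH:MM:SS.mmm]" timestamps as they appear in the log above.
TS = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]")


def split_entries(text):
    """Split raw log text into (timestamp, message) pairs.

    Works whether entries are one per line or run together on a single line.
    """
    # With one capture group, re.split returns [prefix, ts1, msg1, ts2, msg2, ...].
    parts = TS.split(text)
    entries = []
    for ts, msg in zip(parts[1::2], parts[2::2]):
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")
        entries.append((when, " ".join(msg.split())))
    return entries


def last_entry_before_crash(text, marker="Segmentation fault"):
    """Return (timestamp, message) of the final entry if the log ends in a crash, else None."""
    entries = split_entries(text)
    if not entries or marker not in entries[-1][1]:
        return None
    when, msg = entries[-1]
    return when, msg.replace(marker, "").strip()


# Tail of the ODROID log quoted above.
sample = (
    "[2021-04-15 03:35:47.644] net new job from gulf.moneroocean.stream:10008 "
    "diff 208029 algo rx/arq height 668914 "
    "[2021-04-15 03:35:47.660] cpu stopped (16 ms) "
    "[2021-04-15 03:35:47.660] randomx init dataset algo rx/arq (6 threads) "
    "seed 128d925cda4fcdaf... Segmentation fault"
)

if __name__ == "__main__":
    when, msg = last_entry_before_crash(sample)
    print(when.isoformat(timespec="milliseconds"), msg)
```

Running this over the logs of both failing devices would show whether they die at the same step (here, the RandomX dataset init right after an algo switch) and how far apart the timestamps are.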

Works on:

NANOPC 2

Rock 64

Spudz76 commented 3 years ago

Those are all some super-oddball and untested versions of libs.

Build the versions of deps that are included, using the provided ./scripts/build_deps.sh which should result in:

 * LIBS         libuv/1.41.0 OpenSSL/1.1.1j hwloc/2.4.1

And we can debug from there. Unknown lib versions can't be a basis for any science about the problem. I would currently suspect mostly libuv being too old, or not old enough (since 1.8.x or 1.18.x are on the ones that switch okay, but the broken ones are both on 1.34.x, which could be a buggy notch).
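
A quick way to check each rig against that expected LIBS line is to parse the name/version pairs from the miner's startup banner and list the differences. This is a small Python sketch under assumptions: the helper names are hypothetical, and the expected versions are just the ones quoted in this comment, so they will differ for other releases of the build script.

```python
import re

# Versions quoted above as the expected result of ./scripts/build_deps.sh
# at the time of this thread (an assumption for any other release).
EXPECTED = {"libuv": "1.41.0", "OpenSSL": "1.1.1j", "hwloc": "2.4.1"}


def parse_libs(line):
    """Extract {name: version} from a line like 'LIBS libuv/1.34.2 OpenSSL/1.1.1f ...'."""
    return dict(re.findall(r"([A-Za-z][\w+-]*)/(\S+)", line))


def mismatches(line, expected=EXPECTED):
    """Return {name: (found, wanted)} for every expected library that differs or is missing."""
    found = parse_libs(line)
    return {
        name: (found.get(name), wanted)
        for name, wanted in expected.items()
        if found.get(name) != wanted
    }


if __name__ == "__main__":
    # Combination reported as failing earlier in the thread.
    reported = "gcc/9.3.0 LIBS libuv/1.34.2 OpenSSL/1.1.1f hwloc/2.5.0a1-git"
    for name, (found, wanted) in mismatches(reported).items():
        print(f"{name}: found {found}, expected {wanted}")
```

An empty result means the rig reports exactly the bundled dependency versions, which gives a known baseline to debug from.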

Koesters commented 3 years ago

It has now been running stable on both for days. One still gave me trouble until I updated hwloc and libhwloc-dev (apt), deleted all the CMake files, and recompiled.

Spudz76 commented 3 years ago

Strange the proxy is using ancient OpenSSL while the miners are using more current ones.

Koesters commented 3 years ago

> Strange the proxy is using ancient OpenSSL while the miners are using more current ones.

The XU4 is a 5-year-old 32-bit system with a specially made Linux. As such, the repos are outdated. At best I think I could update to 18.04. https://wiki.odroid.com/odroid-xu4/os_images/linux/ubuntu_4.14/ubuntu_4.14

I have more such dead hardware, for example an Nvidia Jetson TK1, which essentially stops at 14.04. But they can at least do smaller tasks.

The next would be the Rock64, with a special Ubuntu as well.

The most modern, with 20.04 etc., are the NanoPC-T4 and the N2+.

Spudz76 commented 3 years ago

Interesting collection... I have a box full of various old stuff but mostly MIPS32 router/SBC things which are even less useful, even if given fresher software builds but 32-bit is very abandoned now. At least the proxy works at all! :)

Koesters commented 3 years ago

> Interesting collection... I have a box full of various old stuff but mostly MIPS32 router/SBC things which are even less useful, even if given fresher software builds but 32-bit is very abandoned now. At least the proxy works at all! :)

It also ran an AIS multiplexer and AIS decoder for years in a room with high temperature variations, because it needed to be relatively close to the antennas; it also has various non-SDR, real AIS receivers attached.