Open Slyphic opened 2 years ago
We build those packages on a Raspberry Pi and have never had a problem. The first step would be to compile MooseFS from source (with symbols) on the exact machine it will later run on and see whether the problem persists. If it does, we can investigate what exactly the issue is. If it doesn't, that would mean those packages simply are not compatible with ODROIDs. Are you able to compile MooseFS from source? BTW, the difference between .112 and .116 might simply be because .112 was compiled on older OS and kernel versions.
Tagging in so I can see updates on this issue.
System information
v3.0.116 installed from ArchLinux|ARM repo, running kernel 4.14.180 on ODROID-HC2 (Samsung Exynos5422 ARM Cortex-A15/Cortex-A7)
1 master and 1 metadatalogger as above, and 3-4 chunk nodes running the same MooseFS version and OS on Espressobins (Marvell Armada 3700LP (88F37200) ARM/Cortex A53), each with 3-4 2TB disks
I know, it's a weird installation. It's my home test bed, an attempt at an ultra-low-power (and low-cost) MooseFS installation. It's been running largely without problems for almost 5 years now.
Describe the problem you observed:
After upgrading from 3.0.112 to 3.0.116, mfs-master crashes with a core dump during normal operation. I can't find a specific trigger: it survives longer when left idle, but crashes within minutes if you try to write or read some files. The longest I've kept it running was about 6 hours, left idle with 0 client mounts last night, before it core dumped.
The cluster was previously running 3.0.112 without error, and downgrading the system back to that version has, so far, fixed my problem. It appears that something introduced in the code between these versions doesn't play well with ARM's strict restrictions on unaligned data access.
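To illustrate the class of bug suspected above, here is a minimal C sketch. It is not taken from the MooseFS source; the helper names `load_u32_be` and `load_u32_native` are hypothetical, and the big-endian field layout is only an assumption for the example. It shows two ways to read a 32-bit value from an arbitrary byte offset without relying on the address being aligned.

```c
#include <stdint.h>
#include <string.h>

/*
 * Risky pattern (shown only as a comment): casting a byte pointer to a wider
 * type and dereferencing it, e.g.
 *
 *     uint32_t v = *(const uint32_t *)(buf + 1);
 *
 * This is undefined behavior in C when the address is not suitably aligned.
 * It often appears to work on x86, but on some ARM configurations the
 * generated load can fault and kill the process.
 */

/* Hypothetical helper: assemble a big-endian 32-bit value byte by byte.
 * Only single-byte loads are used, so any address is acceptable. */
static uint32_t load_u32_be(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Hypothetical helper: copy a native-endian 32-bit value out of a buffer.
 * memcpy has no alignment requirement on its source pointer. */
static uint32_t load_u32_native(const void *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

If a core dump from a build with symbols points at a load instruction whose address is not a multiple of the access size, a pointer cast like the one in the comment is one plausible cause worth checking for.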