kernelOfTruth / ZFS-for-SystemRescueCD

SRM kernel modules for SystemRescueCD (Gentoo Linux based) releases, allowing access to zpools running on latest upstream ZFSonLinux code

zol 0.6.4 modules #3

Closed SenH closed 5 years ago

SenH commented 9 years ago

Any chance they will be available? ;)

kernelOfTruth commented 9 years ago

sure thing :train2:

please give it a try: https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.5.2

I've booted into SystemRescueCD 4.5.2 and the modules loaded fine,

haven't had a chance to do further tests though (since I currently need the system)

kernelOfTruth commented 9 years ago

I had put off creating those modules since 4.5.1 and earlier - so your "bump" was timed at the right moment :+1:

in case of issues - I wouldn't have been able to access my data - or only by jumping through several hoops :blush:

Any experience with it so far ?

knmonk commented 9 years ago

Hi kernelOfTruth,

first of all, a big thanks for your effort - it is very much appreciated!

The dreaded 'illegal hardware instruction' bug has returned for me (see bug #2). Here is my cpuinfo:

~# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Pentium(R) CPU G3250 @ 3.20GHz
stepping        : 3
microcode       : 0x19
cpu MHz         : 800.000
cache size      : 3072 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer xsave rdrand lahf_lm abm arat xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust erms invpcid
bogomips        : 6399.96
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

hac0demon commented 9 years ago

Same issue with 'illegal hardware instruction' bug for AMD E-350

kernelOfTruth commented 9 years ago

https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/commit/1142ccfb26f6225eb5d6e1459b53f48f9b10dd81

https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/commits/ZFS-for-SysRescCD-4.5.2

updated the CFLAGS according to https://forums.gentoo.org/viewtopic-t-684905-view-next.html?sid=5be02ae9bcd3a6b6da4be7a8c0e37177

SPL & ZFS upgrade to v0.6.4+ (ZFS 22.05.2015, SPL 21.05.2015)
Those changes were necessary due to significant stability- and
also some performance-related improvements.
NO additional patches included - all vanilla.
sys-kernel/spl at dc5e8b70416e5d511bc361309bd426c767177723
sys-fs/zfs-kmod & sys-fs/zfs at 65037d9b25c2bfa98d0aa5c9e34678127c03b345
CFLAGS changed to: -march=x86-64 -mtune=generic

updated README, CHECKSUMS and CHANGES file.

Hopefully it works now :+1:

Only compiled - haven't had a chance to boot into it with those new modules, currently working with my system

hac0demon commented 9 years ago

Same 'illegal hardware instruction' when attempting 'zpool upgrade -v' with V10. The dmesg output is similar to the one in issue #2, and the md5 sums match. The old nocona flags should be fine for the E-350, so the issue seems to be that the proper CFLAGS are not being applied during compilation.

knmonk commented 9 years ago

Same with me - error persists on Pentium G3250

kernelOfTruth commented 9 years ago

Okay, if that doesn't work now - I'm close to out of ideas

https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/commit/b6193c4e39b1a9413034b99d2db692f684d6dba6

https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/commits/ZFS-for-SysRescCD-4.5.2

Had to comment out quite a few CFLAGS & CXXFLAGS in /etc/portage/make.conf, since libtool (or gcc ?) still appears to follow those even though separate flags are set in /etc/portage/env
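For readers unfamiliar with the mechanism mentioned here, this is a minimal sketch of how per-package flags are usually set up in Portage. The file name generic-x86-64.conf is an assumption made for this example; only the flag values and package names are taken from this thread, and the actual files used for these builds may look different.

```shell
# Hypothetical file: /etc/portage/env/generic-x86-64.conf
# It is sourced like make.conf, so plain variable assignments are enough.
CFLAGS="-march=x86-64 -mtune=generic"
CXXFLAGS="${CFLAGS}"

# The env file is then attached to individual packages in
# /etc/portage/package.env (not shell syntax, shown as comments):
#   sys-kernel/spl   generic-x86-64.conf
#   sys-fs/zfs-kmod  generic-x86-64.conf
#   sys-fs/zfs       generic-x86-64.conf
```

Libraries that are copied into the srm from the host (rather than rebuilt for it) are not affected by such per-package settings.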

Now using the GCC Vanilla specs (instead of the full hardened which seemed to work fine in the past if I remember correctly)

x86_64-pc-linux-gnu-4.8.4-vanilla
v11
    ----------------------------------------
    Rebuild, modified /etc/portage/make.conf CFLAGS & CXXFLAGS
    Portage (libtool) seems to use those, GCC 4.8.4 Vanilla specs

    updated CHECKSUMS and CHANGES file.

knmonk commented 9 years ago

Problem persists

[   34.828363] SPL: Loaded module v0.6.4-1
[   34.885746] ZFS: Loaded module v0.6.4-1, ZFS pool version 5000, ZFS filesystem version 5
[   63.502026] traps: zpool[2713] trap invalid opcode ip:7fa1ed635fc2 sp:7ffd1a843f80 error:0 in libc.so.6[7fa1ed5ec000+1a0000]

SenH commented 9 years ago

@kernelOfTruth Sorry for the late reply. The v11 modules are not working on an HP Microserver N40L (std kernel). Same error as @knmonk when doing zpool upgrade -v:

trap invalid opcode ip:7fa1ed635fc2 sp:7ffd1a843f80 error:0 in libc.so.6[7fa1ed5ec000+1a0000]

Fortunately, my pool is not upgraded to 0.6.4, so I'm still able to rescue with 4.3.0-r3 modules.

SenH commented 9 years ago

Could it be a missing CFLAG, -march=nocona? Like in #2

knmonk commented 9 years ago

@SenH: It seems that @kernelOfTruth has gone to great lengths to ensure that everything is built with the correct CFLAGS. @kernelOfTruth: The kernel output indicates that the invalid opcode is located in the dynamically linked libc. The libc included in the srm (zfs-core-3.14.35-std452-amd64.srm/lib64/libc.so.6) is dated 2015-02-28, which suggests that it wasn't even built together with the rest of the module. Is this maybe the libc from your host system, just copied over?
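As a rough illustration of how the trap line points into libc: the faulting instruction pointer (ip) minus the mapping base printed in square brackets gives the offset inside libc.so.6. The sketch below uses the two addresses from the dmesg output in this thread; the helper name trap_offset is made up for this example.

```shell
# trap_offset IP BASE
# Print the offset of IP inside a mapping that starts at BASE.
# Both arguments are hex strings without the 0x prefix, as shown by dmesg.
trap_offset() {
    printf '0x%x\n' "$(( 0x$1 - 0x$2 ))"
}

# ip and libc.so.6 base taken from the trap line above
trap_offset 7fa1ed635fc2 7fa1ed5ec000   # prints 0x49fc2
```

The offset is smaller than the mapping size (0x1a0000), so the fault is indeed inside libc. Assuming the mapping starts at file offset 0 (typical for a shared library, but not verified here), the instruction could be looked up with something like objdump -d --start-address=0x49fc2 --stop-address=0x49fd2 on the libc.so.6 shipped in the srm.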

knmonk commented 9 years ago

I got it working by replacing all files not dated 2015-05-27 in /lib64 of zfs-core-3.14.35-std452-amd64.srm with the ones from _sysresccd-4.2.0_zfs0.6.3.iso. While I had no errors so far, I still hesitate to update my pools ...

kernelOfTruth commented 9 years ago

@SenH I could try that again - I however must remark that between the last release (4.3.0*) and this one I changed glibc and switched to gcc 5.1 (part of the system already is compiled with it)

So that might account for the incompatibility and/or issues

I'm compiling cxx stuff with the following flags: -D_GLIBCXX_USE_CXX11_ABI=0 so I'm not entirely sure if that's the cause

@knmonk awesome ! yes, I can understand that hesitation, it's supposed to be a stable filesystem and stable rescue environment after all

could you please post which files exactly you replaced ?

Meanwhile I contemplated that the only surefire way to do it would be to extract the liveCD contents and then chroot into it - well, it's some effort but at least it would work reliably, right ?

This needs some preparation and I'll see to it to get things right,

Thoughts ?

I'm however mainly occupied with study-related tasks, so this will take some time.

Recently there appear to have been some changes behind the scenes of the Sabayon Live discs related to ZFS support so if you need 0.6.4 you can try those out:

e.g. http://ftp.nluug.nl/os/Linux/distr/sabayonlinux/iso/monthly/

(15.06)

sys-fs/zfs-0.6.4
sys-fs/zfs-kmod-0.6.4#4.0.0-sabayon

(the ebuild from sabayon doesn't utilize sys-kernel/spl)

knmonk commented 9 years ago

Thanks for the pointers to sabayon, I'll give it a try!

When I first replaced only libc, I got errors from libpthread, so I decided to replace every library in /lib64 not compiled on 2015-05-27, which is the build date of all the zfs-related stuff. Here is the detailed list:

ld-linux-x86-64.so.2 
libblkid.so.1 
libc.so.6 
libdl.so.2 
libm.so.6 
libpthread.so.0 
librt.so.1 
libuuid.so.1 
libz.so.1
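A list like the one above could be produced with a small helper along these lines. This is only a sketch, not necessarily how the list was actually compiled: it assumes GNU find (for -printf), library names without spaces, and that the srm has been unpacked into a hypothetical directory such as ./srm-root. The function name list_not_built_on is made up for this example.

```shell
# list_not_built_on DIR DATE
# Print the names of files directly inside DIR whose modification date
# (local time, YYYY-MM-DD) differs from DATE. Symlinks are followed (-L),
# so libc.so.6 is judged by the date of the file it points to.
list_not_built_on() {
    find -L "$1" -maxdepth 1 -type f -printf '%TY-%Tm-%Td %f\n' |
        awk -v d="$2" '$1 != d { print $2 }' |
        sort
}

# Example call (path is hypothetical):
#   list_not_built_on ./srm-root/lib64 2015-05-27
```

Modification times only hint at where a file came from; they do not prove which toolchain or flags were used to build it.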

Chrooting into the liveCD would be perfect if you can get the kernel compiled in this environment. I very much appreciate your efforts. Take your time - this is not urgent (at least for me, as my pools are still on 0.6.3).