markh794 / mhvtl

Linux based Virtual Tape Library
http://sites.google.com/site/linuxvtl2/
Other
137 stars 65 forks source link

Linux 4.15 new timer_setup API breaks mhvtl kernel module compilation #16

Closed beer4duke closed 2 years ago

beer4duke commented 6 years ago

Linux 4.15 introduced a new timer_setup API and removed init_timer. The current mhvtl kernel module needs to be modified to be compatible with these API changes.

This is affecting fedora 27 and other recent distributions.

markh794 commented 6 years ago

Thanks for the bug report.

FWIW: I won’t be in a position to look at this until early next week.

Sent from my autocorrecting iPhone

On 24 Apr 2018, at 23:53, beer4duke notifications@github.com wrote:

Linux 4.15 introduced a new timer_setup API and removed init_timer. The current mhvtl kernel module needs to be modified to be compatible with these API changes.

This is affecting fedora 27 and other recent distributions.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

beer4duke commented 6 years ago

I wrote a quick patch for kernel/mhvtl.c to deal with 4.15 timers, it looks timer_intr_handler can be greatly reduced as vtl_queued_cmd can be directly retrieved from cmnd_timer using from_timer. I need to test this and add all the #if for the kernel version but I should have something decent by the end of this week.

markh794 commented 6 years ago

Very cool.. Thanks for the effort :)

Sent from my iPad

On Apr 25, 2018, at 09:10, beer4duke notifications@github.com wrote:

I wrote a quick patch for kernel/mhvtl.c to deal with 4.15 timers, it looks timer_intr_handler can be greatly reduced as vtl_queued_cmd can be directly retrieved from cmnd_timer using from_timer. I need to test this and add all the #if for the kernel version but I should have something decent by the end of this week.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

beer4duke commented 6 years ago

I pushed the pull request, tried to compile against 3.xx RHEL 7 kernel and FC 27 4.15.xx with no problem. This pull request is ased on top of Martin Wilck pulL request (+ a small fix because of a small regression on earlier kernels). Hope it helps and thanks you for mhvtl, it has been helping the tape developers for many years.

scsirob commented 6 years ago

Got a kernel oops on first access to an emulated library and then a kernel panic on second access to same library. This on CentOS 7 with kernel 4.16-5 and these patches. Below is the oops, hope it helps pinpointing the issue. Kernel code debugging is beyond me, but let me know how I can help.

:[ 101.193869] ------------[ cut here ]------------ :[ 101.193871] Bad or missing usercopy whitelist? Kernel memory overwrite attempt detected to SLUB object 'dma-kmalloc-512' (offset 0, size 32)! :[ 101.193877] WARNING: CPU: 0 PID: 3092 at mm/usercopy.c:81 usercopy_warn+0x8e/0xb0 :[ 101.193878] Modules linked in: ch(+) osst st mhvtl(O) ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_pmc_core intel_powerclamp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device aesni_intel crypto_simd glue_helper cryptd intel_rapl_perf snd_pcm sg input_leds pcspkr video snd_timer snd soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c :[ 101.193918] sd_mod sr_mod cdrom ata_generic pata_acpi crc32c_intel ahci libahci ata_piix e1000 libata serio_raw dm_mirror dm_region_hash dm_log dm_mod dax :[ 101.193926] CPU: 0 PID: 3092 Comm: vtllibrary Tainted: G O 4.16.5-1.el7.elrepo.x86_64 #1 :[ 101.193926] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 :[ 101.193927] RIP: 0010:usercopy_warn+0x8e/0xb0 :[ 101.193928] RSP: 0018:ffffc90000957cd8 EFLAGS: 00010286 :[ 101.193929] RAX: 0000000000000000 RBX: ffffffff8208cfa0 RCX: 0000000000000006 :[ 101.193930] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff88007fc169d0 :[ 101.193931] RBP: ffffc90000957cf8 R08: 0000000000000000 R09: 0000000000000242 :[ 101.193931] R10: 0000000000000004 R11: 0000000000000241 R12: 0000000000000020 :[ 101.193932] R13: ffff880000098220 R14: 0000000000000000 R15: 0000000000000020 :[ 101.193933] FS: 00007f3e06961740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 :[ 101.193934] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 :[ 101.193935] CR2: 00007fc4fddf1038 CR3: 000000007af40005 CR4: 00000000000606f0 :[ 101.193937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 :[ 101.193938] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 :[ 101.193939] Call Trace: :[ 101.193942] check_heap_object+0xc4/0x130 :[ 101.193943] check_object_size+0xdb/0x1b0 :[ 101.193945] vtl_sg_copy_user+0xed/0x180 [mhvtl] :[ 101.193947] vtl_c_ioctl+0x2e3/0x7cc [mhvtl] :[ 101.193949] do_vfs_ioctl+0xaa/0x610 :[ 101.193950] SyS_ioctl+0x79/0x90 :[ 101.193952] do_syscall_64+0x79/0x1b0 :[ 101.193954] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 :[ 101.193955] RIP: 0033:0x7f3e0625d107 :[ 101.193956] RSP: 002b:00007ffdf807c058 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 :[ 101.193957] RAX: ffffffffffffffda RBX: 00007ffdf807c110 RCX: 00007f3e0625d107 :[ 101.193958] RDX: 00007ffdf807c110 RSI: 0000000000000203 RDI: 0000000000000003 :[ 101.193958] RBP: 00000000000000fe R08: 000000000060fe98 R09: 0000000000000000 :[ 101.193959] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000100000 :[ 101.193960] R13: 0000000000000000 R14: 00007ffdf807c2f0 R15: 000000000060fe98 :[ 101.193960] Code: 07 82 48 0f 45 f2 4c 89 44 24 10 48 89 c2 48 89 4c 24 08 48 89 1c 24 4d 89 d8 4c 89 d1 48 c7 c7 08 d0 08 82 31 c0 e8 82 3a e2 ff <0f> 0b 48 83 c4 18 5b 5d c3 49 c7 c1 91 cc 09 82 4c 89 cb 4d 89 :[ 101.193982] ---[ end trace 73f9499224a00a1b ]---

scsirob commented 6 years ago

A quick observation: an oops already occurs when the module loads. When accessing the library through /dev/sg things appear to work most of the time. When accessing the same device through /dev/sch, the kernel crashes after one or two commands

beer4duke commented 6 years ago

4.16 may be another beast... 4.14 and 4.15 introduced some changes to some critical kernel APIs. I did not have a look at the changes introduced by 4.16 kernel, but it looks like this is another issue. If you can have a go at a 4.15 kernel from CentOS 7 Elrepo with these patches that would be perfect. For example you can get kernel-ml-4.15.15-1.el7.elrepo.x86_64.rpm from this mirror.

scsirob commented 6 years ago

Looks like there are two separate issues. I installed 4.15 from the mirror and this does not produce the usercopy whitelist oops that I saw above. So that must be 4.16-related. Reading up on the kernel mailing list, 4.16 did make major changes to usercopy handling to harden the kernel from memory leaks. Mhvtl appears to trip on that.

However, on 4.15 there's still a problem. When using the exposed /dev/sg device, the library works as advertized. No errors, no warnings. But when accessing the exposed /dev/sch device, I get a nasty kernel panic, and it hangs the system. Weird though, this happens on the second time I access the /dev/sch device. The first time there's no errors or warnings. This issue happened on 4.16 as well k4 15_panic

scsirob commented 6 years ago

The crash is really easy to reproduce. Just open() and close() the /dev/sch0 device twice, no need for any I/O. The kernel panics. Here's a quick and dirty reproduce tool:

#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv) {
    int fp;
    if(argc<2) {
        printf("Need one argument\n");
        return 1;
    }
    if(-1 == (fp = open(argv[1], O_RDWR | O_NDELAY))) {
        printf("Cannot open %s, %s\n", argv[1], strerror(errno));
        return 2;
    }
    printf("Open %s OK, handle = %d\n", argv[1], fp);
    close(fp);
    printf("Closed, proceed to second open\n");
    if(-1 == (fp = open(argv[1], O_RDWR | O_NDELAY))) {
        printf("Cannot open %s, %s\n", argv[1], strerror(errno));
        return 3;
    }
    printf("Open %s OK, handle = %d\n", argv[1], fp);
    close(fp);
    return 0;
}
hrchu commented 6 years ago

reporduce result:

 ./a.out /dev/sch0
Open /dev/sch0 OK, handle = 3
Closed, proceed to second open
Killed

@scsirob just for curious, does there any scenario that /dev/schX is necessary instead of /dev/sgX?

scsirob commented 6 years ago

@hrchu There's no specific scenario that necessitates /dev/sch but Linux generates the device at registration, just like it registers /dev/st devices for vtltape instances. Having a device node that can hard crash the kernel is not good.

I've been playing with a different backend (SCST) lately, it does not exhibit the same problem.

hrchu commented 6 years ago

@scsirob can we use SCST with mhvtl to fix the problem?

scsirob commented 6 years ago

@hrchu Not without major changes to mhVTL user code. SCST requires a different style of memory management, it allocates and frees the data buffers dynamically. The mhvtl code would have to follow those allocations and use the buffers that SCST assigns, rather than being 'master' of its own data buffers. Can it be done? Sure. But it's quite a bit of work. My own project can still be compiled to work either with the scst_user kernel driver, or on top of the mhVTL kernel driver.

The documentation for SCST is a bit cryptic and does not describe in detail what you should and shouldn't do. To start coding against SCST there's a choise between a one-pager telling you which steps to take (but without any code), or a functional file_io example which uses every single option that SCST has to offer and therefor is way too complex to understand. I started by builing my own 'minimal implementation' and worked up from there.

For reference I am sharing the scst_user sample code I wrote. It's ugly, it's very basic, but it works. Perhaps it gets Mark triggered, or one of the other contributors.. testscst.zip

Note that SCST is designed to work as a true target, so it can offer iSCSI, FC and Infiniband to other machines. But it does not add a device on the local machine, so if you need that, you need to add support for scst_local, yet another SCST module. That's what I did in my project.

gonzoleeman commented 6 years ago

I am working on connecting mhvtl to tcmu-runner.

-- Lee-Man Duncan Sent from my iPhone, dude

On Aug 15, 2018, at 11:11 PM, scsirob notifications@github.com wrote:

@hrchu Not without major changes to mhVTL user code. SCST requires a different style of memory management, it allocates and frees the data buffers dynamically. The mhvtl code would have to follow those allocations and use the buffers that SCST assigns, rather than being 'master' of its own data buffers. Can it be done? Sure. But it's quite a bit of work. My own project can still be compiled to work either with the scst_user kernel driver, or on top of the mhVTL kernel driver.

The documentation for SCST is a bit cryptic and does not describe in detail what you should and shouldn't do. To start coding against SCST there's a choise between a one-pager telling you which steps to take (but without any code), or a functional file_io example which uses every single option that SCST has to offer and therefor is way too complex to understand. I started by builing my own 'minimal implementation' and worked up from there.

For reference I am sharing the scst_user sample code I wrote. It's ugly, it's very basic, but it works. Perhaps it gets Mark triggered, or one of the other contributors.. testscst.zip

Note that SCST is designed to work as a true target, so it can offer iSCSI, FC and Infiniband to other machines. But it does not add a device on the local machine, so if you need that, you need to add support for scst_local, yet another SCST module. That's what I did in my project.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

scsirob commented 4 years ago

@hrchu A bit late to follow-up, but the 'open /dev/sch0 twice' crash has been solved! It turns out that this has nothing to do with mhvtl, it's a genuine Linux kernel bug. A fix has just been accepted and committed for mainstream kernels 4.14, 4.19 and 5.3.

beer4duke commented 4 years ago

I was looking for the diff but only found this announcement on LKML:

Bart Van Assche (1):
      scsi: ch: Make it possible to open a ch device multiple times again
markh794 commented 4 years ago

Many thanks for the pointer.. So, it will be a 5.4+ kernel solution.

On 30 Oct 2019, at 10:00 am, Julien Leduc notifications@github.com wrote:

I was looking for the diff but only found this announcement on LKML https://lkml.org/lkml/2019/10/27/179:

Bart Van Assche (1): scsi: ch: Make it possible to open a ch device multiple times again — You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/markh794/mhvtl/issues/16?email_source=notifications&email_token=AACGL4XC3ITEGJCOML6A2S3QRC6CHA5CNFSM4E4G4VVKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECSLWGQ#issuecomment-547666714, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACGL4QVRSUYPSDEU4HHO63QRC6CHANCNFSM4E4G4VVA.

scsirob commented 4 years ago

That is the correct submit. I found the issue, worked it out with the SCST maintainer and he wrote the single line patch. On close of the ch device a pointer was set to NULL when it shouldn't have. The NULL pointer caused the crash on the subsequent open.

It's actually backported to 4.14, 4.19 and 5.3, got those announcements from Greg KH last sunday.