possible issues with 'for-later' interrupt teardown-handling

selltc commented 8 years ago

We never call free_irq(), meaning that technically the isr function registered by request_irq can be called anytime, even possibly after we have removed the device! Yes, we clear our "device"'s interrupt-enable bit (ULTRA_CHANNEL_ENABLE_INTS) as part of visorbus_disable_channel_interrupts(), but that's just a device thing, and isn't a hard guarantee that Linux won't call our isr, especially considering we do the request_irq() with IRQF_SHARED.

For the model we are using in visorhba, where the isr just schedules a tasklet to run, I think drivers/input/keyboard/omap-keypad.c appears to be a good model to follow. Based on that, it looks like a good order of operations for teardown is:

tasklet_disable()
disable physical interrupts from your device
free_irq()
tasklet_kill()

We would need to divy that logic between visordriver_remove_device() (in visorbus), and the function driver's remove() function somehow.

I've also examined the teardown paths for the visorbus poll-simulated interrupts (that run via a kernel timer). At first I was convinced we had a race condition where if del_timer_sync() was called while the timer function was active, we would hit a problem because the timer function unconditionally does a mod_timer() to reschedule the timer at the end. It was this comment at the top of del_timer_sync() that scared me:

Callers must prevent restarting of the timer, otherwise this function is meaningless.

But I poured thru tons of source code in the kernel that uses del_timer_sync(), and none of the usages seemed concerned about this. So for now, I'm assuming this is NOT a problem for us.

selltc commented 8 years ago

I think we actually hit this problem during modprobe -r visornic. Whenever I do that (on my single-proc test box), I get a kernel panic that looks like this:

sparguest login: [  220.842792] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: [<ffffffffa00ef49d>] visorbus_isr+0x3d/0x50 [visorbus]
PGD 11731067 PUD 1f12c067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: visorinput(C) ipv6 vfat fat acpi_cpufreq button processor evdev joydev pcspkr efivars sg sd_mod visorhba(C) scsi_mod visornic(C-) visorbus(C) ext4 jbd2 crc16 ext2 mb]
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C      4.6.0-ARCH+ #4
Hardware name: Dell Inc. PowerEdge T110/ , BIOS 1.23 12/15/2009
task: ffffffff8180b500 ti: ffffffff81800000 task.ti: ffffffff81800000
RIP: 0010:[<ffffffffa00ef49d>]  [<ffffffffa00ef49d>] visorbus_isr+0x3d/0x50 [visorbus]
RSP: 0018:ffff88001b003eb8  EFLAGS: 00010087
RAX: 0000000000000000 RBX: ffff88001d0c7000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88001b003e18 RDI: ffff88001d0c7000
RBP: ffff88001b003ec8 R08: 0000000000000008 R09: ffff88001d0c6c00
R10: 0000000000000000 R11: ffff88001b015e88 R12: 0000000000000000
R13: 000000000000001a R14: 0000000000000001 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88001b000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000038 CR3: 000000001d9bc000 CR4: 00000000000006f0
Stack:
 ffff880012f2ef00 0000000000000001 ffff88001b003f10 ffffffff810ac4f1
 ffff880012dc7c00 0000008000000009 ffff880012dc7c00 ffff880012dc7c9c
 ffffffff81890ec0 0000000000000061 0000000000000000 ffff88001b003f38
Call Trace:
 <IRQ>
 [<ffffffff810ac4f1>] handle_irq_event_percpu+0x41/0x200
 [<ffffffff810ac6e9>] handle_irq_event+0x39/0x60
 [<ffffffff810af8df>] handle_fasteoi_irq+0x8f/0x160
 [<ffffffff8101eccd>] handle_irq+0x1d/0x30
 [<ffffffff8147503b>] do_IRQ+0x4b/0xd0
 [<ffffffff8147353f>] common_interrupt+0x7f/0x7f
 <EOI>
 [<ffffffff81026086>] ? default_idle+0x26/0x100
 [<ffffffff8102691f>] arch_cpu_idle+0xf/0x20
 [<ffffffff8109802a>] default_idle_call+0x2a/0x30
 [<ffffffff810982f4>] cpu_startup_entry+0x2c4/0x320
 [<ffffffff81467def>] rest_init+0x7f/0x90
 [<ffffffff818b2ea3>] start_kernel+0x3fa/0x407
 [<ffffffff818b2434>] x86_64_start_reservations+0x2f/0x31
 [<ffffffff818b2520>] x86_64_start_kernel+0xea/0xed
Code: 00 48 89 f3 8b b6 b8 03 00 00 48 8b 3b 41 bc 00 00 00 00 48 8d 42 b0 48 85 d2 ba 01 00 00 00 4c 0f 45 e0 e8 16 2e 00 00 48 89 df <41> ff 54 24 38 5b b8 01 00 00 00 41 5c 5d c3 0f
RIP  [<ffffffffa00ef49d>] visorbus_isr+0x3d/0x50 [visorbus]
 RSP <ffff88001b003eb8>
CR2: 0000000000000038
---[ end trace 953721bd0fe32a4a ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: disabled
---[ end Kernel panic - not syncing: Fatal exception in interrupt

selltc commented 8 years ago

I'm starting to lay groundwork for this in for-later-proposed-selltc-20160526 with commit 1f3ec1e38975459fb65585c6b5f7da34a79a7345 commit 0e55903ac45fd0951e43786834c5e8f6daa8c0a3 commit 690f474ade34e133ea32b0fca5465924416084f2 commit e04633d961b58f1f203d2df0c03cd2b190880a34

but I have much more to go.

selltc commented 8 years ago

Now on branch githubissue#70, which is based on for-later-proposed-selltc-20160526 up-to-and-including commit 90d56dceff66e6c61ce2cc9ed2b3b5071493927e. I made a false-start with a few of those commits referenced in my previous comment.

selltc commented 8 years ago

This issue is now fixed, by the following commits on my githubissue#70 branch:

commit f8305a2fca8e7d0a6e3a647a770db63bce21e3e0 - staging: unisys: include: re-order visor_device fields for alignment
commit 44b4d28c6c1da5dd443bf8c2f457ab8dbce92359 - staging: unisys: visorbus: fix a few message typos
commit 0dba64ff5d4a58f96098e3651b731a5d1fe05960 - staging: unisys: visorbus: change how we know when we are in irq mode
commit b2489c9788d0f2052b0c3a5fc2945c5299efaf84 - staging: unisys: visorbus: ensure free_irq() is called when appropriate

Those patches pass checkpatch and testing without problems.

davidker commented 8 years ago

Thanks Tim,

I will look at this this weekend. I might want to squash some of the patches together with the original implementation of the irqs in for-later. Thanks for finding this!

selltc commented 8 years ago

Yep; that's what I figured you would do; just makes sense. Thanks David.

selltc commented 8 years ago

Sorry; I just added one patch that I forgot earlier: commit 5c83d6079af926b8f3ae22aeadc7094979050e9d - staging: unisys: visorbus: add missing acpi_unregister_gsi() calls

selltc commented 8 years ago

All 5 commits referenced above have been squashed in with the original for-later patch that first enabled the real interrepts (in my for-later-selltc-20160613 branch, which I will push soon):

commit d8dd86eb6ae0b28c2b11c18c34d7a8553e961597 - staging: unisys: Capture data from device create to register interrupt.

selltc commented 8 years ago

Just verified that all of this stuff is indeed in the latest version of the 'for-later' branch.

selltc commented 8 years ago

Hmmm... actually, I found a problem while testing Trac-7233, as I described in comment 6 of Trac-7233. I see what introduced the "new" bug.

Commit 5c83d6079af926b8f3ae22aeadc7094979050e9d (staging: unisys: visorbus: add missing acpi_unregister_gsi() calls) has a hunk that is SUPPOSED to look like this:

--- a/drivers/staging/unisys/visorbus/visorchipset.c
+++ b/drivers/staging/unisys/visorbus/visorchipset.c
@@ -1008,80 +1008,84 @@ device_epilog(struct visor_device *dev_info,
                       sizeof(struct controlvm_message_header));
                dev_info->pending_msg_hdr = pmsg_hdr;
        }

        if (response >= 0) {
                switch (cmd) {
                case CONTROLVM_DEVICE_CREATE:
                        if (notifiers->device_create) {
                                (*notifiers->device_create) (dev_info);
                                goto out_unlock;
                        }
                        break;
                case CONTROLVM_DEVICE_CHANGESTATE:
                        /* ServerReady / ServerRunning / SegmentStateRunning */
                        if (state.alive == segment_state_running.alive &&
                            state.operating ==
                                segment_state_running.operating) {
                                if (notifiers->device_resume) {
                                        (*notifiers->device_resume) (dev_info);
                                        goto out_unlock;
                                }
                        }
                        /* ServerNotReady / ServerLost / SegmentStateStandby */
                        else if (state.alive == segment_state_standby.alive &&
                                 state.operating ==
                                 segment_state_standby.operating) {
                                /* technically this is standby case
                                 * where server is lost
                                 */
                                if (notifiers->device_pause) {
                                        (*notifiers->device_pause) (dev_info);
                                        goto out_unlock;
                                }
                        }
                        break;
                case CONTROLVM_DEVICE_DESTROY:
                        if (notifiers->device_destroy) {
                                (*notifiers->device_destroy) (dev_info);
                                goto out_unlock;
                        }
+                       if ((dev_info->gsi_vector > 0) &&
+                           (dev_info->irq > 0)) {
+                               acpi_unregister_gsi(dev_info->gsi_vector);
+                       }
                        break;
                }
        }

 out_respond_and_unlock:
        device_responder(cmd, pmsg_hdr, response);

 out_unlock:
        up(&notifier_lock);
 }

but in the new 'for-later' branch, the lines were introduced in the wrong case, in commit 4502ac8c554f8cea67b7f982388a726ce6a99e61 (staging: unisys: Capture data from device create to register interrupt):

--- a/drivers/staging/unisys/visorbus/visorchipset.c
+++ b/drivers/staging/unisys/visorbus/visorchipset.c
@@ -927,80 +927,84 @@ device_epilog(struct visor_device *dev_info,
                /* only non-NULL if dev is still waiting on a response */
                response = -CONTROLVM_RESP_ERROR_MESSAGE_ID_INVALID_FOR_CLIENT;
                pmsg_hdr = dev_info->pending_msg_hdr;
                goto out_respond;
        }

        if (need_response) {
                pmsg_hdr = kzalloc(sizeof(*pmsg_hdr), GFP_KERNEL);
                if (!pmsg_hdr) {
                        response = -CONTROLVM_RESP_ERROR_KMALLOC_FAILED;
                        goto out_respond;
                }

                memcpy(pmsg_hdr, msg_hdr,
                       sizeof(struct controlvm_message_header));
                dev_info->pending_msg_hdr = pmsg_hdr;
        }

        if (response >= 0) {
                switch (cmd) {
                case CONTROLVM_DEVICE_CREATE:
                        chipset_device_create(dev_info);
                        break;
                case CONTROLVM_DEVICE_CHANGESTATE:
                        /* ServerReady / ServerRunning / SegmentStateRunning */
                        if (state.alive == segment_state_running.alive &&
                            state.operating ==
                                segment_state_running.operating) {
                                chipset_device_resume(dev_info);
                        }
                        /* ServerNotReady / ServerLost / SegmentStateStandby */
                        else if (state.alive == segment_state_standby.alive &&
                                 state.operating ==
                                 segment_state_standby.operating) {
                                /*
                                 * technically this is standby case
                                 * where server is lost
                                 */
                                chipset_device_pause(dev_info);
                        }
+                       if ((dev_info->gsi_vector > 0) &&
+                           (dev_info->irq > 0)) {
+                               acpi_unregister_gsi(dev_info->gsi_vector);
+                       }
                        break;
                case CONTROLVM_DEVICE_DESTROY:
                        chipset_device_destroy(dev_info);
                        break;
                }
        }

 out_respond:
        device_responder(cmd, pmsg_hdr, response);
 }

selltc commented 8 years ago

Unfortunately, while looking at this, I found another bug, so I created github #89.

davidker commented 8 years ago

Tim, can you rebase this branch off of the latest for-later?

selltc commented 8 years ago

I don't understand. The commits on the branches I reference above are already in the latest for-later. All of the code in for-later looks fine, EXCEPT FOR the bit above in visorchipset.set, which I just checked is still wrong.

davidker commented 8 years ago

Okay, David Binder has patches .. that I plan on dumping into for-later. At that point I want to close this one and I want to close his as well.. and then I will push both of these to Done. We still have work to squash for-later so it is ready for submittal.

selltc commented 7 years ago

@davidker, I just now verified that all of the patches referenced above have been committed to our for-later branch. Is there any reason for this issue to stay open?

davidker commented 7 years ago

No, probably not, we can close it.

davidker / unisys

possible issues with 'for-later' interrupt teardown-handling #70