visorbus: periodic_work gripes

selltc commented 8 years ago

Source: Thomas Gleixner tglx@linutronix.de 5/18/2016:

That set of functions is obscure. Especially visor_periodic_work_stop() makes me shudder. See also #57. That work->lock does not inspire my confidence further.

Source: Thomas Gleixner tglx@linutronix.de 5/18/2016:

Why is this a rw_lock if it's only locked with write_lock? And what's the purpose of this lock at all?

See KanBoard-1250.

selltc commented 8 years ago

Note this also includes KanBoard-1248 / github #56, since it makes sense to tackle both of these at the same time.

selltc commented 8 years ago

As per suggestion by @davidker, I've been poking around at tasklets as a possible alternative to workqueues in periodic_work.c. But because there is nothing inherently periodic about a tasklet, I would need to wrap it with a kernel timer anyway. So I'm thinking that maybe I can just get by with a kernel timer with some simple logic around it.

Although the current periodic_work callback functions are assumed to be able to block/sleep (because the callbacks are protected by the semaphore (i.e., can block) visordriver_callback_lock), in fact none of them ever do. So that means they are candidates from calling via a timer callback function.

I am keeping an eye on both upstream-next and for-later as I work on these changes, because there are significant differences regarding the usages of periodic_work between those branches. There are very few real uses of periodic_work anymore (outside of visorinput) in staging-next. visorhba uses a thread (and has channel_interrupt = NULL), and visornic using NAPI polling (and also has channel_interrupt = NULL). But in for-later, we use channel_interrupt whenever we don't have an irq.

selltc commented 8 years ago

The patches on the branch upstream-next-selltc-20160519: commit 74f7ab9e99e70a1c6f48209556779bfd3f453caf, commit 1af5e9719bc18a9a7e68563e992d57af8ebec2c0, and commit 2effeaf691c89d35bdc8a497495c8c529c102e9e seem to work fine as based on upstream-next. I tested all of the drivers, and verified visorbus to be checkpatch-clean.

I next intend to try them out on a branch based on the for-later branch, as I know there are going to be a few changes.

The only thing I might need to think about a little harder is whether I need a spinlock in that new code anywhere.

davidker commented 8 years ago

Tim, don't base it on for-later copy for-later over to a new branch and then rebase it off your changes from upstream-next. If that works you are golden :) If you want I can do that for you.

selltc commented 8 years ago

I was going to copy for-later as for-later-selltc-20160520, then cherry-pick each of the 3 patches, making sure everything worked after each patch. Is that equivalent to what you are suggesting?

davidker commented 8 years ago

I was thinking you copy for-later as for-later-selltc-20160520 and then set its upstream to upstream-next-selltc-20160520(assuming name for your changes based on upstream next). It will complain that they are not synced with each other at which point you do a git pull --rebase placing the changes before the for-later branch changes.

selltc commented 8 years ago

Ok, I'll give that a shot.

davidker commented 8 years ago

The biggest gotcha is it might complain about patches not applying cleanly but that will be what we need to fix anyways so if you can clean up the conflicts great. If you have questions about conflicts let me know. And if it gets too daunting don't feel bad asking for extra eyes ;)

selltc commented 8 years ago

But if I screw up (which is about 95% certain), if I have for-later-selltc-20160520 checked out at the time, there is NO WAY I can screw up the real for-later branch, right?

davidker commented 8 years ago

The real for-later sits on my master repo. The clone of that last what I think is good sits on github. Your copy should be safe if you are working in another branch. if not, feel free to delete your local for-later and re-fetch the origin/for-later.

davidker commented 8 years ago

If you just want to post your changes for inspection that is part of my inspection process making sure it doesn't totally hose the for-later branch and I need to kick it back for help. Whatever you end up doing, I'll probably have to do something similar when I look at the patches because they will have to go in before we get out of staging.

selltc commented 8 years ago

Yeah, I'd like to try this for myself first, because I want to test the logic in for-later as well.

davidker commented 8 years ago

Yeah.. with the stuff that @TurningWrenches is working on I just lost a patch in for-later but I'm not sure where it went cause it definitely got applied.

selltc commented 8 years ago

Ok, I rebased my changes into a for-later based branch named for-later-selltc-20160520 (commit 74f7ab9e99e70a1c6f48209556779bfd3f453caf, commit 1af5e9719bc18a9a7e68563e992d57af8ebec2c0, commit 2effeaf691c89d35bdc8a497495c8c529c102e9e). But I'm having some logistical problems this morning, because I can't even get a vanilla for-later branch to work anymore. No interrupts. I can't figure what I'm doing wrong. I get the same behavior from for-later-selltc-20160520. I'll get back to this later.

davidker commented 8 years ago

I'll test for-later vanilla on a machine this morning and see what I discover. Thanks!

selltc commented 8 years ago

But here's why I think it MUST be something stupid I'm doing: I checked out for-later from git, then diffed our source files against the one's I have committed into subversion (which I KNOW I tested successfully), and they were exactly identical.

selltc commented 8 years ago

I know the code is successfully registering for interrupts, I'm just not getting any.

[ramfs /]# dmesg | grep IRQ
[    0.000000] ACPI: IRQ9 used by override.
[    0.000000] NR_IRQS:4352 nr_irqs:256 16
[    0.479933] PCI: Using ACPI for IRQ routing
[   19.361040] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[   23.031196] visornic vbus2:dev0: IRQ=26 registered
[   23.041645] visornic vbus2:dev1: IRQ=26 registered
[   23.053321] visorhba vbus2:dev2: IRQ=27 registered

[ramfs /]# cat /proc/interrupts | grep vbus
 26:          0   IO-APIC  22-fasteoi   vbus2:dev0, vbus2:dev1
 27:          0   IO-APIC  23-fasteoi   vbus2:dev2

davidker commented 8 years ago

Have we looked a the IO partition to see if we have enabled interrupts, I'm wondering if something got blown away in that case.

davidker commented 8 years ago

I'm seeing interrupts on R730-1.

selltc commented 8 years ago

Like I said, I'm having some sort of local problem here. Were you using my branch, or for-later?

The iovm Features looks ok to me, but NO interrupts have been sent. I'm going to take a break and come back to this later.

===== /proc/uissd/vhba/vbus3_dev2/info =====
VHba[2]: WWNN:3020100:0 chanptr:ffffc90004840000
     SrvState: 1, CliStateOS: 0, CliStateBoot: 1
     state:READY(1), usage:1.
     commands received:803, responses sent:803.
     interrupts received:0, interrupts sent:0.
     max queued ios:1.
     queue_empty_cnt:2118948.
     total_wakeup_cnt:2118948.
     non_empty_wakeup_cnt:803.
     Features: 0x37
     SCSI Host Max Counts[max_channel=1,max_id=8,max_lun=16,cmd_per_lun=1,max_io_size=258048].
     cmdQ NumSignalsReceived :803, NumInterruptsReceived :0, NumSignalsSent:803.
     cmdQ NumOverflows :0, MaxSignalSlots:65.
     cmdQ Head: 0x17, Tail: 0x17
     cndQ FeatureFlags 0x0,
     rspQ NumSignalsReceived :799, NumInterruptsReceived :0
     rspQ NumSignalsSent :803, NumEmptyCnt:0.
     rspQ NumOverflows :0, MaxSignalSlots:65.
     rspQ Head: 0x17, Tail: 0x13
     rspQ FeatureFlags 0x1,
     vhba ios_outstanding:0.
     commands not forwarded:65.
     useG2GCopy:0.
     commands enqueued:803.
     commands dequeued:803.
     vhba sendInterruptHandle:0x3002f
     vhba recvInterruptHandle:0x2f
     vhba recvInterruptVector:0x0
     vhba recvInterruptShared:0x0
     sent_extra_ints_cnt= 0
     used_g2g_copy_cnt= 735, NOT_used_g2g_copy_cnt= 0
     max io size rcvd:131072.
     max_pages_needed:32.
     pages_needed_cnt[1]:65.
     pages_needed_cnt[4]:18.
     pages_needed_cnt[8]:2.
     pages_needed_cnt[16]:626.
     pages_needed_cnt[32]:23.
     max_local_sg_pagecount:0.
     taskmgmt error_count:3.
     VDisk[0]:0:1:1 <<spar1_lin                       >>.
                 pdisk:0xffff880001fbccc0.
                 state:READY(1), usage:1.
                 vdisk ios_outstanding:0, max_ios_outstanding:1.
                 error_count:0.
                 useG2GCopy:1.

#define ULTRA_IO_DRIVER_ENABLES_INTS (0x1ULL << 1)
#define ULTRA_IO_CHANNEL_IS_POLLING (0x1ULL << 3)
#define ULTRA_IO_IOVM_IS_OK_WITH_DRIVER_DISABLING_INTS (0x1ULL << 4)
#define ULTRA_IO_DRIVER_DISABLES_INTS (0x1ULL << 5)
#define ULTRA_IO_DRIVER_SUPPORTS_ENHANCED_RCVBUF_CHECKING (0x1ULL << 6)

davidker commented 8 years ago

I was using for-later. I'm going to grab your branch and start doing maintainer stuff with it and see what comes of things.

selltc commented 8 years ago

Interesting... I just did a "make distclean" in my kernel tree and re-built everything, and now for-later works perfectly.

for-later-selltc-20160520 still has a problem with visorinput. But visorbus, visorhba, and visornic seem to work fine. I'll take that!

On to looking at visorinput...

davidker commented 8 years ago

Tim, I pulled your code into my upstream-next local branch and then applied for-later on top of that. I ran into a couple of conflicts that I need to fix up before I can test it. But it applied fairly cleanly.

selltc commented 8 years ago

Thanks, but keep in mind there is a bug involving visorinput when my stuff is on for-later. I DO plan on fixing that tomorrow, though. Just FYI (and FMI), here is the ugliness I got from visorinput when I tried to use the mouse in the GUI:

[   83.114720] kernel BUG at kernel/time/timer.c:966!
[   83.119507] invalid opcode: 0000 [#1] PREEMPT SMP
[   83.124348] Modules linked in: ipv6 vfat fat acpi_cpufreq button pcspkr efivars evdev joydev processor sg sd_mod visorhba(C) visorinput(C) scsi_mod visornic(C) visorbus(C) ext4 jbd2 crc16 ext2 mbce
[   83.142530] CPU: 0 PID: 1516 Comm: X Tainted: G         C      4.6.0-rc7-ARCH+ #1
[   83.150003] Hardware name: Dell Inc. PowerEdge T110/ , BIOS 1.23 12/15/2009
[   83.156960] task: ffff880011505c00 ti: ffff880011424000 task.ti: ffff880011424000
[   83.164436] RIP: 0010:[<ffffffff810bb1bb>]  [<ffffffff810bb1bb>] add_timer+0x1b/0x20
[   83.172193] RSP: 0018:ffff880011427be0  EFLAGS: 00010286
[   83.177497] RAX: 00000000fffef6fe RBX: ffff88000bfe9c00 RCX: 0000000000000000
[   83.184619] RDX: 0000000000000024 RSI: ffff88001b00cb58 RDI: ffff88000bfe9ee8
[   83.191744] RBP: ffff880011427bf0 R08: 0000000000000030 R09: 0000000000000000
[   83.198866] R10: 00000000000013c4 R11: 00000000000013c4 R12: 0000000000000000
[   83.205991] R13: ffff880011011c08 R14: ffff88001e1df218 R15: ffff880011011c80
[   83.213122] FS:  00007ff019808880(0000) GS:ffff88001b000000(0000) knlGS:0000000000000000
[   83.221201] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   83.226945] CR2: 00007ff019821000 CR3: 000000001e04d000 CR4: 00000000000006f0
[   83.234069] Stack:
[   83.236078]  ffffffffa00ef5ea ffff880000423c00 ffff880011427c08 ffffffffa0145838
[   83.243531]  ffff88001e1df000 ffff880011427c38 ffffffff81359969 ffff880011011c00
[   83.250983]  ffff88001e034a08 ffff88001101f000 ffff880011011f70 ffff880011427c80
[   83.258437] Call Trace:
[   83.260887]  [<ffffffffa00ef5ea>] ? visorbus_enable_channel_interrupts+0x4a/0x70 [visorbus]
[   83.269228]  [<ffffffffa0145838>] visorinput_open+0x28/0x80 [visorinput]
[   83.275931]  [<ffffffff81359969>] input_open_device+0x79/0xa0
[   83.281678]  [<ffffffffa018b0ba>] evdev_open+0x1aa/0x1f0 [evdev]
[   83.287686]  [<ffffffff811972df>] chrdev_open+0x9f/0x1c0
[   83.292997]  [<ffffffff81197240>] ? cdev_put+0x30/0x30
[   83.298128]  [<ffffffff811901e9>] do_dentry_open+0x1c9/0x2f0
[   83.303786]  [<ffffffff811913b8>] vfs_open+0x58/0x60
[   83.308743]  [<ffffffff811a0ac4>] path_openat+0x374/0x11d0
[   83.314220]  [<ffffffff8112ef02>] ? generic_perform_write+0x132/0x1a0
[   83.320650]  [<ffffffff811a295e>] do_filp_open+0x7e/0xe0
[   83.325962]  [<ffffffff81191cca>] ? __vfs_write+0xaa/0xe0
[   83.331352]  [<ffffffff811aee9c>] ? __alloc_fd+0xbc/0x170
[   83.336741]  [<ffffffff811916fa>] do_sys_open+0x11a/0x1f0
[   83.342143]  [<ffffffff811917ee>] SyS_open+0x1e/0x20
[   83.347109]  [<ffffffff81461a9b>] entry_SYSCALL_64_fastpath+0x13/0x8f
[   83.353545] Code: ef e9 67 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 83 7f 08 00 75 0f 55 48 8b 77 10 48 89 e5 e8 d7 fd ff ff 5d c3 <0f> 0b 0f 1f 00 66 66 66 66 90 55 48 89 e5 41 57
[   83.373493] RIP  [<ffffffff810bb1bb>] add_timer+0x1b/0x20
[   83.378910]  RSP <ffff880011427be0>
[   83.384412] ---[ end trace 35345b78aa8d66fb ]---

Because it is in timer.c, I'm quite certain this is caused by something I just changed.

selltc commented 8 years ago

So regarding my "local problem" above, it turns out that it did NOT really go away. It turned out that whenever I used ANY code that required interrupts, I could only boot up successfully about 1 in every 3 tries. commit 36e54e6d60513257986b43fd78e8267df3b857d7 is what I needed to fix the problem. @davidker, I think you should just merge this into your previous for-later commit c65d2a5cb9ee71e93f56012590869334e2e4e372 (staging: unisys: set client state).

selltc commented 8 years ago

commit fb7fa961d73b32abda8f8b59f247541ab63f88e8 fixes another for-later specific bug revealed by the changes introduced to address this github issue, which should be merged with the for-later commit a5e17667ef40901f3efafd25f1d451e3f8f7d8ce. I will wrap all this up and put a bow on it next...

selltc commented 8 years ago

`upstream-next` patches for this issue

These are on my upstream-next-selltc-20160519 branch, as

commit 74f7ab9e99e70a1c6f48209556779bfd3f453caf - staging: unisys: visorbus: removed unused periodic_test_workqueue
commit 1af5e9719bc18a9a7e68563e992d57af8ebec2c0 - staging: unisys: visorbus: use kernel timer instead of workqueue
commit 2effeaf691c89d35bdc8a497495c8c529c102e9e - staging: unisys: visorbus: remove periodic_work.h/.c

`for-later` patches for this issue

Ok, everything is working now in my for-later-based branch for-later-selltc-20160522. @davidker, as you requested, this contains upstream-next on the bottom, then my 3 fix patches, then all of the other for-later patches on top. This includes:

The original patches required to address this github issue (which are also destined for staging-next):
- commit 74f7ab9e99e70a1c6f48209556779bfd3f453caf - staging: unisys: visorbus: removed unused periodic_test_workqueue
- commit 1af5e9719bc18a9a7e68563e992d57af8ebec2c0 - staging: unisys: visorbus: use kernel timer instead of workqueue
- commit 2effeaf691c89d35bdc8a497495c8c529c102e9e - staging: unisys: visorbus: remove periodic_work.h/.c
A tweaked version of commit c4500651be3e1b03fa627d46f591be590209c089 ("staging: unisys: set client state") that also includes my fix in commit 36e54e6d60513257986b43fd78e8267df3b857d7 ("staging: unisys: visorbus: fix race condition involving interrupt enable").
Tweaks to commit 0b24ff5a0636499cf3727d8862f042cfe79a42f1 ("staging: unisys: Re-enable interrupts after we have done the work") and commit 473f90cf2bb9a515b5ba4aee587ab92244af8f5c ("staging: unisys: Capture data from device create to register interrupt.") that were necessary due to changes introduced by my original 3 patches from the first bullet.
A tweak to commit 473f90cf2bb9a515b5ba4aee587ab92244af8f5c ("staging: unisys: Capture data from device create to register interrupt.") that also include my fix in commit fb7fa961d73b32abda8f8b59f247541ab63f88e8 ("staging: unisys: visorinput: fix bug revealed by commit 1af5e97").

cover letter

[PATCH 0/3] simplify visorbus by using kernel timer instead of periodic_work wq

This patch converts visorbus to use a kernel timer for periodic device-specific
callbacks instead of a workqueue, making the implementation in periodic_work.c
and periodic_work.h no longer necessary.  These files are then deleted.

selltc commented 8 years ago

@davidker pointed out a problem with my patches via email:

Are we okay with what visorinput is doing in its channel interrupt routine to run in atomic context? I thought at one time is tried to sleep in its interrupt routine cause it knew it could on a workqueue. Anyways I'm wondering if we should change it to a tasklet at some point.

Shoot... you're correct David. In for-later, it's ok because the channel_interrupt() in visorinput uses a spinlock, but in upstream-next it still uses a semaphore. Dag it. I will have to fix that. (I can't WAIT until we can just worry with ONE code stream!)

But I don't see any code in visorinput that will cause it to sleep in its interrupt routine. So I think we are ok there.

selltc commented 8 years ago

@mriswyth pointed out a problem with my patches via email:

Prior to "use kernel timer instead of workqueue" (commit 769107d9ba35ffbb3bf1200207e55010a378e2b5) I read the code as visordriver_callback_lock was NEEDED to ensure that drv->probe(), drv->remove(), and drv->channel_interrupt() were never invoked concurrently. With the change in "use kernel timer instead of workqueue" I now read the code as drv->probe() or drv->remove() may run concurrently with drv->channel_interrupt(), but still not with each other. I'm wondering what in the change allows this possible concurrency to be safe now.

We NEEDED the lock before. Now we CAN'T HAVE the sleeping lock mechanism. But why don't we still NEED the lock?

I believe the answer is that we now rely on the function drivers (visorinput, visorhba, visornic) to ensure correct locking between _interrupt(), _probe(), and _remove(). If you are ever are about to enter code where you need to guarantee that you NOT running concurrently with _interrupt(), then you need to call visorbus_disable_channel_interrupts(), which will:

wait for any running instance of _interrupt() to complete, then,
prevent future instances of _interrupt() from happening This is much like "real" interrupts.

This logic seems ok for visorhba and visorinput (as I look at for-later), but may not be quite right in visornic. I'm not quite sure how to test this path adequately, so I'm a bit afraid to touch it. (I know... bad excuse.)

That being said, I believe I need to harden my "periodic_work --> kernel timer" patch a bit to protect against incorrect or redundant usages of dev_start_periodic_work() and dev_stop_periodic_work(), like this:

--- a/drivers/staging/unisys/include/visorbus.h
+++ b/drivers/staging/unisys/include/visorbus.h
@@ -159,6 +159,7 @@ struct visor_device {
        struct device device;
        struct list_head list_all;
        struct timer_list timer;
+       bool timer_active;
        bool being_removed;
        /* mutex to serialize visor_driver function callbacks */
        struct mutex visordriver_callback_lock;
--- a/drivers/staging/unisys/visorbus/visorbus_main.c
+++ b/drivers/staging/unisys/visorbus/visorbus_main.c
@@ -540,18 +540,22 @@ dev_periodic_work(unsigned long __opaque)
 static void
 dev_start_periodic_work(struct visor_device *dev)
 {
-       if (dev->being_removed)
+       if (dev->being_removed || dev->timer_active)
                return;
        /* now up by at least 2 */
        get_device(&dev->device);
        dev->timer.expires = jiffies + POLLJIFFIES_NORMALCHANNEL;
        add_timer(&dev->timer);
+       dev->timer_active = true;
 }

 static void
 dev_stop_periodic_work(struct visor_device *dev)
 {
+       if (!dev->timer_active)
+               return;
        del_timer_sync(&dev->timer);
+       dev->timer_active = false;
        put_device(&dev->device);
 }

selltc commented 8 years ago

I also just realized why I wasn't seeing kernel diagnostic messages when I attempted to call a blocking operation like down_write() from an atomic context (which I have been doing in visorinput.c due to a bug in my patch): I forgot to set this in my .config:

CONFIG_DEBUG_ATOMIC_SLEEP=y

See might_sleep().

Once I built a kernel like that, I could see clearly my bug in visorinput:

BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:48
in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C      4.6.0-ARCH+ #3
Hardware name: Dell Inc. PowerEdge T110/ , BIOS 1.23 12/15/2009
 0000000000000000 ffff88001b003dd0 ffffffff8126644b ffffffff8180b500
 0000000000000030 ffff88001b003de8 ffffffff81080d8a ffffffff8174951e
 ffff88001b003e10 ffffffff81080e09 ffff88001d977008 ffff88001d09e000
Call Trace:
 <IRQ>  [<ffffffff8126644b>] dump_stack+0x4d/0x72
 [<ffffffff81080d8a>] ___might_sleep+0xda/0x110
 [<ffffffff81080e09>] __might_sleep+0x49/0x80
 [<ffffffff81470e20>] down_write+0x20/0x50
 [<ffffffffa014c111>] visorinput_channel_interrupt+0x31/0x2c0 [visorinput]
 [<ffffffffa00ef870>] ? chipset_device_create+0x2a0/0x2a0 [visorbus]
 [<ffffffffa00ef89f>] dev_periodic_work+0x2f/0x50 [visorbus]
 [<ffffffff810bd31a>] call_timer_fn+0x3a/0x160
 [<ffffffffa00ef870>] ? chipset_device_create+0x2a0/0x2a0 [visorbus]
 [<ffffffff810bd607>] run_timer_softirq+0x1c7/0x2d0
 [<ffffffff814752d2>] __do_softirq+0x92/0x2be
 [<ffffffff81061705>] irq_exit+0x75/0x90
 [<ffffffff81475102>] smp_apic_timer_interrupt+0x42/0x50
 [<ffffffff814737cf>] apic_timer_interrupt+0x7f/0x90
 <EOI>  [<ffffffff81026080>] ? default_idle+0x20/0x100
 [<ffffffff8102691f>] arch_cpu_idle+0xf/0x20
 [<ffffffff8109802a>] default_idle_call+0x2a/0x30
 [<ffffffff810982f4>] cpu_startup_entry+0x2c4/0x320
 [<ffffffff81467def>] rest_init+0x7f/0x90
 [<ffffffff818b2ea3>] start_kernel+0x3fa/0x407
 [<ffffffff818b2434>] x86_64_start_reservations+0x2f/0x31
 [<ffffffff818b2520>] x86_64_start_kernel+0xea/0xed

Lesson for developers: Always build your kernel with CONFIG_DEBUG_ATOMIC_SLEEP=y.

selltc commented 8 years ago

Ok, I have a first working draft for the changes to my patches I made to address issues indicated by @mriswyth and @davidker. Currently, these are just based on staging-next, since upstream-next is now completely different. They are in my branch staging-next-selltc-20160524:

commit 24905cfef6fd950a3a2dc9f46ca319c27db914c7 (staging: unisys: visorbus: removed unused periodic_test_workqueue) same as original patch
commit 726b41175e4908d057963da211da7dab49ae588b (staging: unisys: visorinput: remove unnecessary locking) new patch
commit 79ebec42c9f277a88d99e444ff63e2831be015ee (staging: unisys: visorbus: use kernel timer instead of workqueue) revised patch
commit 703e5bdc13abf1544758f207ee0b9fa2311d82d4 (staging: unisys: visorbus: remove periodic_work.h/.c) same as original patch

davidker commented 8 years ago

Thanks Tim,

Let me merge them into upstream-next. I should have something soon that we can look at again.

davidker commented 8 years ago

Tim, can you look at https://github.com/davidker/unisys/tree/earlylook-upstream-next-new and see if it looks okay? I'm compiling it now and will be basing the rest of the code off of it when you give the OK.

selltc commented 8 years ago

That looks perfect to me. Thanks!

selltc commented 8 years ago

I also smoke-tested earlylook-upstream-next-new in my environment, and that went well too.

selltc commented 8 years ago

@davidker, is it your belief that for-later is up-to-date now?

Reason is, I noticed that the locking stuff my commit 726b41175e4908d057963da211da7dab49ae588b (staging: unisys: visorinput: remove unnecessary locking) removes, seems to get mistakenly put back into for-later in commit 3c3e8203b99c20257113fd09cfc1d3159c2fe2d8 (staging: unisys: visorinput: use spinlock for channel_interrupt() locking). Yes, I realize these are different types of locks, but the intent of commit 726b41175e4908d057963da211da7dab49ae588b is to remove the need for any type of locking there.

Just let me know; I feel comfortable enough now with git rebase that I can create my own version of for-later to look like what I think it should. (But no worries -- I will NOT touch the real for-later.)

davidker commented 8 years ago

Yeah, I probably screwed that up for the commit. It was fresh as of 10 am this morning (at least I thought it was).

davidker commented 8 years ago

Tim, the easiest for me is if you can point me to a patch that you want me to squash with that patch in for-later.

selltc commented 8 years ago

Actually, the locking in the for-later patch commit 3c3e8203b99c20257113fd09cfc1d3159c2fe2d8 is now going to require some more thought with the new commit 726b41175e4908d057963da211da7dab49ae588b beneath it, so none of this is your fault. That visorinput patch is truly a thorn in my side; I think the time has come for me to find a better solution.

davidker commented 8 years ago

I have often started down the path of converting visorinput into using a tasklet like visorhba does. And its ISR just schedules the tasklet which then gets run on a kernel thread that can sleep. I kept running into problems you discovered with the rest of for-later which just got me confused.

selltc commented 8 years ago

I don't understand how that would help me. The initial problem of why I couldn't do the mouse resolution change inline had nothing to do with whether I was allowed to sleep or not, and everything to do with deadlocking, i.e.:

/*
 * we can NOT handle this inline, because this may go
 * thru a close() path, which will attempt to stop the
 * worker thread we are running on, which will end up
 * deadlocking
 */

In other words, there was a problem even when I was running in process context and allowed to sleep. The fact that I needed to close() and re-open() the mouse device just to change the resolution made my deadlock/nightmare. This is what I want to look for a cleaner way to do now. If there is NOT a cleaner way, I think I'm just not going to worry about mouse resolution changes, since for practical purposes we are always 1024x768 anyway.

But I'm not sure I agree with you that tasklets can block. Tasklets are implemented via softirq. From http://stackoverflow.com/questions/7135915/which-context-are-softirq-and-tasklet-in:

Technically, softirq's do run in an interrupt context - the "softirq" context; it's just that it's not "hard-irq" context (which is the context when a hardware interrupt occurs).

So, in a softirq handler, in terms of the 'lookup' macros Linux provides:

in_interrupt: yes | in_irq: no | in_softirq: yes | in_serving_softirq: yes

But be aware (beware!!! :): "all of the restrictions that apply to interrupt handlers also apply to bottom halves. Thus, bottom halves cannot sleep, cannot access user space, and cannot invoke the scheduler." -- LDD3.

I have also found statements like this regarding "when should I use workqueues versus taskets?":

Normally, there is little decision between work queues or sotftirqs/tasklets. If the deferred work need to sleep, work queues are used. If the deferred work need not sleep, softirqs or tasklets are used.

But I will read more.

davidker commented 8 years ago

Okay, I'm confused then on why have the deadlock though if during close you disable your interrupts. Doesn't that then mean you can't run through your code? And on the open you enable your interrupts and you run through your code. I'm missing something here.

selltc commented 8 years ago

why have the deadlock though if during close you disable your interrupts

That's exactly the problem (I know I'm using a kernel timer instead of a workqueue now, but because a workqueue can block it is easier to illustrate the crux of the problem that way):

workqueue thread calls visorinput_channel_interrupt()
visorinput_channel_interrupt() calls inputdev->close()
inputdev->close() calls visorbus_disable_channel_interrupts()
visorbus_disable_channel_interrupts() calls cancel_delayed_work_sync() to stop the workqueue
DEADLOCK, because you are currently running a workitem on the workqueue you want to cancel

davidker commented 8 years ago

Yeah.. but in that scenario.. channel_interrupt schedules a tasklet/workqueue. The tasklet/workqueue calls inputdev->close().a tasklet. Close can now call disable_channel_interrupts, because it is no longer on the same thread it was scheduled to a different thread. That should fix the deadlock.

selltc commented 8 years ago

Yes, that's the way I fixed the code to work in for-later.

The reason I explained it this way is because I THOUGHT you were suggesting initially above that if we changed channel_interrupt() to be called from someplace where it could block (e.g., workqueue), that because we could block, there would be no need to use the extra workqueue in channel_interrupt(). So I was explaining why it was NOT that simple. Sorry, I guess we weren't on the same page.

The reason I would like to get rid of this is because it is an extremely complicated, and likely still-racy section of code. We need a complete additional execution context (a workqueue), with associated locking, just to handle a theoretical scenario which never happens in practice (changing screen resolutions in the pd to be different than the 1024x768 resolution we initially provide via the guest EFI framebuffer).

davidker commented 8 years ago

I guess I'm confused though, because both visornic and visorhba have an ISR routine that solely just schedules other work to be done on another queue and frees up the interrupt queue/timer. So the changes in for-later just make visorinput more like visornic and visorhba. It might be more complicated but now all the drivers would follow the same pattern even if they have "real" interrupts or not.

selltc commented 8 years ago

Agreed. But I'm not sure an "input" maintainer, who is only concerned about visorinput, would see it the same way. They would just want visorinput.c to be as simple as possible.

visornic doesn't seem to use a tasklet; it uses napi_schedule(), but it's the same concept. (napi_schedule() just might result in the callback getting called multiple times, depending upon the budget and how much work there is to do.) I think the napi poll() function (e.g., visornic_poll()) is called in atomic context, but darn if I can actually PROVE that. (I at least convinced myself that our code will never block in that path.) I.e., I think napi poll() runs in the same context as a tasklet, which is the same context as a kernel timer.

visorhba uses a tasklet, via a tasklet_schedule() in visorhba_isr(). Of course, this just results in process_incoming_rsps to get called asynchronously; that also appears to run in atomic context and does NOT block, differing from isr context only in that interrupts would be enabled in the tasklet function.

The advantage in visornic and visorhba of doing their real work via napi poll or tasklet is simply because we would be doing this work with interrupts disabled if we did it inline in the interrupt function, because they are using real interrupts.

So yeah, visorinput could be made to act similarly by adding a tasklet to it, then scheduling it to run within visorinput_interrupt(). But the tasklet callback function will STILL have to schedule work on a workqueue if we want to handle the mouse change resolution case, to prevent the deadlock scenario I describe above.

I hope I'm making some sense.

selltc commented 8 years ago

To summarize:

I can totally see the value moving all the normal interrupt work from visorchannel_interrupt() into a tasklet. That's only a few lines of code. That would simply prepare visorinput to behave politely (i.e., NOT do most of its processing work with irqs disabled) if we ever did get real interrupts in the keyboard/mouse path. I think I could argue that case in for-later.

What I really want to OMIT in for-later is commit 975e88fc6bb1751994b910cc3ca89d08f5b890bb (staging: unisys: visorinput: respond to resolution changes on-the-fly) and commit 30a5c6cae5cab53f01f06d20eda186cfa4112a42 (staging: unisys: visorinput: sanity check resolution changes). That would also make commit 9c655c7735ad86840a968885014578b5f9c5d27a (staging: unisys: visorinput: use kref ref-counting for device data struct) unnecessary.

davidker commented 8 years ago

What about switching the lock to a spinlock, especially if we don't think it will be done that often?

davidker commented 8 years ago

Or what if we schedule all the work in visorinput_channel_interrupt to run on a workqueue so we can lock on it.

davidker / unisys