visorinput: convert lock_visor_dev to a mutex

mriswyth commented 8 years ago

This may require some more thinking, but I noticed it while working on issue #55.

Community feedback resulted in issues #55 and #56 that on the surface seem like they would apply to lock_visor_dev as well since the write portion of the rw_semaphore is a binary semaphore/mutex and the semaphore is only the write semaphore calls are made.

selltc commented 8 years ago

I think this is indeed a valid issue. Thanks Bryan.

arfvidsonUIS commented 8 years ago

proposed fix pushed to branch githubissue#62. Will be testing branch. @davidker

selltc commented 8 years ago

FYI: This was also pointed out by Thomas Gleixner tglx@linutronix.de during his comments from 6/1.

selltc commented 8 years ago

Because tglx pointed this issue out when we attempted to push my fix for github #58 (in particular, commit 726b41175e4908d057963da211da7dab49ae588b), I developed a version of this patch that fits on out-of-staging-visorbus-v2, and pushed it as commit a7642b676b0393ab6a05eefe9e99fa9a12ae672d. This compiles clean, but I haven't had a chance to regression test it yet. I will do that tonight.

Of course there is a new version of checkpatch now, reporting new errors in visorinput, but they appear to be all false-positives, and are unrelated to my patch.

selltc commented 8 years ago

Oh no. Commit a7642b676b0393ab6a05eefe9e99fa9a12ae672d does NOT work. System won't boot with it. Reason is, changing to a mutex apparently reveals a latent defect (that I have fixed in for-later with commit dfa68cfa40a9af4783cad2306bcf5f45e1efe89c - "staging: unisys: visorinput: ensure proper locking wrt creation & ints" for github #58 in for-later), involving the lock being used before it is initialized. I.e., apparently we were "forgiven" for this mistake when it was a semaphore, but are NOT so lucky when converted to a mutex.

I'm getting dizzy from all these up-in-the-air patches.

I'll figure out how to fit this into our current patchdeck... it will probably mean we have to include both commit dfa68cfa40a9af4783cad2306bcf5f45e1efe89c followed by a modified version of commit a7642b676b0393ab6a05eefe9e99fa9a12ae672d.

selltc commented 8 years ago

OK... my 2 patches on out-of-staging-visorbus-v2-selltc-20160602 (based on out-of-staging-visorbus-v2 as of a few minutes ago) work ok:

commit 039e6e517b4a17e2d135a9df85cc1e24a39c2670 - staging: unisys: visorinput: ensure proper locking wrt creation & ints
commit 6f57ed62ae206c23c58ce4a016b08e15639ce9af - staging: unisys: visorinput: make lock_visor_dev a mutex

These patches regression-tested clean, and are checkpatch clean.

So if you could add the above 2 patches for your v3 submission, that would address tglx's 6/1 review comments for visorinput.

selltc commented 8 years ago

We decided NOT to include the above 2 fixes in the v3 patchset, so I sent the following explanation to tglx on 6/3:

+   down_write(&devdata->lock_visor_dev);
Why is this a rw_semaphore? It's only ever taken with down_write() and it's always the same context. Should be a mutex, right?

Correct. We have a local patch that addresses this, but would like to submit this via a follow-on patchset if possible. I'll explain.

Rationale: our intent for this patchset was to focus on the visorbus driver ONLY. The only reason visorinput got involved in the first place was due to the visorbus change that necessitated that we remove the locking from visorinput_channel_interrupt(), due to that now being called from atomic context.

If the semaphore --> mutex change would have been as simple as it sounds, we would have had NO problem including it with the next version (v3) of this patchset. But unfortunately, this change uncovered a latent defect, which necessitated yet another patch. (I know... hard to believe that something this simple would do that, but it did.) Rather than further complicating this patchset, we thought it would be better to address the visorinput issues via a separate follow-on patchset.

Is that acceptable for you?

While at it, please convert the notifier_lock to a mutex as well.

Thanks. Since this is visorbus-specific, we DO plan to address this in v3 of this patchset, which will most-likely just be REMOVING notifier_lock altogether. (Note, this is actually part of the github #69 fix).

davidker commented 8 years ago

@selltc, Tim do you have a cover-letter you want sent?

selltc commented 8 years ago

I'll have a cover letter in a few minutes. But it also just occurred to me that I have not actually run these patches independent of the patches we already have in-flight for visorbus, so I will need to do that if we plan to submit before the in-flight stream is accepted.

davidker commented 8 years ago

They require the patches in flight, they do no apply to greg's tree. So I will either need to make a v5 of the series and add them or get tglx approval to do them later.

selltc commented 8 years ago

It is my impression that tglx would be ok with us doing them later, as his most-recent suggestion was "please fix before visorinput is moved out of staging". That's my vote.

selltc commented 8 years ago

Proposed cover letter:

fix race condition in visorinput dev init, convert semaphore to mutex

This patchset fixes a race condition in visorinput device creation that
would allow the associated input device to be opened and interrupts to
be enabled before we finished initializing the device structure.  We
discovered this race condition when we attempted to convert
visorinput's semaphore to a mutex, which revealed (via a crash) that we
were attempting to access the lock in the visorinput device structure
before it was actually initialized.

The 2nd patch in the series converts visorinput's semaphore to the
more-appropriate mutex.

selltc commented 8 years ago

8/15 - Greg committed these patches into staging-next with:

commit 24ce1b66619c499f5913eac5debf9170f7a9dcc3 - staging: unisys: visorinput: ensure proper locking wrt creation & ints
commit 8dfb8fdbd2d1dd4facef6eef24a26b6bc1b33191 - staging: unisys: visorinput: make lock_visor_dev a mutex

davidker / unisys

visorinput: convert lock_visor_dev to a mutex #62