lukechampine / us

An alternative interface to Sia
MIT License
55 stars 4 forks source link

Context based cancellation of acquiring a session #164

Closed jkawamoto closed 3 years ago

jkawamoto commented 3 years ago

This PR fixes the issue https://github.com/lukechampine/us/issues/106#issuecomment-817186073.

lukechampine commented 3 years ago

Thanks -- implemented in https://github.com/lukechampine/us/commit/04317efb0d804ea0ddb69bbdba2bce66d25d2b62

jkawamoto commented 3 years ago

Does 04317ef fix the problem? It looks like it still blocks if the context is cancelled while reconnecting to the host: https://github.com/lukechampine/us/commit/04317efb0d804ea0ddb69bbdba2bce66d25d2b62#r49426524.

lukechampine commented 3 years ago

Ah, you're right, my bad. I don't think there's a very clean way of adding ctx to reconnect, so I might have to spawn a goroutine after all. :(

lukechampine commented 3 years ago

Okay, think I fixed it in 4c410b4. The behavior is not ideal; the host can remain locked even after ctx is canceled. But it's the best we can do without plumbing ctx through a bunch of other code.

jkawamoto commented 3 years ago

I'm afraid it won't work because if reconnect succeeds after ctx gets cancelled, no one unlocks ls.mu. The goroutine needs to check if the context is cancelled and then release the lock like this:

https://github.com/lukechampine/us/pull/164/files#diff-a2da222a378a564bf5db4896d7a681980e1838f5a16b173687df2028d9e3eb68R34-R36

jkawamoto commented 3 years ago

tryAcquire can also block if the context is cancelled while reconnecting to the host in that function, and we need tryAcquireCtx that runs the same goroutine acquireCtx has.

lukechampine commented 3 years ago

oof. Yeah. Clearly I didn't think this through deeply enough. Your implementation was fine after all, so I'm just going to revert to that. Sorry about that.

lukechampine commented 3 years ago

I think there's still a race:

Will have to think a bit more about this.

lukechampine commented 3 years ago

d23ebc4 should fix the race. 🤞🏻 Both goroutines now select on the same two channels, and the channel is unbuffered. So either:

jkawamoto commented 3 years ago

Thanks for fixing it! But, we have to avoid result parameters because it causes data races:

==================
WARNING: DATA RACE
Write at 0x00c00042e010 by goroutine 16:
  lukechampine.com/us/renter/renterutil.acquireCtx()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:44 +0x3ae
  lukechampine.com/us/renter/renterutil.ParallelChunkDownloader.DownloadChunk.func1()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:683 +0x34e

Previous write at 0x00c00042e010 by goroutine 20:
  lukechampine.com/us/renter/renterutil.acquireCtx.func1()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:30 +0xa4

Goroutine 16 (running) created at:
  lukechampine.com/us/renter/renterutil.ParallelChunkDownloader.DownloadChunk()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:674 +0x671
  lukechampine.com/us/renter/renterutil.ParallelBlobDownloader.DownloadBlob.func1()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:1160 +0x261

Goroutine 20 (running) created at:
  lukechampine.com/us/renter/renterutil.acquireCtx()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:29 +0x2ab
  lukechampine.com/us/renter/renterutil.ParallelChunkDownloader.DownloadChunk.func1()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:683 +0x34e
==================
==================
WARNING: DATA RACE
Write at 0x00c000128050 by goroutine 16:
  lukechampine.com/us/renter/renterutil.acquireCtx()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:44 +0x3db
  lukechampine.com/us/renter/renterutil.ParallelChunkDownloader.DownloadChunk.func1()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:683 +0x34e

Previous write at 0x00c000128050 by goroutine 20:
  lukechampine.com/us/renter/renterutil.acquireCtx.func1()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:30 +0xd5

Goroutine 16 (running) created at:
  lukechampine.com/us/renter/renterutil.ParallelChunkDownloader.DownloadChunk()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:674 +0x671
  lukechampine.com/us/renter/renterutil.ParallelBlobDownloader.DownloadBlob.func1()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:1160 +0x261

Goroutine 20 (running) created at:
  lukechampine.com/us/renter/renterutil.acquireCtx()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:29 +0x2ab
  lukechampine.com/us/renter/renterutil.ParallelChunkDownloader.DownloadChunk.func1()
      /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/strategies.go:683 +0x34e
==================
    testing.go:1092: race detected during execution of test
jkawamoto commented 3 years ago

More precisely, both L30 and L44 can update sess at the same time.

lukechampine commented 3 years ago

oh yeah, I forgot that technically a return will assign to the result params. Jeez. This is what I get for not running tests. Well, 6th time's the charm! a0f7724