blugelabs / bluge

indexing library for Go
Apache License 2.0
1.9k stars 124 forks source link

Fix possible runtime panic on DecRef call #34

Closed michaeljs1990 closed 4 years ago

michaeljs1990 commented 4 years ago

The following error was hit when porting from bleve to bluge

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x17000ce]

goroutine 16 [running]:
github.com/blugelabs/bluge/index.(*closeOnLastRefCounter).DecRef(0xc0004b59e0, 0x10ca11a, 0x8000000000000000)
    /gocode/pkg/mod/github.com/blugelabs/bluge@v0.1.3/index/segment_plugin.go:113 +0xae
github.com/blugelabs/bluge/index.(*Snapshot).decRef(0xc0005a0780, 0x3, 0x8ffd3)
    /gocode/pkg/mod/github.com/blugelabs/bluge@v0.1.3/index/snapshot.go:77 +0xb3
github.com/blugelabs/bluge/index.(*Snapshot).Close(...)
    /gocode/pkg/mod/github.com/blugelabs/bluge@v0.1.3/index/snapshot.go:89
github.com/blugelabs/bluge/index.(*Writer).mergerLoop(0xc000069200, 0xc000606660, 0xc0000402a0)
    /gocode/pkg/mod/github.com/blugelabs/bluge@v0.1.3/index/merge.go:75 +0x4c5
created by github.com/blugelabs/bluge/index.OpenWriter
    /gocode/pkg/mod/github.com/blugelabs/bluge@v0.1.3/index/writer.go:131 +0x8cd

It seems like might be possible that https://github.com/blugelabs/bluge/blob/fe1f453e701a72cb3ee01b13b8a5e5f6e2b0f6cc/index/writer.go#L523 could throw a panic also as I don't see any other place segmentWrapper is initialized. It looks like a path exists where err is nil and the returned closer also is.

I am using the bluge.InMemoryOnlyConfig.

michaeljs1990 commented 4 years ago

Based on some naive printf debugging a panic was also possible in writer.go.

mschoch commented 4 years ago

Thanks for reporting this. I think this results from the fact that when we load segments from an in-memory directory, there is no meaningful close operation, and we return nil. Then as you found, elsewhere in the code we expect the closer to be non-nill.

I think this fix is good, but want to review it a bit more before merging.

In the meantime, can you push another commit, adding an appropriate entry to the AUTHORS file for copyright purposes?

mschoch commented 4 years ago

So, looking at this a bit more, I realize there are several paths to be considered (every place that calls Load() on a directory). This include snapshots, as well as segments, and also places like the OfflineWriter.

So now I'm wondering if we should update the contract of the Directory interface to always return a non-nil closer. And we can simply have a singleton noopCloser that does nothing that everyone can return if needed. That will reduce the burden on all consumers of the Directory, as they can simply call Close() without additional checks.

michaeljs1990 commented 4 years ago

So, looking at this a bit more, I realize there are several paths to be considered (every place that calls Load() on a directory). This include snapshots, as well as segments, and also places like the OfflineWriter.

So now I'm wondering if we should update the contract of the Directory interface to always return a non-nil closer. And we can simply have a singleton noopCloser that does nothing that everyone can return if needed. That will reduce the burden on all consumers of the Directory, as they can simply call Close() without additional checks.

I was trying to think through how this would look in implementation. A quick try at that resulted in something like the following.

type noopCloser func() error

func (c noopCloser) Close() error {
    return nil
}

Then for all returns from load noopCloser is returned instead of nil. I don't know of anyway to enforce via the directory interface however that the return value is non nil so even though we can make sure that at present it's safe to call close a change down that line could break that assumption.

One option might be to enforce that instead of an io.Closer or other interface being returned a function is directly expected to be returned instead.... never mind nil is also a valid return type for a function.

mschoch commented 4 years ago

Yeah, I think you've persuaded me. Since the Directory is an important new way to extend Bluge, we want to encourage other implementations, and that means consumers have to be prepared to deal with a nil closer.

Tomorrow I'll do a deeper review, as there are a few more places not currently captured in this PR.

mschoch commented 4 years ago

@michaeljs1990 I pushed another commit, trying to address all the places Bluge internally uses the io.Closer returned from the Load method. If this looks good to you, we can merge.

michaeljs1990 commented 4 years ago

@michaeljs1990 I pushed another commit, trying to address all the places Bluge internally uses the io.Closer returned from the Load method. If this looks good to you, we can merge.

Looks great to me. Seems that I missed a few 😄