segmentio / parquet-go

Go library to read/write Parquet files
https://pkg.go.dev/github.com/segmentio/parquet-go
Apache License 2.0
341 stars 58 forks source link

close rowGroupCursors when Popped due to io.EOF in mergedRowGroupRows #337

Closed asubiotto closed 2 years ago

asubiotto commented 2 years ago

This commit fixes a potential goroutine leak. mergeRowGroupRows cursors, when initialized, might spawn a goroutine (e.g. asyncPages.readPages). Previously, when a cursor returned io.EOF, this cursor was popped from the min-heap but never closed, resulting in leaked goroutines.

cc @achille-roussel @Pryz

asubiotto commented 2 years ago

Any pointers on where to add a regression test?

joe-elliott commented 2 years ago

We are possibly seeing this in a rollout today. We have ~100s of orphaned goroutines parked here:

https://github.com/segmentio/parquet-go/blob/main/page.go#L222

The goroutines do seem to be slowly falling off. Currently still investigating.

image

Is that consistent with this issue?

Edit: Ignore. We don't use MergeRowGroups. Likely unrelated.

achille-roussel commented 2 years ago

If you are using the Merge* functions it would be related.

Alternatively you might be missing a Close call on Rows instances in the code using parquet-go.

joe-elliott commented 2 years ago

Yup, this was my bad. Saw some new behavior on a rollout and was concerned, but this is attributable to other changes in the rollout. I'll leave my original post only for continuity of the conversation :P

asubiotto commented 2 years ago

Updated and added test.

asubiotto commented 2 years ago

@joe-elliott there are some finalizers set up to close resources on GC (including the goroutines leaked by the behavior this PR fixes). I found it helpful to disable those and run a leak detector to track down this leak in case it you suspect it's parquet-related.