segmentio / parquet-go

Go library to read/write Parquet files
https://pkg.go.dev/github.com/segmentio/parquet-go
Apache License 2.0

Add guard code in sparse array `offset()` #315

Closed by joe-elliott 2 years ago

joe-elliott commented 2 years ago

This code was suggested by @vc42-2. We have tested it in an environment writing 66k rows/second. Without this change we saw the panic roughly 10-20 times an hour. With it, we have been running for about 2 hours with no GC crashes.

I have no idea if this fix is appropriate or not :), but I'd like to move the conversation forward on this issue. It is currently blocking the very nice performance improvements gained by moving to GenericWriter.

Fixes #299.

@vc42-2 Heads up that I submitted this fix. This is your code and I'd be glad to drop this PR in favor of your own.

vc42-2 commented 2 years ago

Joe, that's ok. Just submit your PR.

achille-roussel commented 2 years ago

We merged https://github.com/segmentio/parquet-go/pull/323 as the fix, which should address the original issue.

Thanks a lot for reporting this and for your contributions to the investigation. Please let me know if you are still experiencing this problem with the latest version of parquet-go.