tsuna / gohbase

Pure-Go HBase client
Apache License 2.0
How do I know a scan result is returned due to MaxResultSize limit? #224

Closed: shortsteel closed this issue 1 year ago

shortsteel commented 1 year ago

Hi all, this SDK has been incredible in terms of compatibility, performance and usability, and it has been a great help in my project! Thanks for your amazing work!

I do have a question about MaxResultSize that maybe you can help explain:

With the Scan operation, sometimes I can't get all the rows I want in one request because of the MaxResultSize limit. As far as I know, there is no way to tell whether a scan result was returned because the MaxResultSize limit was hit or because the end of the scan range was reached.

My current approach is to issue additional scans after the initial one, using the last row key of the previous result as the start row, to make sure I get every row in my range. This means I need at least two scan calls to finish a range scan.

Is there a better way to fetch all the rows in my scan range without making these extra scan calls?

dethi commented 1 year ago

MaxResultSize only limits the size of the result that HBase returns in a single response. Once you have a scanner, you can call .Next() to get the next result; there is no need to open a new scan request with a new start row. Keep calling .Next() until it returns io.EOF to read all the results that match your scan request.

https://github.com/tsuna/gohbase/blob/master/hrpc/scan.go#L41
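
For illustration, here is a minimal, self-contained sketch of the "call Next until io.EOF" pattern. It does not import gohbase: `rowScanner`, `batchedScanner` and `drain` are hypothetical names made up for this example, and `batchedScanner` only simulates a server that caps each response (roughly the role MaxResultSize plays). The assumption is that a real scanner behaves the same way from the caller's side, fetching the next batch behind the scenes and returning io.EOF only when the whole range is exhausted.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// rowScanner is a hypothetical stand-in for a scanner: Next yields one row
// at a time and returns io.EOF once the scan is exhausted.
type rowScanner interface {
	Next() (string, error)
}

// batchedScanner simulates a server that returns at most batchSize rows per
// round trip. It is an illustration only, not gohbase code.
type batchedScanner struct {
	rows      []string // rows still on the simulated server
	batchSize int      // cap on rows per simulated response
	buf       []string // rows already fetched but not yet handed out
	fetches   int      // number of simulated round trips
}

func (s *batchedScanner) Next() (string, error) {
	if len(s.buf) == 0 {
		if len(s.rows) == 0 {
			return "", io.EOF // nothing buffered, nothing left to fetch
		}
		n := s.batchSize
		if n < 1 {
			n = 1 // guard against a non-positive batch size
		}
		if n > len(s.rows) {
			n = len(s.rows)
		}
		// Fetch the next capped batch transparently to the caller.
		s.buf, s.rows = s.rows[:n], s.rows[n:]
		s.fetches++
	}
	row := s.buf[0]
	s.buf = s.buf[1:]
	return row, nil
}

// drain reads from sc until io.EOF and returns every row it saw.
// Any other error stops the loop and is returned with the rows read so far.
func drain(sc rowScanner) ([]string, error) {
	var out []string
	for {
		row, err := sc.Next()
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, row)
	}
}

func main() {
	sc := &batchedScanner{rows: []string{"r1", "r2", "r3", "r4", "r5"}, batchSize: 2}
	rows, err := drain(sc)
	fmt.Println(rows, sc.fetches, err) // prints: [r1 r2 r3 r4 r5] 3 <nil>
}
```

The caller never sees the batch boundaries: a capped response just means the next call to Next triggers another fetch. Only io.EOF signals that the range is complete.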

shortsteel commented 1 year ago

Thank you @dethi for your reply! It turns out to be a compatibility issue with our cloud service provider (their version of HBase somehow behaves differently from the official HBase distribution).

dethi commented 1 year ago

Oh interesting. Do you mind sharing the name of the cloud service you use?