brianfrankcooper / YCSB

Yahoo! Cloud Serving Benchmark
Apache License 2.0

[core] Changed the datatype for keynum from integer to long to allow … #1621

Open ragarkar opened 2 years ago

ragarkar commented 2 years ago

Changed the data type of keynum from integer to long so that YCSB can load a larger dataset. Currently, YCSB restricts the number of rows that can be loaded to 2^32.

joshelser commented 2 years ago

Thanks for publishing this PR, Rahul!

For context, we've been running with this change internally. Prior to making this patch, we had been unable to ingest more than ~4TB of data into HBase using the hbase20 binding. It turns out this was because we kept regenerating the same 2^32 rows after that point. After this change, we've successfully generated 20TB of data via hbase20 (and into Apache Phoenix via the jdbc binding).

FYI @busbey
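The wraparound described above can be illustrated with a minimal Java sketch. It is not YCSB's actual code: the class and method names are hypothetical, and it assumes only that the key counter is incremented by one per inserted row. It shows that a 32-bit `int` counter returns to its starting value after 2^32 increments, while a `long` counter keeps producing new values.

```java
public class KeynumOverflowSketch {

    // Hypothetical stand-in for a key counter stored as a 32-bit int.
    // Java int arithmetic wraps silently on overflow.
    static int nextIntKey(int keynum) {
        return keynum + 1;
    }

    // The same counter stored as a 64-bit long.
    static long nextLongKey(long keynum) {
        return keynum + 1L;
    }

    // Value an int counter holds after `steps` increments from `start`.
    // The narrowing cast keeps only the low 32 bits, which matches the
    // result of incrementing an int `steps` times.
    static int intKeyAfter(int start, long steps) {
        return (int) (start + steps);
    }

    public static void main(String[] args) {
        // One step past Integer.MAX_VALUE wraps to Integer.MIN_VALUE.
        System.out.println(nextIntKey(Integer.MAX_VALUE));

        // A long counter simply continues past the int range.
        System.out.println(nextLongKey((long) Integer.MAX_VALUE));

        // After 2^32 increments the int counter is back where it started,
        // so the same keys would be generated again.
        System.out.println(intKeyAfter(0, 1L << 32));
    }
}
```

With a `long` counter the key space is no longer the limiting factor, which is consistent with the 20TB load reported above.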

busbey commented 2 years ago

There are already several open PRs that attempt to switch the max record count from MAX_INT to MAX_LONG. Please review them to ensure you cover all the changes they make, and expressly note that.