GlareDB / glaredb

GlareDB: An analytics DBMS for distributed data
https://glaredb.com
GNU Affero General Public License v3.0

test: add test for querying large tables successfully #617

Closed vrongmeal closed 1 year ago

vrongmeal commented 1 year ago

Signed-off-by: Vaibhav <vrongmeal@gmail.com>

vrongmeal commented 1 year ago

Not really keen on using git lfs after reading this: https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-storage-and-bandwidth-usage#tracking-storage-and-bandwidth-use

RustomMS commented 1 year ago

> Not really keen on using git lfs after reading this: https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-storage-and-bandwidth-usage#tracking-storage-and-bandwidth-use

Does this mean every CI run will download this file via git-lfs?

scsmithr commented 1 year ago

> Not really keen on using git lfs after reading this: https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-storage-and-bandwidth-usage#tracking-storage-and-bandwidth-use

It's unlikely that we'll be making a lot of changes to the test files, so I don't see this being a concern for us (unless downloads also count towards CI usage).

Our current limit is pretty reasonable:

[Image: Screenshot 2023-02-09 at 12.10.08 PM]

And upgrading limits is pretty cheap:

[Image: Screenshot 2023-02-09 at 12.12.43 PM]

vrongmeal commented 1 year ago

Pretty sure it's on every CI run. The CI fails because it hasn't downloaded the file yet...

Also, found this: https://github.com/orgs/community/discussions/26775#discussioncomment-3253352
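To put a rough number on the concern: if every CI run pulls the file once, LFS bandwidth grows as file size times number of runs. A quick back-of-the-envelope sketch, where the file size and run count are made-up inputs and not figures from this repo or its quota:

```python
def monthly_lfs_bandwidth_gb(file_size_mb: float, ci_runs_per_month: int) -> float:
    """Estimate monthly download volume if every CI run fetches the file once."""
    return file_size_mb * ci_runs_per_month / 1024


# Hypothetical inputs: a 100 MB test file and 500 CI runs per month.
estimate = monthly_lfs_bandwidth_gb(100, 500)
print(f"{estimate:.1f} GB per month")  # prints "48.8 GB per month"
```

Caching the LFS object between runs would lower this, so treat it as an upper bound under these assumptions.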

vrongmeal commented 1 year ago

I'd say let's add a script, prep-testdata, that either downloads the data from GCS or extracts it from an archive.
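Something along these lines, as a minimal sketch only. The object name, the archive layout, and the assumption that the object is readable over plain HTTPS are all hypothetical; a private bucket would need authenticated access instead.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path
from typing import Optional

# Hypothetical names: neither the bucket layout nor the file name is settled.
BUCKET = "glaredb-testdata"
OBJECT_NAME = "large_table.csv"


def prep_testdata(dest_dir: Path, archive: Optional[Path] = None) -> Path:
    """Make sure the test file exists in dest_dir and return its path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / OBJECT_NAME
    if target.exists():
        # Already prepared by an earlier run; nothing to do.
        return target

    if archive is not None and archive.exists():
        # Extract only the one member we need from the local archive.
        with tarfile.open(archive) as tar:
            src = tar.extractfile(tar.getmember(OBJECT_NAME))
            if src is None:
                raise ValueError(f"{OBJECT_NAME} is not a regular file in {archive}")
            with src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
        return target

    # Fall back to downloading; assumes the object is publicly readable.
    url = f"https://storage.googleapis.com/{BUCKET}/{OBJECT_NAME}"
    urllib.request.urlretrieve(url, target)
    return target
```

Calling `prep_testdata(Path("testdata"))` before the test suite would then be a no-op whenever the file is already present, so local runs only pay the download cost once.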

scsmithr commented 1 year ago

Well that's interesting. Agree with using a bucket. Went ahead and created a glaredb-testdata bucket here: https://github.com/GlareDB/cloud/pull/616. Also added you to the glaredb-artifacts repo on Google Cloud, so you should be able to push stuff to that bucket. GitHub Actions is already set up with a service account that has access to that project, so it should just be able to pull objects without issue.

vrongmeal commented 1 year ago

Updated with GCS data!