resource-disaggregation / snowset

Snowflake dataset containing statistics for 70 million queries over a 14-day period

scripts folder missing and how to reproduce the results? #2

Open chtlp opened 2 years ago

chtlp commented 2 years ago

Hi all,

Thanks for sharing the analysis and the dataset; they have been really helpful to our research on cloud-native OLAP databases. Two issues:

  1. The scripts/ folder seems to be missing.
  2. How can the notebook results be reproduced? dataset_masked_v3.parquet is missing from download.md. Is it generated by scripts/?

webglider commented 2 years ago

Thank you for your interest in the dataset. dataset_masked_v3.parquet is the same file as snowset-main.parquet, which you can get from download.md. Thanks for catching this inconsistency; I've pushed an update to fix it. scripts/ is not needed to reproduce the results, but I will push an update with the scripts/ directory very soon. In the meantime, you can refer to the notebooks in paper-results/ if you are looking for analysis examples. Feel free to reach out if you run into any issues during your analysis.
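
For anyone working from an older checkout whose notebooks still reference dataset_masked_v3.parquet, a possible workaround is to expose the downloaded snowset-main.parquet under the old name, since the two are described above as the same file. The sketch below is only an illustration: the helper name `alias_dataset` and the assumption that the notebooks read the file from the same directory as the download are mine, not part of the repository.

```python
from pathlib import Path
import shutil


def alias_dataset(data_dir,
                  source_name="snowset-main.parquet",
                  alias_name="dataset_masked_v3.parquet"):
    """Expose the downloaded file under the older name (hypothetical helper).

    Assumes both names should live in data_dir; adjust if your notebooks
    look elsewhere.
    """
    data_dir = Path(data_dir)
    source = data_dir / source_name
    alias = data_dir / alias_name

    if not source.is_file():
        raise FileNotFoundError(
            f"{source} not found; download it first (see download.md)"
        )

    # Leave any existing file or link under the old name untouched.
    if alias.exists() or alias.is_symlink():
        return alias

    try:
        # Relative symlink, so the directory can be moved as a whole.
        alias.symlink_to(source.name)
    except OSError:
        # Symlinks may be unavailable (e.g. some Windows setups); copy instead.
        shutil.copyfile(source, alias)
    return alias
```

Calling `alias_dataset(".")` from the directory holding the download should be enough in the simple case. With the fix mentioned above already pushed, pulling the latest version of the repository is likely the cleaner route.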