Closed agr17 closed 3 months ago
@microsoft-github-policy-service agree company="Universidade da Coruña, A Coruña, Spain"
Thanks, @agr17 ! Structure looks good to me and is consistent with the implementation for other engines such as Spark and Trino.
I think you currently evaluate on native tables. Do you plan to add support for Iceberg tables as well? I think the syntax for that would be CREATE ICEBERG TABLE, so it might be easy to do by modifying the build SQL scripts to add a variable and then passing a parameter value: empty for native tables and ICEBERG for Iceberg tables.
Also, small nit: please update the README.md file to reflect the new profile in the pom.xml file.
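As a rough illustration of the variable-based idea for Iceberg tables, a build script could carry a placeholder in the CREATE statement. This is only a sketch: the `${table_type}` variable, the `${...}` placeholder style, and the table and column names are assumptions for illustration, not taken from the PR.

```sql
-- Hypothetical template. ${table_type} is an assumed variable:
-- empty for native tables, ICEBERG for Iceberg tables.
CREATE ${table_type} TABLE example_table (
    id INT,
    name STRING
);
```

With an empty value this would expand to a plain CREATE TABLE statement; with ICEBERG it would expand to CREATE ICEBERG TABLE, which in Snowflake needs further clauses.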
Hi @jcamachor, thanks for your feedback! Yes, I was planning to add support for Iceberg tables; my idea was to add it in a separate issue/PR.
I have already implemented it, but changing one parameter is not enough: the data types in Iceberg tables differ from the native Snowflake types (see https://docs.snowflake.com/en/user-guide/tables-iceberg-data-types). So I created a build_iceberg folder with the necessary SQL statements and added it to a task_template in library.yaml called build_iceberg. Two additional parameters are required: external_volume, which Snowflake needs in order to create Iceberg tables, and base_location, which specifies the folder on the external_volume where the table should be created.
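A minimal sketch of what such an Iceberg build statement might look like, assuming a Snowflake-managed Iceberg table. The table name, columns, and the `${...}` placeholder style are hypothetical; only the external_volume and base_location parameters come from the description above.

```sql
-- Hypothetical example; table and column names are illustrative only.
-- Column types must be ones that Iceberg tables accept (see the linked data types page).
CREATE ICEBERG TABLE example_table (
    id INT,
    name STRING,
    created DATE
)
    CATALOG = 'SNOWFLAKE'                      -- Snowflake acts as the Iceberg catalog
    EXTERNAL_VOLUME = '${external_volume}'     -- assumed placeholder for the external volume
    BASE_LOCATION = '${base_location}';        -- assumed placeholder for the folder on that volume
```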
Iceberg table support is available in commit https://github.com/microsoft/lst-bench/pull/310/commits/e01a7962b4934e33601fd4c12d6595e6fe2dfaa5 (sorry for the name of the commit, it was an error). It contains the build_iceberg folder, the corresponding tasks in library.yaml, and the new parameters for the Iceberg tables (exvol and base_location) in sample_experiment_config.yaml. This is a preliminary implementation; if you prefer a different approach that is more in line with the rest of the project, I'm open to any changes.
@agr17 , this looks good, thanks! (Given the current LST-Bench framework, I do not think there is a better approach.) Since the SQL for the second step (inserting into the tables) is the same in both native and Iceberg, you might consider using a single copy of that step to reduce some of the duplication.
Hi @jcamachor. I have removed the redundancy by using a single build folder with two subfolders, one for native tables and one for Iceberg tables. The insert queries are shared by both and live in the main build folder.
Merged to main, thanks for your contribution @agr17 !
This is a proposal to add Snowflake support to LST-Bench. These are the main changes:
run folder.
src/main/java/com/microsoft/lst_bench/common/SessionExecutor.java to add time travel in snowflake single_user.
With the current content of this PR you can run all workloads for Snowflake. For the time being, no documentation or tests have been added (sample YAML files are available); I prefer to wait for your feedback and recommendations.
Fix #265