ashish10alex / vscode-dataform-tools

Dataform tools - a vscode extension
https://marketplace.visualstudio.com/items?itemName=ashishalex.dataform-lsp-vscode
MIT License
20 stars, 5 forks

Feature Request - Preview Results / Query #41

Open benjaminwestern opened 6 days ago

benjaminwestern commented 6 days ago

Firstly, thank you for this tool!

It has saved countless hours of going back and forth to the GCP Console to validate columns, queries, run costs, etc.

I was wondering if you would be able to update your 'Dataform Tools' feature slightly.

image

If possible, I would love to be able to set a 'sample' size as well as a custom 'limit' size. See Documentation here

In addition to this, being able to see the columns and types of a given SQLX table definition would be awesome, using INFORMATION_SCHEMA.COLUMNS like:

Documentation

SELECT
  column_name,
  data_type
FROM
  `<your-gcp-project-id>`.<your-bigquery-dataset-name>.INFORMATION_SCHEMA.COLUMNS
WHERE
  table_name = "<your-bigquery-table-name>"
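As a rough illustration of how a tool could generate the query above, here is a minimal Python sketch. The function name `columns_query` and the validation patterns are hypothetical and not taken from the extension; the table name is left as a named query parameter (`@table_name`) to be bound by whatever client runs the query, and the `ORDER BY ordinal_position` line is an addition that is not in the query above.

```python
import re


def columns_query(project: str, dataset: str) -> str:
    """Build a column/type lookup; the table name is bound later as @table_name."""
    # Identifiers cannot be passed as query parameters, so they are checked
    # against simple (assumed) patterns before being interpolated.
    if not re.fullmatch(r"[A-Za-z0-9_\-]+", project):
        raise ValueError(f"unexpected project id: {project!r}")
    if not re.fullmatch(r"[A-Za-z0-9_]+", dataset):
        raise ValueError(f"unexpected dataset name: {dataset!r}")
    return (
        "SELECT column_name, data_type\n"
        f"FROM `{project}`.{dataset}.INFORMATION_SCHEMA.COLUMNS\n"
        "WHERE table_name = @table_name\n"
        "ORDER BY ordinal_position"
    )


print(columns_query("my-project", "my_dataset"))
```

Keeping the table name as a parameter avoids quoting problems; only the project and dataset need to be interpolated, because they are part of the table path.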

I will attempt to write a PR for these later this week, but I also just wanted to share my appreciation for your extension!

Cheers!

ashish10alex commented 5 days ago

Hi @benjaminwestern, thanks for your kind words. Of the features requested in this issue, I have already implemented one, as I could not help myself.

CleanShot 2024-11-07 at 14 52 31@2x

HampB commented 5 days ago

I would also like to see the option to specify a sample size. Unlike LIMIT, TABLESAMPLE reduces the actual cost of a query. This feature would be beneficial when working with large datasets.
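To make the difference concrete, here is a minimal sketch of a preview-query builder that supports both options. The function name `preview_query` and its parameters are hypothetical, not the extension's API. It assumes BigQuery's `TABLESAMPLE SYSTEM (n PERCENT)` clause, which attaches to a table reference, so it is shown against a single table rather than an arbitrary compiled query.

```python
from typing import Optional


def preview_query(
    table: str,
    sample_percent: Optional[float] = None,
    limit: Optional[int] = None,
) -> str:
    """Build a preview query with optional TABLESAMPLE and LIMIT clauses."""
    sql = f"SELECT *\nFROM `{table}`"
    if sample_percent is not None:
        if not 0 <= sample_percent <= 100:
            raise ValueError("sample_percent must be between 0 and 100")
        # TABLESAMPLE restricts which part of the table is read.
        sql += f" TABLESAMPLE SYSTEM ({sample_percent:g} PERCENT)"
    if limit is not None:
        if limit < 0:
            raise ValueError("limit must be non-negative")
        # LIMIT only caps the number of rows returned.
        sql += f"\nLIMIT {limit}"
    return sql


print(preview_query("my-project.my_dataset.my_table", sample_percent=10, limit=100))
```

Treating the two settings as independent optional values is one way a UI could expose both: either can be left empty, and they combine without conflict.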

ashish10alex commented 5 days ago

Cool, I'll wait for @benjaminwestern to submit a PR, but it might be good to agree on a UI for having both LIMIT and TABLESAMPLE so that it is intuitive for users.

benjaminwestern commented 5 days ago

Thanks, legend! Haha, I love the idea of seeing the values pre-creation so that we can validate what the query will generate.

The reason I am interested in TABLESAMPLE is to validate my utilities across a random sampling of data from the table prior to fully releasing the code. Say, for example, I have a mobile number validation Dataform utility. I would want to pre-validate this across a random sampling of the data available in the destination before committing to a potentially expensive and incorrect operation.

In relation to LIMIT, I 100% agree that large tables need to have a limit forced. I will look at pushing a PR to also increment an OFFSET so users can paginate through their data in a deterministic way. Documentation
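A minimal sketch of the LIMIT plus OFFSET pagination idea follows. The function name `page_query` and its parameters are hypothetical. It assumes the caller supplies a column (or column list) to order by, since row order is not guaranteed without an ORDER BY and pages would otherwise not be deterministic.

```python
def page_query(base_sql: str, order_by: str, page: int, page_size: int = 50) -> str:
    """Wrap a query so that one fixed-size page of its results is returned."""
    if page < 0:
        raise ValueError("page must be zero or greater")
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Page 0 starts at row 0, page 1 at row page_size, and so on.
    offset = page * page_size
    return (
        f"SELECT *\nFROM (\n{base_sql}\n)\n"
        f"ORDER BY {order_by}\n"
        f"LIMIT {page_size} OFFSET {offset}"
    )


# Third page (index 2) of 50 rows each, ordered by a hypothetical `id` column.
print(page_query("SELECT id, mobile FROM my_dataset.customers", "id", page=2))
```

Incrementing `page` by one moves the OFFSET forward by exactly `page_size` rows, which is what a "next page" button would need.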

ashish10alex commented 5 days ago

Thanks @benjaminwestern, sounds great, looking forward to the PR :)