databrickslabs / ucx

Automated migrations to Unity Catalog

Update databricks-labs-lsql requirement from <0.11,>=0.5 to >=0.5,<0.12 #2666

Closed dependabot[bot] closed 2 weeks ago

dependabot[bot] commented 2 weeks ago

Updates the requirements on databricks-labs-lsql to permit the latest version.
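As a rough illustration of what the widened range means, the sketch below checks a version against a lower-inclusive, upper-exclusive bound. The helper names `parse_version` and `satisfies` are hypothetical and only handle plain dotted numeric versions; real tooling uses a full version-specifier parser.

```python
def parse_version(text: str, width: int = 3) -> tuple[int, ...]:
    """Split a dotted numeric version and pad it with zeros to `width` parts."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version: str, lower: str, upper: str) -> bool:
    """Return True when lower <= version < upper."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Old requirement: >=0.5,<0.11 ; new requirement: >=0.5,<0.12
old_allows = satisfies("0.11.0", "0.5", "0.11")
new_allows = satisfies("0.11.0", "0.5", "0.12")
```

Under these assumptions, `old_allows` is false and `new_allows` is true, which is why the upper bound has to move for v0.11.0 to be installable.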

Release notes

Sourced from databricks-labs-lsql's releases.

v0.11.0

  • Added filter spec implementation (#276). In this commit, a new FilterHandler class has been introduced to handle filter files with the suffix .filter.json. It parses the filter specification in the header of the filter file and validates the filter columns and types. The commit also adds support for three types of filters: DATE_RANGE_PICKER, MULTI_SELECT, and DROPDOWN, each of which can be linked with multiple visualization widgets. Additionally, a FilterTile class has been added to the Tile class, which represents a filter tile in the dashboard and includes methods to validate the tile, create widgets, and generate filter encodings and queries. The DashboardMetadata class has been updated to include a new method get_datasets() to retrieve the datasets for the dashboard. Together, these changes let a dashboard filter data across multiple linked visualization widgets.
  • Bugfix: MockBackend wasn't mocking savetable properly when the mode is append (#289). This release includes a bugfix and enhancements for the MockBackend component, which is used to mock the SqlBackend. The .save_table() method failed to function as expected in append mode, writing all rows to the same table instead of accumulating them. This bug has been addressed, so rows now accumulate correctly in append mode. Additionally, a new test function, test_mock_backend_save_table_overwrite(), has been added to demonstrate the corrected behavior of overwrite mode, showing that it now replaces only the existing rows for the given table while preserving other tables' contents. The type signature for .save_table() has been updated, restricting the mode parameter to two string literals: "append" and "overwrite". The MockBackend behavior has been updated accordingly, and rows are now filtered to exclude any None or NULL values prior to saving. These changes make MockBackend more reliable as a testing backend.
  • Changed filter spec to use YML instead of JSON (#290). In this release, the filter specification files have been converted from JSON to YAML format, providing a more human-readable format for the filter specifications. The schema for the filter file includes flags for column, columns, type, title, description, order, and id, with the type flag taking on values of DROPDOWN, MULTI_SELECT, or DATE_RANGE_PICKER. This change impacts the FilterHandler, is_filter method, and _from_dashboard_folder method, as well as relevant parts of the documentation. Additionally, the parsing methods have been updated to use yaml.safe_load instead of json.loads, and the is_filter method now checks for .filter.yml suffix. A new file, '00_0_date.filter.yml', has been added to the 'tests/integration/dashboards/filter_spec_basic' directory, containing a sample date filter definition. Furthermore, various tests have been added to validate filter specifications, such as checking for invalid type and both column and columns keys being present. These updates aim to enhance readability, maintainability, and ease of use for filter configuration.
  • Increase testing of generic types storage (#282). A new commit enhances the testing of generic types storage by expanding the test suite to include a list of structs, ensuring more comprehensive testing of the system. The Foo struct has been renamed to Nested for clarity, and two new structs, NestedWithDict and Nesting, have been added. The Nesting struct contains a Nested object, while NestedWithDict includes a string and an optional dictionary of strings. A new test case demonstrates appending complex types to a table by creating and saving a table with two rows, each containing a Nesting struct. The test then fetches the data and asserts the expected number of rows are returned, ensuring the proper functioning of the storage system with complex data types.
  • Minor Changes to avoid redundancy in code and follow code patterns (#279). In this release, the dashboards.py file has been tidied to make the code more concise, maintainable, and in line with the standard library's recommended usage. The export_to_zipped_csv method has changed the most: the BytesIO import has been removed, and StringIO is now used for handling strings as files. The method no longer creates a separate ZIP file for the CSV files, instead using the provided export_path. Additionally, the method skips tiles that don't contain queries. We have also introduced a new method, dataclass_transform, which transforms a given dataclass into a new one with specific attributes and behavior. This method creates a new dataclass with a custom metaclass and adds a new method, to_dict(), which converts the instances of the new dataclass to dictionaries. These changes promote code reuse and reduce redundancy in the codebase.
  • New example with bar chart in dashboards-as-code (#281). A new example of a dashboard featuring a bar chart has been added to the dashboards-as-code feature using the existing metadata overrides feature to support the new widget type, without bloating the TileMetadata structure. An integration test was added to demonstrate the creation of a bar chart, and the resulting dashboard can be seen in the attached screenshot. Additionally, a new SQL file has been added for the Product Sales dashboard, showcasing sales data for different product categories. This approach can potentially be used to support other widget types such as Bar, Pivot, Area, etc. The team is encouraged to provide feedback on this proposed solution.
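The filter spec checks described in the notes above (allowed `type` values, and `column` versus `columns`) can be sketched on an already-parsed spec as follows. This is a minimal illustration and not the FilterHandler implementation: `validate_filter_spec` is a hypothetical name, and the rule that one of `column` or `columns` must be present is an assumption.

```python
ALLOWED_FILTER_TYPES = {"DROPDOWN", "MULTI_SELECT", "DATE_RANGE_PICKER"}


def validate_filter_spec(spec: dict) -> list[str]:
    """Return the problems found in a parsed filter spec (empty when valid)."""
    problems = []
    has_column = "column" in spec
    has_columns = "columns" in spec
    if has_column and has_columns:
        problems.append("use either 'column' or 'columns', not both")
    elif not has_column and not has_columns:
        # Assumption: at least one of the two keys is required.
        problems.append("one of 'column' or 'columns' is required")
    if has_column and not isinstance(spec["column"], str):
        problems.append("'column' must be a single string")
    if has_columns and not (
        isinstance(spec["columns"], list)
        and all(isinstance(name, str) for name in spec["columns"])
    ):
        problems.append("'columns' must be a list of strings")
    if spec.get("type") not in ALLOWED_FILTER_TYPES:
        problems.append(f"invalid type: {spec.get('type')!r}")
    return problems
```

A spec such as `{"column": "date", "type": "DATE_RANGE_PICKER"}` passes, while a spec that sets both `column` and `columns` is reported as a problem.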

Contributors: @JCZuurmond, @bishwajit-db, @ericvergnaud, @jgarciaf106, @asnare
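The append and overwrite semantics described for MockBackend in #289 can be pictured with a small in-memory stand-in. `InMemoryBackend` and `rows_in` are hypothetical names for this sketch; it is not the MockBackend code, only the behavior the release notes describe.

```python
from typing import Literal


class InMemoryBackend:
    """Hypothetical stand-in that keeps saved rows per table name."""

    def __init__(self) -> None:
        self._tables: dict[str, list] = {}

    def save_table(
        self,
        full_name: str,
        rows: list,
        mode: Literal["append", "overwrite"] = "append",
    ) -> None:
        # Drop None rows before saving, as the release notes describe.
        rows = [row for row in rows if row is not None]
        if mode == "overwrite":
            # Replace only this table's rows; other tables are untouched.
            self._tables[full_name] = rows
        elif mode == "append":
            # Accumulate rows across calls.
            self._tables.setdefault(full_name, []).extend(rows)
        else:
            raise ValueError(f"unsupported mode: {mode!r}")

    def rows_in(self, full_name: str) -> list:
        """Return a copy of the rows saved for one table."""
        return list(self._tables.get(full_name, []))
```

Two appends to the same table accumulate their rows, and an overwrite on one table leaves the rows of every other table in place.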

Changelog

Sourced from databricks-labs-lsql's changelog.

0.10.0

  • Added Functionality to export any dashboards-as-code into CSV (#269). The DashboardMetadata class now includes a new method, export_to_zipped_csv, which enables exporting any dashboard as CSV files in a ZIP archive. This method accepts sql_backend and export_path as parameters and exports dashboard queries to CSV files in the specified ZIP archive by iterating through tiles and fetching dashboard queries if the tile is a query. To ensure the proper functioning of this feature, unit tests and manual testing have been conducted. A new test, test_dashboards_export_to_zipped_csv, has been added to verify the correct export of dashboard data to a CSV file.
  • Added support for generic types in SqlBackend (#272). In this release, we've added support for using rich dataclasses, including those with optional and generic types, in the SqlBackend of the StatementExecutionBackend class. The new functionality is demonstrated in the test_supports_complex_types unit test, which creates a Nested dataclass containing various complex data types, such as nested dataclasses, datetime objects, dict, list, and optional fields. This enhancement is achieved by updating the save_table method to handle the conversion of complex dataclasses to SQL statements. To facilitate type inference, we've introduced a new StructInference class that converts Python dataclasses and built-in types to their corresponding SQL Data Definition Language (DDL) representations. This addition simplifies data definition and manipulation operations while maintaining type safety and compatibility with various SQL data types.
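To give a feel for the kind of type inference #272 describes, here is a minimal sketch that maps Python types and dataclasses to SQL type strings. It is not the StructInference class: `to_ddl`, the primitive mapping, and the output formatting are all assumptions made for illustration.

```python
import dataclasses
import datetime
import types
from typing import Optional, Union, get_args, get_origin, get_type_hints

# Assumed mapping; the real library may choose different SQL type names.
_PRIMITIVES = {
    str: "STRING",
    int: "BIGINT",
    float: "DOUBLE",
    bool: "BOOLEAN",
    datetime.datetime: "TIMESTAMP",
    datetime.date: "DATE",
}
_UNION_ORIGINS = (Union, getattr(types, "UnionType", Union))


def to_ddl(tp) -> str:
    """Convert a Python type annotation into a SQL type string."""
    if tp in _PRIMITIVES:
        return _PRIMITIVES[tp]
    origin, args = get_origin(tp), get_args(tp)
    if origin in _UNION_ORIGINS:
        # Optional[X] is Union[X, None]; nullable columns keep the type of X.
        non_none = [arg for arg in args if arg is not type(None)]
        if len(non_none) == 1:
            return to_ddl(non_none[0])
        raise TypeError(f"unsupported union: {tp!r}")
    if origin is list:
        return f"ARRAY<{to_ddl(args[0])}>"
    if origin is dict:
        return f"MAP<{to_ddl(args[0])},{to_ddl(args[1])}>"
    if dataclasses.is_dataclass(tp):
        hints = get_type_hints(tp)
        fields = ",".join(
            f"{field.name}:{to_ddl(hints[field.name])}"
            for field in dataclasses.fields(tp)
        )
        return f"STRUCT<{fields}>"
    raise TypeError(f"unsupported type: {tp!r}")


@dataclasses.dataclass
class Nested:
    foo: str
    mapping: Optional[dict[str, int]]


@dataclasses.dataclass
class Nesting:
    nested: Nested
    items: list[Nested]
```

Calling `to_ddl(Nesting)` walks the nested dataclasses recursively, so the struct of `Nested` appears both as a field and as the element type of the array.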

0.9.3

  • Added documentation for exclude flag (#265). A new exclude flag has been added to the configuration file for our lab tool, allowing users to specify a path to exclude from formatting during lab execution. This release also corrects grammatical errors in the descriptions of existing flags related to catalog and database settings, such as changing "seperated" to "separate". Additionally, the flag descriptions for publish and open-browser have been clarified: publish controls whether the dashboard is published after creation, while open-browser controls whether the dashboard is opened in a web browser.
  • Fixed dataclass field type in _row_to_sql (#266). In this release, we have addressed an issue related to #257 by fixing the dataclass field type in the _row_to_sql method of the backends.py file. Additionally, we have made updates to the _schema_for method to use a new _field_type class method. This change resolves a rare problem where the field.type is a string instead of a type and ensures compatibility with a pull request from an external repository (databrickslabs/ucx#2526). The new _field_type method attempts to load the type from __builtins__ if it's a string and logs a warning if it fails. The _row_to_sql method now consistently uses the _field_type method to get the field type. This ensures that the library functions seamlessly and consistently, avoiding any potential issues in the future.
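The string-typed field situation arises, for example, when annotations are stored as strings (postponed evaluation), in which case `dataclasses.fields()` reports `field.type` as a name such as `"int"`. A minimal sketch of the lookup described above follows; `resolve_field_type` is a hypothetical name, and returning the unresolved string unchanged after the warning is an assumption.

```python
import builtins
import logging

logger = logging.getLogger(__name__)


def resolve_field_type(field_type):
    """Turn a string type name into a builtin type when possible."""
    if not isinstance(field_type, str):
        return field_type  # already a real type, nothing to do
    resolved = getattr(builtins, field_type, None)
    if isinstance(resolved, type):
        return resolved
    # Assumption: on failure, warn and hand back the original string.
    logger.warning("Cannot resolve field type: %s", field_type)
    return field_type
```

With this helper, `"int"` resolves to `int`, a real type passes through untouched, and an unknown name only triggers a warning.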

0.9.2

  • Make hatch a prerequisite (#259). In this commit, Eric Vergnaud has introduced a change to make the installation of hatch version 1.9.4 a prerequisite for the project to avoid errors related to pip command recognition. The Makefile has been updated to handle the installation of hatch automatically, and the hatch env create command is now used instead of pip install hatch==1.7.0. This change ensures that the development environment is consistent and reliable by maintaining the correct version of hatch and automatically handling its installation. Additionally, the .venv/bin/python and dev targets have been updated accordingly to reflect these changes. This commit also formats all files using the make dev fmt command, which helps maintain consistent code formatting throughout the project.
  • add support for exclusions in fmt command (#263). In this release, we have added support for exclusions to the fmt command in the 'databricks/labs/lsql/cli.py' module. This feature allows users to specify a list of directories or files to exclude while formatting SQL files, which is particularly useful when verifying SQL notebooks in ucx. The fmt command now accepts a new optional parameter 'exclude', which accepts an iterable of strings that specify the relative paths to exclude. Any sql_file that is a descendant of any exclusion is skipped during formatting. The exclusions are implemented by converting the relative paths into Path objects. This change addresses the issue where single line comments are converted into inlined comments, causing misinterpretation. The added unit test is manually verified, and this pull request fixes issue #261. This feature was authored and co-authored by Eric Vergnaud.
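The exclusion rule described in #263 (skip any SQL file that is a descendant of an excluded path) can be sketched with pathlib. The function names `is_excluded` and `select_files` are hypothetical and the example paths are made up for illustration.

```python
from pathlib import Path
from typing import Iterable


def is_excluded(sql_file: Path, exclusions: Iterable[Path]) -> bool:
    """True when the file is one of the exclusions or lives beneath one."""
    return any(
        sql_file == excluded or excluded in sql_file.parents
        for excluded in exclusions
    )


def select_files(sql_files: Iterable[Path], exclude: Iterable[str]) -> list[Path]:
    """Keep only the files that should still be formatted."""
    exclusions = [Path(relative) for relative in exclude]
    return [path for path in sql_files if not is_excluded(path, exclusions)]


candidates = [
    Path("queries/main/01_counts.sql"),
    Path("queries/views/02_tables.sql"),
    Path("queries/views/nested/03_grants.sql"),
]
kept = select_files(candidates, exclude=["queries/views"])
```

Here `kept` contains only `queries/main/01_counts.sql`, because both other files sit under the excluded `queries/views` directory.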

0.9.1

  • Fixed dataclass field types (#257). This PR introduces a workaround to a Python bug affecting the dataclasses.fields() function, which sometimes returns field types as string type names instead of types. This can cause the ORM to malfunction. The workaround involves checking if the returned f.type is a string, and if so, converting it to a type by looking it up in the __builtins__ dictionary. This change is global and affects the _schema_for function in the backends.py file, which is responsible for creating a schema for a given dataclass, taking into account any necessary type conversions. This change ensures consistent and accurate type handling in the face of the Python bug, improving the reliability of our ORM.
  • Fixed missing EOL when formatting SQL files (#260). In this release, we have addressed an issue related to the inconsistent addition of end-of-line (EOL) characters in formatted SQL files. The QueryTile.format() method has been updated to ensure that an EOL character is always added, except when the input query already ends with a newline. This change enhances the reliability of the SQL formatting functionality, making the output format more predictable and improving the overall user experience. The new implementation is demonstrated in the test_query_format_preserves_eol() test case, and existing test cases have been updated to check for the presence of EOL characters, further ensuring consistent and correct formatting.
  • Fixed normalize case input in cli (#258). In this release, we have updated the fmt command in the cli.py file to allow users to specify whether they want to normalize the case of SQL files when formatting. The normalize_case parameter now defaults to the string "true" and checks if it is in the STRING_AFFIRMATIVES list to determine whether to normalize the case of SQL files. Additionally, we have introduced a new optional normalize_case parameter in the format method of the dashboards.py file in the Databricks CLI, which normalizes the identifiers in the query to lower case when set to True. We have also added support for a new normalize_case parameter in the QueryTile.format() method, which prevents the automatic normalization of string input to uppercase when set to False. This change allows for more flexibility in handling string input and ensures that the input string is preserved as-is. These updates improve the functionality and usability of the open-source library, providing more control to users over formatting and handling of string input.
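Two of the behaviors in this release are small enough to sketch directly: appending an end-of-line only when it is missing, and interpreting a string flag value as a boolean. The names `ensure_eol`, `is_affirmative`, and `ASSUMED_AFFIRMATIVES` are hypothetical, and the set of accepted affirmative strings is an assumption, not the library's actual list.

```python
# Assumed set of strings treated as "true"; the real list may differ.
ASSUMED_AFFIRMATIVES = {"true", "t", "yes", "y", "1"}


def is_affirmative(value: str) -> bool:
    """Interpret a CLI string flag such as normalize-case as a boolean."""
    return value.strip().lower() in ASSUMED_AFFIRMATIVES


def ensure_eol(query: str) -> str:
    """Append a newline unless the query already ends with one."""
    return query if query.endswith("\n") else query + "\n"
```

Passing the flag as a string keeps the default of `"true"` working from the command line, while `ensure_eol` avoids doubling the newline on queries that already end with one.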

0.9.0

  • Added design for filter file (#251). A new feature has been added to enable the creation of filters for multiple widgets in a dashboard using a .filter.json file. This file allows users to specify columns to be filtered, the filter type, title, description, order, and a unique ID for each filter. Both the column and columns flags are supported, with the former taking a single string and the latter taking a list of strings. The filter type can be set to a drop-down menu or another type as desired. The .filter.json file schema also supports optional title and description strings, as well as order and ID flags. An example of a .filter.json file is provided in the commit message. Additionally, the dashboard.yml file documentation has been updated to include information on how to use the new .filter.json file.
  • adding normalize-case option to databricks labs lsql fmt cmd (#254). In this open-source library release, the databricks labs lsql tool's fmt command now supports a new flag, normalize-case. This flag allows users to control the normalization of query text to lowercase, providing more flexibility when formatting SQL queries. By default, query text is still normalized to lowercase, but users can now prevent this behavior by setting the normalize-case flag to False. This change addresses an issue where some queries are case sensitive, such as those using map field keys in UCX dashboards. Additionally, a new parameter normalize_case has been added to the format method in the dashboards.py file, with updated method documentation. A new test function, test_query_formats_no_normalize(), has also been included to ensure consistent formatter behavior.

0.8.0

  • Removed deploy_dashboard method (#240). In this release, the deploy_dashboard method has been removed from the dashboards.py file and the legacy deployment method has been deprecated. The deploy_dashboard method was previously used to deploy a dashboard to a workspace, but it has been replaced with the create method of the lakeview attribute of the WorkspaceClient object. Additionally, the test_dashboards_creates_dashboard_via_legacy_method method has been removed. A new test has been added to ensure that the deploy_dashboard method is no longer being used, utilizing the deprecated_call function from pytest to verify that calling the method raises a deprecation warning. This change simplifies the code and improves the overall design of the system, resolving issue #232. The _with_better_names method and create_dashboard method remain unchanged.
  • Skip test that fails due to insufficient permission to create schema (#248). A new test function, test_dashboards_creates_dashboard_with_replace_database, has been added to the open-source library, but it is currently marked to be skipped due to missing permissions to create a schema. This function creates an instance of the Dashboards class with the ws parameter, creates a dashboard using the make_dashboard function, and performs various actions using the created dashboard, as well as functions such as tmp_path and sql_backend. This test function aims to ensure that the Dashboards class functions as expected when creating a dashboard with a replaced database. Once the necessary permissions for creating a schema are acquired, this test function can be enabled for further testing and validation.
  • Updates to use the Databricks Python sdk 0.30.0 (#247). In this release, we have updated the project to use Databricks Python SDK version 0.30.0. This update includes changes to the execute and fetch_value functions, which now use the new StatementResponse type instead of ExecuteStatementResponse. A conditional import statement has been added to maintain compatibility with both Databricks SDK versions 0.30.0 and below. The execute function now raises TimeoutError when the specified timeout is greater than 50 seconds and the statement execution hasn't finished. Additionally, the fetch_value function has been updated to handle the case when the execute function returns None. The unit test file test_backends.py has also been updated to reflect these changes, with multiple test functions now using the StatementResponse class instead of ExecuteStatementResponse. These changes improve the system's compatibility with the latest version of the Databricks SDK, ensuring that the core functionality of the SDK continues to work as expected.

0.7.5

... (truncated)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)