databrickslabs / remorph

Accelerates migrations to Databricks by automating code conversion and migration validation

Update databricks-labs-lsql requirement from <0.14.0,>=0.7.5 to >=0.7.5,<0.15.0 #1220

Closed: dependabot[bot] closed this pull request 1 week ago

dependabot[bot] commented 1 week ago

Updates the requirements on databricks-labs-lsql to permit the latest version.
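The change touches only the upper bound of the version range. As a quick illustration (not part of the PR itself), the `packaging` library can show which releases each specifier admits:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

old_spec = SpecifierSet(">=0.7.5,<0.14.0")  # previous constraint
new_spec = SpecifierSet(">=0.7.5,<0.15.0")  # constraint proposed by this PR

for candidate in ("0.7.5", "0.13.0", "0.14.0"):
    version = Version(candidate)
    print(candidate, "old:", version in old_spec, "new:", version in new_spec)
# Only 0.14.0 differs: it is rejected by the old range and accepted by the new one.
```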

Release notes

Sourced from databricks-labs-lsql's releases.

v0.14.0

  • Added nightly tests run at 4:45am UTC (#318). A new nightly workflow has been added to the codebase, designed to automate a series of jobs every day at 4:45am UTC on the larger environment. The workflow includes permissions for writing id-tokens, accessing issues, reading contents and pull-requests. It checks out the code with a full fetch-depth, installs Python 3.10, and uses hatch 1.9.4. The key step in this workflow is the execution of nightly tests using the databrickslabs/sandbox/acceptance action, which creates issues if necessary. The workflow utilizes several secrets, including VAULT_URI, GITHUB_TOKEN, ARM_CLIENT_ID, and ARM_TENANT_ID, and sets the TEST_NIGHTLY environment variable to true. Additionally, the workflow is part of a concurrency group called "single-acceptance-job-per-repo", ensuring that only one acceptance job runs at a time per repository.
  • Bump codecov/codecov-action from 4 to 5 (#319). In this version update, the Codecov GitHub Action has been upgraded from 4 to 5, bringing improved functionality and new features. This new version utilizes the Codecov Wrapper to encapsulate the CLI, enabling faster updates. Additionally, an opt-out feature has been introduced for tokens in public repositories, allowing contributors and other members to upload coverage reports without requiring access to the Codecov token. The upgrade also includes changes to the arguments: file is now deprecated and replaced with files, and plugin is deprecated and replaced with plugins. New arguments have been added, including binary, gcov_args, gcov_executable, gcov_ignore, gcov_include, report_type, skip_validation, and swift_project. Comprehensive documentation on these changes can be found in the release notes and changelog.
  • Fixed RuntimeBackend exception handling (#328). In this release, we have made significant improvements to the exception handling in the RuntimeBackend component, addressing issues reported in tickets #328, #327, #326, and #325. We have updated the execute and fetch methods to handle exceptions more gracefully and changed exception handling from catching Exception to catching BaseException for more comprehensive error handling. Additionally, we have updated the pyproject.toml file to use a newer version of the databricks-labs-pytester package (0.2.1 to 0.5.0) which may have contributed to the resolution of these issues. Furthermore, the test_backends.py file has been updated to improve the readability and user-friendliness of the test output for the functions testing if a NotFound, BadRequest, or Unknown exception is raised when executing and fetching statements. The test_runtime_backend_use_statements function has also been updated to print PASSED or FAILED instead of returning those values. These changes enhance the robustness of the exception handling mechanism in the RuntimeBackend class and update related unit tests.
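The exception-handling change in the last bullet follows a common boundary pattern: catch broadly, then re-raise a typed error. The sketch below is a simplified illustration under that assumption; the class body, the `_convert` helper, and the error-message matching are hypothetical and not the library's actual code.

```python
# Simplified stand-ins for the typed errors named in the release notes.
class NotFound(Exception): ...
class BadRequest(Exception): ...
class Unknown(Exception): ...


class RuntimeBackendSketch:
    """Illustrative only: map low-level failures to typed errors at the boundary."""

    def execute(self, sql: str) -> None:
        try:
            self._run(sql)
        except BaseException as e:  # broad catch, as described for #328
            raise self._convert(e) from e

    @staticmethod
    def _convert(e: BaseException) -> Exception:
        message = str(e)
        if "NOT_FOUND" in message:
            return NotFound(message)
        if "PARSE_SYNTAX_ERROR" in message:
            return BadRequest(message)
        return Unknown(message)

    @staticmethod
    def _run(sql: str) -> None:
        # Stand-in failure representing a runtime error surfaced by the backend.
        raise RuntimeError("TABLE_OR_VIEW_NOT_FOUND: `missing_table`")


try:
    RuntimeBackendSketch().execute("SELECT * FROM missing_table")
except NotFound as err:
    print("typed error:", err)
```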

Dependency updates:

  • Bump codecov/codecov-action from 4 to 5 (#319).

Contributors: @nfx, @JCZuurmond, @dependabot[bot]

Changelog

Sourced from databricks-labs-lsql's changelog.

0.14.0

  • Added nightly tests run at 4:45am UTC (#318). A new nightly workflow has been added to the codebase, designed to automate a series of jobs every day at 4:45am UTC on the larger environment. The workflow includes permissions for writing id-tokens, accessing issues, reading contents and pull-requests. It checks out the code with a full fetch-depth, installs Python 3.10, and uses hatch 1.9.4. The key step in this workflow is the execution of nightly tests using the databrickslabs/sandbox/acceptance action, which creates issues if necessary. The workflow utilizes several secrets, including VAULT_URI, GITHUB_TOKEN, ARM_CLIENT_ID, and ARM_TENANT_ID, and sets the TEST_NIGHTLY environment variable to true. Additionally, the workflow is part of a concurrency group called "single-acceptance-job-per-repo", ensuring that only one acceptance job runs at a time per repository.
  • Bump codecov/codecov-action from 4 to 5 (#319). In this version update, the Codecov GitHub Action has been upgraded from 4 to 5, bringing improved functionality and new features. This new version utilizes the Codecov Wrapper to encapsulate the CLI, enabling faster updates. Additionally, an opt-out feature has been introduced for tokens in public repositories, allowing contributors and other members to upload coverage reports without requiring access to the Codecov token. The upgrade also includes changes to the arguments: file is now deprecated and replaced with files, and plugin is deprecated and replaced with plugins. New arguments have been added, including binary, gcov_args, gcov_executable, gcov_ignore, gcov_include, report_type, skip_validation, and swift_project. Comprehensive documentation on these changes can be found in the release notes and changelog.
  • Fixed RuntimeBackend exception handling (#328). In this release, we have made significant improvements to the exception handling in the RuntimeBackend component, addressing issues reported in tickets #328, #327, #326, and #325. We have updated the execute and fetch methods to handle exceptions more gracefully and changed exception handling from catching Exception to catching BaseException for more comprehensive error handling. Additionally, we have updated the pyproject.toml file to use a newer version of the databricks-labs-pytester package (0.2.1 to 0.5.0) which may have contributed to the resolution of these issues. Furthermore, the test_backends.py file has been updated to improve the readability and user-friendliness of the test output for the functions testing if a NotFound, BadRequest, or Unknown exception is raised when executing and fetching statements. The test_runtime_backend_use_statements function has also been updated to print PASSED or FAILED instead of returning those values. These changes enhance the robustness of the exception handling mechanism in the RuntimeBackend class and update related unit tests.

Dependency updates:

  • Bump codecov/codecov-action from 4 to 5 (#319).

0.13.0

  • Added escape_name function to escape individual SQL names and escape_full_name function to escape dot-separated full names (#316). Two new functions, escape_name and escape_full_name, have been added to the databricks.labs.lsql.escapes module for escaping SQL names. The escape_name function takes a single name as input and returns it enclosed in backticks, while escape_full_name handles dot-separated full names by escaping each individual component. These functions have been ported from the databrickslabs/ucx repository and are designed to provide a consistent way to escape names and full names in SQL statements, improving the robustness of the system by preventing issues caused by unescaped special characters in SQL names. The test suite covers single names, full names with different combinations of escaped and unescaped components, and special characters, with a specific focus on the scenario where the column name contains a period. A stand-alone sketch of this escaping behavior follows this list.
  • Bump actions/checkout from 4.2.0 to 4.2.1 (#304). In this pull request, the actions/checkout dependency is updated from version 4.2.0 to 4.2.1 in the .github/workflows/release.yml file. This update includes a new feature where refs/* are checked out by commit if provided, falling back to the specified ref (a change contributed by @orhantoy). This change improves the flexibility of the action, allowing users to specify a commit or branch for checkout. The release also introduces a new contributor, @Jcambass, who added a workflow file for publishing releases to an immutable action package. The commits for this release include changes to prepare for the 4.2.1 release, add a workflow file for publishing releases, and check out other refs/* by commit if provided, falling back to ref.
  • Bump actions/checkout from 4.2.1 to 4.2.2 (#310). This is a pull request to update the actions/checkout dependency from version 4.2.1 to 4.2.2, which includes improvements to the url-helper.ts file that now utilize well-known environment variables and expanded unit test coverage for the isGhes function. The actions/checkout action is commonly used in GitHub Actions workflows for checking out a repository at a specific commit or branch. The changes in this update are internal to the actions/checkout action and should not affect the functionality of the project utilizing this action. The pull request also includes details on the commits and compatibility score for the upgrade, and reviewers can manage and merge the request using Dependabot commands once the changes have been verified.
  • Bump databrickslabs/sandbox from acceptance/v0.3.0 to 0.3.1 (#307). In this release, the databrickslabs/sandbox dependency has been updated from version acceptance/v0.3.0 to 0.3.1. This update includes previously tagged commits, bug fixes for git-related libraries, and resolution of the unsupported protocol scheme error. The README has been updated with more information on using the databricks labs sandbox command, and installation instructions have been improved. Additionally, there have been dependency updates for go-git libraries and golang.org/x/crypto in the /go-libs and /runtime-packages directories. New commits in this release allow larger logs from acceptance tests and implement experimental OIDC refresh functionality. Ignore conditions have been applied to prevent conflicts with previous versions of the dependency. This update is recommended for users who want to take advantage of the latest bug fixes and improvements.
  • Bump databrickslabs/sandbox from acceptance/v0.3.1 to 0.4.2 (#315). In this release, the databrickslabs/sandbox dependency has been updated from version acceptance/v0.3.1 to 0.4.2. This update includes bug fixes, dependency updates, and additional go-git libraries. Specifically, the Run integration tests job in the GitHub Actions workflow has been updated to use the new version of the databrickslabs/sandbox/acceptance Docker image. The updated version also includes install instructions, usage instructions in the README, and a modification to provide more git-related libraries. Additionally, there were several updates to dependencies, including golang.org/x/crypto version 0.16.0 to 0.17.0. Dependabot, a tool that manages dependencies in GitHub projects, is responsible for the update and provides instructions for resolving any conflicts or merging the changes into the project. This update is intended to improve the functionality and reliability of the databrickslabs/sandbox dependency.
  • Deprecate Row.as_dict() (#309). In this release, we are introducing a deprecation warning for the as_dict() method in the Row class, which will be removed in favor of the asDict() method. This change aims to maintain consistency with Spark's Row behavior and prevent subtle bugs when switching between different backends. The deprecation warning will be implemented using Python's warnings mechanism, including the new annotation in Python 3.13 for static code analysis. The existing functionality of fetching values from the database through StatementExecutionExt remains unchanged. We recommend that clients update their code to use .asDict() instead of .as_dict() to avoid any disruptions. A new test case test_row_as_dict_deprecated() has been added to verify the deprecation warning for Row.as_dict().
  • Minor improvements for .save_table(mode="overwrite") (#298). In this release, the .save_table() method has been improved, particularly when using the overwrite mode. If no rows are supplied, the table will now be truncated, ensuring consistency with the mock backend behavior. This change has been optimized for SQL-based backends, which now perform truncation as part of the insert for the first batch. Type hints on the abstract method have been updated to match the concrete implementations. Unit tests and integration tests have been updated to cover the new functionality, and new methods have been added to test the truncation behavior in overwrite mode. These improvements enhance the consistency and efficiency of the .save_table() method when using overwrite mode across different backends.
  • Updated databrickslabs/sandbox requirement to acceptance/v0.3.0 (#305). In this release, we have updated the requirement for the databrickslabs/sandbox package to version acceptance/v0.3.0 in the downstreams.yml file. This update is necessary to use the latest version of the package, which includes several bug fixes and dependency updates. The databrickslabs/sandbox package is used in the acceptance tests, which are run as part of the CI/CD pipeline. It provides a set of tools and utilities for developing and testing code in a sandbox environment. The changelog for this version includes the addition of install instructions, more git-related libraries, and the modification of the README to include information about how to use it with the databricks labs sandbox command. Specifically, the version of the databrickslabs/sandbox package used in the acceptance job has been updated from acceptance/v0.1.4 to acceptance/v0.3.0, allowing the integration tests to be run using the latest version of the package. The ignore conditions for this PR ensure that Dependabot will resolve any conflicts that may arise and can be manually triggered with the @dependabot rebase command.
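To make the escaping behavior from the first bullet (#316) concrete, here is a stand-alone sketch that mirrors the described behavior rather than the library's exact code; note that the real escape_full_name also has to cope with component names that themselves contain a period, which this naive split does not.

```python
def escape_name(name: str) -> str:
    # Wrap a single SQL identifier in backticks, doubling any embedded backticks.
    return "`" + name.replace("`", "``") + "`"


def escape_full_name(full_name: str) -> str:
    # Escape each dot-separated component of a catalog.schema.table style name.
    return ".".join(escape_name(part) for part in full_name.split("."))


print(escape_full_name("main.de-fault.table 1"))  # `main`.`de-fault`.`table 1`
```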

Dependency updates:

  • Bump actions/checkout from 4.2.0 to 4.2.1 (#304).
  • Updated databrickslabs/sandbox requirement to acceptance/v0.3.0 (#305).
  • Bump databrickslabs/sandbox from acceptance/v0.3.0 to 0.3.1 (#307).
  • Bump actions/checkout from 4.2.1 to 4.2.2 (#310).
  • Bump databrickslabs/sandbox from acceptance/v0.3.1 to 0.4.2 (#315).

0.12.1

  • Bump actions/checkout from 4.1.7 to 4.2.0 (#295). In this version 4.2.0 release of the actions/checkout library, the team has added Ref and Commit outputs, which provide the ref and commit that were checked out, respectively. The update also includes dependency updates to braces, minor-npm-dependencies, docker/build-push-action, and docker/login-action, all of which were automatically resolved by Dependabot. These updates improve compatibility and stability for users of the library. This release is a result of contributions from new team members @yasonk and @lucacome. Users can find a detailed commit history, pull requests, and release notes in the associated links. The team strongly encourages all users to upgrade to this new version to access the latest features and improvements.
  • Set catalog on SchemaDeployer to overwrite the default hive_metastore (#296). In this release, the default catalog for SchemaDeployer has been changed from hive_metastore to a user-defined catalog, allowing for more flexibility in deploying resources to different catalogs. A new dependency, databricks-labs-pytester, has been added with a version constraint of >=0.2.1, which may indicate the introduction of new testing functionality. The SchemaDeployer class has been updated to accept a catalog parameter, and the tests for deploying and deleting schemas, tables, and views have been updated to reflect these changes. The test_deploys_schema, test_deploys_dataclass, and test_deploys_view tests have been updated to accept an inventory_catalog parameter, and the caplog fixture is used to capture log messages and assert that they contain the expected messages. Additionally, a new test function test_statement_execution_backend_overwrites_table has been added to the tests/integration/test_backends.py file to test the functionality of the StatementExecutionBackend class in overwriting a table in the database and retrieving the correct data. Issue #294 has been resolved, and progress has been made on issue #278, but issue #280 has been marked as technical debt and issue #287 is required for the CI to pass.
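The catalog change in the last bullet (#296) boils down to parameterising what used to be hard-coded. The sketch below illustrates that pattern only; the class shape, the executor callable, and the default value are assumptions, not the actual SchemaDeployer API.

```python
class SchemaDeployerSketch:
    """Illustration: the target catalog becomes a constructor argument."""

    def __init__(self, run_sql, schema: str, *, catalog: str = "hive_metastore"):
        self._run_sql = run_sql  # callable that executes a SQL statement
        self._schema = schema
        self._catalog = catalog  # user-defined instead of hard-coded

    def deploy_schema(self) -> None:
        self._run_sql(f"CREATE SCHEMA IF NOT EXISTS {self._catalog}.{self._schema}")


# Deploy into a Unity Catalog catalog rather than the legacy default.
SchemaDeployerSketch(print, "inventory", catalog="my_catalog").deploy_schema()
```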

Dependency updates:

  • Bump actions/checkout from 4.1.7 to 4.2.0 (#295).

0.12.0

  • Added method to detect rows are written to the MockBackend (#292). In this commit, the MockBackend class in the 'backends.py' file has been updated with a new method, 'has_rows_written_for', which allows for differentiation between a table that has never been written to and one with zero rows. This method checks if a specific table has been written to by iterating over the table stubs in the _save_table attribute and returning True if the given full name matches any of the stub full names. Additionally, the class has been supplemented with the rows_written_for method, which takes a table name and mode as input and returns a list of rows written to that table in the given mode. Furthermore, several new test cases have been added to test the functionality of the MockBackend class, including checking if the has_rows_written_for method correctly identifies when there are no rows written, when there are zero rows written, and when rows are written after the first and second write operations. These changes improve the overall testing coverage of the project and aid in testing the functionality of the MockBackend class. The new methods are accompanied by documentation strings that explain their purpose and functionality.
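The distinction this method draws (a table that was never written versus one written with zero rows) can be shown with a minimal sketch; the record-keeping below is simplified and the internal structure is an assumption, not the MockBackend's actual fields.

```python
class MockBackendSketch:
    """Records every save_table call so tests can tell 'never written' from 'zero rows'."""

    def __init__(self) -> None:
        self._writes: list[tuple[str, str, list]] = []  # (full_name, mode, rows)

    def save_table(self, full_name: str, rows: list, mode: str = "append") -> None:
        self._writes.append((full_name, mode, rows))

    def rows_written_for(self, full_name: str, mode: str) -> list:
        # All rows written to the table in the given mode, across every call.
        return [row for name, m, rows in self._writes
                if name == full_name and m == mode
                for row in rows]

    def has_rows_written_for(self, full_name: str) -> bool:
        # True as soon as save_table was called for this table, even with zero rows.
        return any(name == full_name for name, _, _ in self._writes)


backend = MockBackendSketch()
backend.save_table("main.default.empty", [], mode="overwrite")
print(backend.has_rows_written_for("main.default.empty"))    # True: written, zero rows
print(backend.has_rows_written_for("main.default.missing"))  # False: never written
```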

0.11.0

  • Added filter spec implementation (#276). In this commit, a new FilterHandler class has been introduced to handle filter files with the suffix .filter.json, which can parse filter specifications in the header of the filter file and validate the filter columns and types. The commit also adds support for three types of filters: DATE_RANGE_PICKER, MULTI_SELECT, and DROPDOWN, which can be linked with multiple visualization widgets. Additionally, a FilterTile class has been added to the Tile class, which represents a filter tile in the dashboard and includes methods to validate the tile, create widgets, and generate filter encodings and queries. The DashboardMetadata class has been updated to include a new method get_datasets() to retrieve the datasets for the dashboard. These changes enhance the functionality of the dashboard by adding support for filtering data using various filter types and linking them with multiple visualization widgets, improving the customization and interactivity of the dashboard, and making it more user-friendly and efficient.
  • Bugfix: MockBackend wasn't mocking savetable properly when the mode is append (#289). This release includes a bugfix and enhancements for the MockBackend component, which is used to mock the SQLBackend. The .save_table() method failed to accumulate rows in append mode, so each call replaced what had previously been written instead of adding to it. This bug has been addressed, ensuring that rows accumulate correctly in append mode. Additionally, a new test function, test_mock_backend_save_table_overwrite(), has been added to demonstrate the corrected behavior of overwrite mode, showing that it now replaces only the existing rows for the given table while preserving other tables' contents. The type signature for .save_table() has been updated, restricting the mode parameter to accept only two string literals: "append" and "overwrite". The MockBackend behavior has been updated accordingly, and rows are now filtered to exclude any None or NULL values prior to saving. These improvements to the MockBackend functionality and test suite increase reliability when using the MockBackend as a testing backend for the system.
  • Changed filter spec to use YML instead of JSON (#290). In this release, the filter specification files have been converted from JSON to YAML format, providing a more human-readable format for the filter specifications. The schema for the filter file includes flags for column, columns, type, title, description, order, and id, with the type flag taking on values of DROPDOWN, MULTI_SELECT, or DATE_RANGE_PICKER. This change impacts the FilterHandler, is_filter method, and _from_dashboard_folder method, as well as relevant parts of the documentation. Additionally, the parsing methods have been updated to use yaml.safe_load instead of json.loads, and the is_filter method now checks for the .filter.yml suffix. A new file, '00_0_date.filter.yml', has been added to the 'tests/integration/dashboards/filter_spec_basic' directory, containing a sample date filter definition. Furthermore, various tests have been added to validate filter specifications, such as checking for an invalid type and for both column and columns keys being present. These updates aim to enhance readability, maintainability, and ease of use for filter configuration. A short parsing sketch follows this list.
  • Increase testing of generic types storage (#282). A new commit enhances the testing of generic types storage by expanding the test suite to include a list of structs, ensuring more comprehensive testing of the system. The Foo struct has been renamed to Nested for clarity, and two new structs, NestedWithDict and Nesting, have been added. The Nesting struct contains a Nested object, while NestedWithDict includes a string and an optional dictionary of strings. A new test case demonstrates appending complex types to a table by creating and saving a table with two rows, each containing a Nesting struct. The test then fetches the data and asserts the expected number of rows are returned, ensuring the proper functioning of the storage system with complex data types.
  • Minor Changes to avoid redundancy in code and follow code patterns (#279). In this release, we have made significant improvements to the dashboards.py file to make the code more concise, maintainable, and in line with the standard library's recommended usage. The export_to_zipped_csv method has undergone major changes, including the removal of the BytesIO module import and the use of StringIO for handling strings as files. The method no longer creates a separate ZIP file for the CSV files, instead using the provided export_path. Additionally, the method skips tiles that don't contain queries. We have also introduced a new method, dataclass_transform, which transforms a given dataclass into a new one with specific attributes and behavior. This method creates a new dataclass with a custom metaclass and adds a new method, to_dict(), which converts the instances of the new dataclass to dictionaries. These changes promote code reusability and reduce redundancy in the codebase, making it easier for software engineers to work with.
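As referenced in the filter-spec bullet above (#290), parsing and validating a .filter.yml file is straightforward with yaml.safe_load. The sketch below assumes PyYAML is available and uses simplified validation for illustration; it is not the library's actual FilterHandler.

```python
import yaml  # PyYAML, assumed available

ALLOWED_TYPES = {"DROPDOWN", "MULTI_SELECT", "DATE_RANGE_PICKER"}

FILTER_SPEC = """\
column: created_at
type: DATE_RANGE_PICKER
title: Creation date
order: 1
"""


def parse_filter_spec(text: str) -> dict:
    # The changelog notes the move from json.loads to yaml.safe_load.
    spec = yaml.safe_load(text)
    if spec.get("type") not in ALLOWED_TYPES:
        raise ValueError(f"unsupported filter type: {spec.get('type')}")
    if ("column" in spec) == ("columns" in spec):
        raise ValueError("exactly one of 'column' or 'columns' must be set")
    return spec


print(parse_filter_spec(FILTER_SPEC)["type"])  # DATE_RANGE_PICKER
```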

... (truncated)

Commits
  • 383bb80 Release v0.14.0 (#329)
  • 7ba1ca0 Fixed RuntimeBackend exception handling (#328)
  • 4f5ef74 Added nightly tests run at 4:45am UTC (#318)
  • 5bcb6cc Bump codecov/codecov-action from 4 to 5 (#319)
  • 69c6e97 Handle changes from Databricks Python SDK 0.37.0 (#320)
  • 48c287e Release v0.13.0 (#317)
  • 776e1cf Added escape_name function to escape individual SQL names and `escape_full_...
  • aae9ea1 Bump databrickslabs/sandbox from acceptance/v0.3.1 to 0.4.2 (#315)
  • 9aace9e Deprecate Row.as_dict() (#309)
  • 94ed7f0 Bump actions/checkout from 4.2.1 to 4.2.2 (#310)
  • Additional commits viewable in compare view


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
sundarshankar89 commented 1 week ago

@dependabot ignore minor version

dependabot[bot] commented 1 week ago

Sorry, the command you entered is not valid for this pull request. Please check the syntax and try again.

Valid commands: for single-dependency PRs, use commands like:
  • @dependabot ignore this major version
  • @dependabot ignore this minor version
  • @dependabot ignore this dependency

sundarshankar89 commented 1 week ago

@dependabot ignore minor version

dependabot[bot] commented 1 week ago

Sorry, the command you entered is not valid for this pull request. Please check the syntax and try again.

Valid commands: for single-dependency PRs, use commands like:
  • @dependabot ignore this major version
  • @dependabot ignore this minor version
  • @dependabot ignore this dependency

sundarshankar89 commented 1 week ago

@dependabot ignore this minor versio

dependabot[bot] commented 1 week ago

OK, I won't notify you about databricks-labs-lsql again, unless you re-open this PR.