catalyst-cooperative / pudl-usage-metrics

A dagster ETL for collecting and cleaning PUDL usage metrics.
MIT License

Update dagster-pandera requirement from ~=0.15.0 to >=0.15,<0.18 #85

Closed: dependabot[bot] closed this PR 1 year ago

dependabot[bot] commented 1 year ago

Updates the requirements on dagster-pandera to permit the latest version.

Changelog

Sourced from dagster-pandera's changelog.

1.1.2 (core) / 0.17.2 (libraries)

Bugfixes

  • In Dagit, assets that had been materialized prior to upgrading to 1.1.1 were showing as "Stale". This is now fixed.
  • Schedules that were constructed with a list of cron strings previously rendered with an error in Dagit. This is now fixed.
  • For users running dagit version >= 1.0.17 (or dagster-cloud) with dagster version < 1.0.17, errors could occur when hitting "Materialize All" and some other asset-related interactions. This has been fixed.

1.1.1 (core) / 0.17.1 (libraries)

Major Changes since 1.0.0 (core) / 0.16.0 (libraries)

Core

  • You can now create multi-dimensional partitions definitions for software-defined assets, through the MultiPartitionsDefinition API. In Dagit, you can filter and materialize certain partitions by providing ranges per-dimension, and view your materializations by dimension.
  • The new asset reconciliation sensor automatically materializes assets that have never been materialized or whose upstream assets have changed since the last time they were materialized. It works with partitioned assets too. You can construct it using build_asset_reconciliation_sensor.
  • You can now add a FreshnessPolicy to any of your software-defined assets, to specify how up-to-date you expect that asset to be. You can view the freshness status of each asset in Dagit, alert when assets miss their targets using the @freshness_policy_sensor, and use build_asset_reconciliation_sensor to make a sensor that automatically kicks off runs to materialize assets based on their freshness policies. (The first sketch after this list combines multi-dimensional partitions, a freshness policy, and the reconciliation sensor.)
  • You can now version your asset ops and source assets to help you track which of your assets are stale. You can do this by assigning op_versions to software-defined assets or observation_fns to SourceAssets. When a set of assets is versioned in this way, their “Upstream Changed” status will be based on whether upstream versions have changed, rather than on whether upstream assets have been re-materialized. You can launch runs that materialize only stale assets.
  • The new @multi_asset_sensor decorator enables defining custom sensors that trigger based on the materializations of multiple assets. The context object supplied to the decorated function has methods to fetch the latest materializations by asset key, as well as built-in cursor management to mark specific materializations as “consumed”, so that they won’t be returned in future ticks. It can also fetch materializations by partition and mark individual partitions as consumed. (The second sketch after this list shows a minimal multi-asset sensor.)
  • RepositoryDefinition now exposes a load_asset_value method, which accepts an asset key and invokes the asset’s I/O manager’s load_input function to load the asset as a Python object. This can be used in notebooks to do exploratory data analysis on assets.
  • With the new asset_selection parameter on @sensor and SensorDefinition, you can now define a sensor that directly targets a selection of assets, instead of targeting a job. (The final sketch after this list pairs this with load_asset_value.)
  • When running dagit or dagster-daemon locally, environment variables included in a .env file in the form KEY=value in the same folder as the command will be automatically included in the environment of any Dagster code that runs, allowing you to easily use environment variables during local development.
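
As a first, purely illustrative sketch of the Core APIs named above (not code from this repo or PR; the asset name, dimension names, and dates are hypothetical), here is how MultiPartitionsDefinition, FreshnessPolicy, and build_asset_reconciliation_sensor might fit together under the 1.1 API:

```python
from dagster import (
    AssetSelection,
    DailyPartitionsDefinition,
    FreshnessPolicy,
    MultiPartitionsDefinition,
    StaticPartitionsDefinition,
    asset,
    build_asset_reconciliation_sensor,
)

# One partition per (date, region) pair -- Dagit can filter and
# materialize ranges along either dimension.
date_by_region = MultiPartitionsDefinition(
    {
        "date": DailyPartitionsDefinition(start_date="2022-01-01"),
        "region": StaticPartitionsDefinition(["us", "eu"]),
    }
)


@asset(
    partitions_def=date_by_region,
    # Flag the asset as missing its freshness target in Dagit if it is
    # more than an hour out of date relative to upstream data.
    freshness_policy=FreshnessPolicy(maximum_lag_minutes=60),
)
def usage_metrics():  # hypothetical asset name
    ...


# Materialize assets that have never been materialized or whose
# upstreams have changed since their last materialization.
reconcile = build_asset_reconciliation_sensor(
    asset_selection=AssetSelection.all(),
    name="reconcile_all_assets",
)
```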
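
A second sketch, for @multi_asset_sensor. The upstream assets, job name, and trigger condition are all hypothetical; the cursor calls are the built-in cursor management the changelog describes:

```python
from dagster import (
    AssetKey,
    RunRequest,
    asset,
    define_asset_job,
    multi_asset_sensor,
)


@asset
def asset_a():  # hypothetical upstream assets
    ...


@asset
def asset_b():
    ...


downstream_job = define_asset_job("downstream_job", selection="*")


@multi_asset_sensor(
    asset_keys=[AssetKey("asset_a"), AssetKey("asset_b")],
    job=downstream_job,
)
def both_fresh_sensor(context):
    # Latest unconsumed materialization record for each monitored asset.
    records = context.latest_materialization_records_by_key()
    if all(records.values()):
        # Mark these materializations as consumed so they are not
        # returned again on future ticks.
        context.advance_all_cursors()
        yield RunRequest(run_key=None)
```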
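
Finally, a sketch pairing the new asset_selection parameter on @sensor with RepositoryDefinition.load_asset_value (again, the asset, sensor, and repository names are placeholders):

```python
from dagster import AssetKey, AssetSelection, RunRequest, asset, repository, sensor


@asset
def usage_counts():  # hypothetical asset
    return {"a": 3, "b": 5}


# A sensor that targets a selection of assets directly, with no job
# in between.
@sensor(asset_selection=AssetSelection.keys("usage_counts"))
def refresh_usage_counts():
    yield RunRequest(run_key=None)


@repository
def my_repo():
    return [usage_counts, refresh_usage_counts]


# e.g. in a notebook: load the most recent materialization of the asset
# as a Python object via its I/O manager.
value = my_repo.load_asset_value(AssetKey("usage_counts"))
```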

Dagit

  • The Asset Graph has been redesigned to make better use of color to communicate asset health. New status indicators make it easy to spot missing and stale assets (even on large graphs!) and the UI updates in real-time as displayed assets are materialized.
  • The Asset Details page has been redesigned and features a new side-by-side UI that makes it easier to inspect event metadata. A color-coded timeline on the partitions view allows you to drag-select a time range and inspect the metadata and status quickly. The new view also supports assets that have been partitioned across multiple dimensions.
  • The new Workspace page helps you quickly find and navigate between all your Dagster definitions. It’s also been re-architected to load significantly faster when you have thousands of definitions.
  • The Overview page is the new home for the live run timeline and helps you understand the status of all the jobs, schedules, sensors, and backfills across your entire deployment. The timeline is now grouped by repository and shows a run status rollup for each group.

Integrations

  • dagster-dbt now supports generating software-defined assets from your dbt Cloud jobs.
  • dagster-airbyte and dagster-fivetran now support automatically generating assets from your ETL connections using load_assets_from_airbyte_instance and load_assets_from_fivetran_instance.
  • New dagster-duckdb integration: build_duckdb_io_manager allows you to build an I/O manager that stores and loads Pandas and PySpark DataFrames in DuckDB. (A sketch follows this list.)
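
A minimal sketch of the DuckDB I/O manager, assuming the companion dagster-duckdb-pandas package for the Pandas type handler; the asset and database path are hypothetical:

```python
import pandas as pd
from dagster import asset, repository, with_resources
from dagster_duckdb import build_duckdb_io_manager
from dagster_duckdb_pandas import DuckDBPandasTypeHandler

# An I/O manager that writes Pandas DataFrames to DuckDB tables and
# reads them back on load.
duckdb_io_manager = build_duckdb_io_manager([DuckDBPandasTypeHandler()])


@asset
def daily_hits() -> pd.DataFrame:  # hypothetical asset
    return pd.DataFrame({"user": ["a", "b"], "hits": [3, 5]})


@repository
def duckdb_repo():
    return with_resources(
        [daily_hits],
        resource_defs={
            "io_manager": duckdb_io_manager.configured(
                {"database": "usage_metrics.duckdb"}
            )
        },
    )
```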

Database migration

  • Optional database schema migration, which can be run via dagster instance migrate:
    • Improves Dagit performance by adding database indexes which should speed up the run view as well as a range of asset-based queries.
    • Enables multi-dimensional asset partitions and asset versioning.

Breaking Changes and Deprecations

  • define_dagstermill_solid, a legacy API, has been removed from dagstermill. Use define_dagstermill_op or define_dagstermill_asset instead to create an op or asset from a Jupyter notebook, respectively.
  • The internal ComputeLogManager API is marked as deprecated in favor of an updated interface: CapturedLogManager. It will be removed in 1.2.0. This should only affect dagster instances that have implemented a custom compute log manager.

Dependency Changes

  • dagster-graphql and dagit now use version 3 of graphene.

... (truncated)

Commits


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
dependabot[bot] commented 1 year ago

OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting @dependabot ignore this major version or @dependabot ignore this minor version. You can also ignore all major, minor, or patch releases for a dependency by adding an ignore condition with the desired update_types to your config file.

If you change your mind, just re-open this PR and I'll resolve any conflicts on it.