SETL-Framework / setl

A simple Spark-powered ETL framework that just works 🍺
Apache License 2.0

chore(deps): bump delta-core_2.12 from 1.1.0 to 2.0.0 #264

Closed: dependabot[bot] closed this 2 years ago

dependabot[bot] commented 2 years ago

Bumps delta-core_2.12 from 1.1.0 to 2.0.0.
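For reference, the dependency change amounts to a one-line version bump. A hypothetical build-file fragment (the exact file and layout in this repository are assumptions; the coordinates come from the bump itself):

```xml
<!-- Hypothetical pom.xml fragment; artifact coordinates taken from this bump -->
<dependency>
  <groupId>io.delta</groupId>
  <artifactId>delta-core_2.12</artifactId>
  <version>2.0.0</version> <!-- previously 1.1.0 -->
</dependency>
```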

Release notes

Sourced from delta-core_2.12's releases.

Delta Lake 2.0.0

We are excited to announce the release of Delta Lake 2.0.0 on Apache Spark 3.2.

The key features in this release are as follows.

  • Support Change Data Feed on Delta tables. Change Data Feed represents the row-level changes between different versions of the table. When enabled, additional information about row-level changes is recorded for every write operation on the table. See the documentation for more details.

  • Support Z-Order clustering of data to reduce the amount of data read. Z-Ordering is a technique to colocate related information in the same set of files. This data clustering allows column stats (released in Delta 1.2) to be more effective in skipping data based on filters in a query. See the documentation for more details.

  • Support for idempotent writes to Delta tables to enable fault-tolerant retry of Delta table writing jobs without writing the data multiple times to the table. See the documentation for more details.

  • Support for dropping columns in a Delta table as a metadata change operation. This command drops the column from metadata and not the column data in underlying files. See documentation for more details.

  • Support for dynamic partition overwrite. Overwrite only the partitions with data written into them at runtime. See documentation for details.

  • Experimental support for multi-part checkpoints to split the Delta Lake checkpoint into multiple parts, speeding up both writing and reading of checkpoints. See documentation for more details.

  • Python and Scala API support for OPTIMIZE file compaction and Z-ordering.

  • Other notable changes

    • Improve generated column data skipping by adding support for skipping on generated columns that reference nested columns.
    • Improve table schema validation by blocking unsupported data types in Delta Lake.
    • Support creating a Delta Lake table with an empty schema.
    • Change the behavior of DROP CONSTRAINT to throw an error when the constraint does not exist. Previously the command returned silently.
    • Fix symlink manifest generation when partition values contain spaces.
    • Fix an issue where incorrect commit stats were collected.
    • Support SimpleAWSCredentialsProvider and TemporaryAWSCredentialsProvider in the LogStore used for S3 multi-cluster writes.
    • Fix an issue where generated columns rejected null values in the insert DataFrame even when the column was nullable.
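Several of the features above surface directly in the Spark/Scala APIs that SETL builds on. A minimal sketch, assuming a local Spark 3.2 session with the Delta extensions configured; the table path and the `eventType` column are hypothetical:

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object DeltaTwoZeroSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("delta-2.0-feature-sketch")
      .master("local[*]")
      // Delta Lake configuration required on Apache Spark 3.2
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .config("spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog")
      .getOrCreate()

    val path = "/tmp/delta/events" // hypothetical table location

    // Enable Change Data Feed on an existing table
    spark.sql(
      s"ALTER TABLE delta.`$path` SET TBLPROPERTIES (delta.enableChangeDataFeed = true)")

    // Read the row-level changes recorded since a given version
    val changes = spark.read.format("delta")
      .option("readChangeFeed", "true")
      .option("startingVersion", 1)
      .load(path)

    // OPTIMIZE file compaction with Z-ordering (new Scala API in 2.0.0)
    DeltaTable.forPath(spark, path)
      .optimize()
      .executeZOrderBy("eventType")

    // Dynamic partition overwrite: only partitions receiving data are replaced
    changes.write.format("delta")
      .mode("overwrite")
      .option("partitionOverwriteMode", "dynamic")
      .save(path)
  }
}
```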

Benchmark Framework Update

Independent of this release, we have improved the framework for writing large-scale performance benchmarks (the initial version was added in 1.2.0) and added support for running benchmarks on Google Cloud Platform using Google Dataproc, in addition to the existing support for EMR on AWS.

Credits

Adam Binford, Alkis Evlogimenos, Allison Portis, Ankur Dave, Bingkun Pan, Burak Yilmaz, Chang Yong Lik, Chen Qingzhi, Denny Lee, Eric Chang, Felipe Pessoto, Fred Liu, Fu Chen, Gaurav Rupnar, Grzegorz Kołakowski, Hussein Nagree, Jacek Laskowski, Jackie Zhang, Jiaan Geng, Jintao Shen, Jintian Liang, John O'Dwyer, Junyong Lee, Kam Cheung Ting, Karen Feng, Koert Kuipers, Lars Kroll, Liwen Sun, Lukas Rupprecht, Max Gekk, Michael Mengarelli, Min Yang, Naga Raju Bhanoori, Nick Grigoriev, Nick Karpov, Ole Sasse, Patrick Grandjean, Peng Zhong, Prakhar Jain, Rahul Shivu Mahadev, Rajesh Parangi, Ruslan Dautkhanov, Sabir Akhadov, Scott Sandre, Serge Rielau, Shixiong Zhu, Shoumik Palkar, Tathagata Das, Terry Kim, Tyson Condie, Venki Korukanti, Vini Jaiswal, Wenchen Fan, Xinyi, Yijia Cui, Yousry Mohamed

Delta Lake 2.0.0 Preview

We are excited to announce the preview release of Delta Lake 2.0.0 on Apache Spark 3.2. Similar to Apache Spark™, we have released Maven artifacts for both Scala 2.12 and Scala 2.13.

... (truncated)

Commits
  • ee945c0 Setting version to 2.0.0
  • 6726d85 Set version to 2.0.0-SNAPSHOT
  • 14fdcdd Update version in integration tests
  • 09147c3 [Delta] Accept LogStore conf keys with and without the "spark." prefix
  • 3996cea Fix incomplete SQL conf keys in DeltaErrors
  • 7e1acef [Delta] Metric tests for merge
  • c86b0f6 [ZOrder] Fast approach of interleaving bits
  • 91f0650 Add support of hadoop-aws s3a SimpleAWSCredentialsProvider to S3DynamoDBLogStore
  • 86ae53b Nullable columns should work when using generated columns
  • 2ff7cc7 Fix broken link in PROTOCOL.md file
  • Additional commits viewable in compare view


Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
codecov[bot] commented 2 years ago

Codecov Report

Merging #264 (6cbc9de) into master (a170c57) will increase coverage by 0.04%. The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #264      +/-   ##
==========================================
+ Coverage   97.92%   97.97%   +0.04%     
==========================================
  Files          63       63              
  Lines        2027     2027              
  Branches      125      125              
==========================================
+ Hits         1985     1986       +1     
+ Misses         42       41       -1     
Flag              Coverage Δ
----------------  ---------------
master_2.11_2.4   ?
master_2.12_3.2   ?
pr_2.11_2.3       91.95% <ø> (?)
pr_2.11_2.4       97.63% <ø> (?)
pr_2.12_2.4       97.61% <ø> (?)
pr_2.12_3.0       97.76% <ø> (?)
pr_2.12_3.2       97.76% <ø> (?)

Flags with carried-forward coverage won't be shown.

Impacted Files                                         Coverage Δ
...o/github/setl/storage/SparkRepositoryBuilder.scala  98.30% <0.00%> (+1.69%) ↑


dependabot[bot] commented 2 years ago

Superseded by #273.