trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Release notes for 342 #5111

Closed martint closed 4 years ago

martint commented 4 years ago

Dain Sundstrom

findepi commented 4 years ago
General
* Fix query failure when lambda expression references a table column containing a dot. ({issue}`5087`)

https://github.com/prestosql/presto/pull/5087

sopel39 commented 4 years ago
Hive
* Add property (``hive.dynamic-filtering-probe-blocking-timeout``) for delaying table scans
  until dynamic partition pruning can be performed more efficiently. ({issue}`4991`)

https://github.com/prestosql/presto/pull/4991

sopel39 commented 4 years ago
General
* Improve performance of queries that use decimal type. ({issue}`4886`)

https://github.com/prestosql/presto/pull/4886

findepi commented 4 years ago
Atop Connector Changes
* Fix incorrect query results when query contains predicates on `start_time` or `end_time` column. ({issue}`5125`)

https://github.com/prestosql/presto/pull/5125

sopel39 commented 4 years ago
SPI
* Make dynamic filter futures resilient to cancellation. ({issue}`5099`)

https://github.com/prestosql/presto/pull/5099

sopel39 commented 4 years ago
General
* Improve query performance by adding support for dynamic filtering and dynamic
  partition pruning to semi-join relational operator. ({issue}`5017`)

https://github.com/prestosql/presto/pull/5017

losipiuk commented 4 years ago
Kafka
* Expose message headers as a ``_headers`` column of ``map(VARCHAR, array(VARBINARY))`` type. ({issue}`4462`)

https://github.com/prestosql/presto/pull/4462
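For readers wondering why the column is a map to an array rather than a map to a single value: Kafka allows the same header key to appear more than once in a record. The sketch below is only an illustration of that shape in Python, not connector code; `headers_to_map` and the sample header names are hypothetical.

```python
from collections import defaultdict


def headers_to_map(headers):
    """Group (key, value) header pairs into key -> list of byte values.

    Mirrors the shape map(VARCHAR, array(VARBINARY)): one entry per
    header key, holding every value seen for that key in record order.
    """
    grouped = defaultdict(list)
    for key, value in headers:
        grouped[key].append(value)
    return dict(grouped)


# Hypothetical record headers; "tag" appears twice on purpose.
raw_headers = [("trace-id", b"abc"), ("tag", b"x"), ("tag", b"y")]
print(headers_to_map(raw_headers))
```

A repeated key keeps all of its values, which a plain `map(VARCHAR, VARBINARY)` could not represent.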

sopel39 commented 4 years ago
SPI
* Add ``DynamicFilter#isAwaitable`` method that returns whether the dynamic filter is incomplete and can
  still be awaited via a future. ({issue}`5043`)

https://github.com/prestosql/presto/pull/5043

losipiuk commented 4 years ago
PostgreSQL
* Extend type mapping to support variable-precision ``TIMESTAMP`` and ``TIMESTAMP WITH TIME ZONE`` types. ({issue}`5124`, {issue}`5105`)

https://github.com/prestosql/presto/pull/5124 https://github.com/prestosql/presto/pull/5105

ebyhr commented 4 years ago
SQL Server
* Fix failure when inserting `NULL` into a `VARBINARY` column. ({issue}`4846`)

https://github.com/prestosql/presto/pull/4846

losipiuk commented 4 years ago
Kafka
* Add write support for ``TIME``, ``TIME WITH TIME ZONE``, ``TIMESTAMP`` and ``TIMESTAMP WITH TIME ZONE``
  to the Kafka connector when the JSON encoder is in use. ({issue}`4743`)

https://github.com/prestosql/presto/pull/4743

sopel39 commented 4 years ago
General/SPI
* Enable connectors to wait for dynamic filters derived from replicated join before generating splits. ({issue}`4685`)

https://github.com/prestosql/presto/pull/4685

ebyhr commented 4 years ago
MySQL
* Improve performance of `INSERT` statements when the MySQL instance is not running in GTID mode. ({issue}`4995`)

https://github.com/prestosql/presto/pull/4995

sopel39 commented 4 years ago
General
* Improve dynamic partition pruning and query performance by reducing latency of dynamic filters collection. ({issue}`4988`)

https://github.com/prestosql/presto/pull/4988

findepi commented 4 years ago
Hive
* Disable matching the existing user and group of the table or partition when creating new files on HDFS.
  The functionality was added in 341 and is now disabled by default. You can enable it with the
  `hive.fs.new-file-inherit-ownership` configuration property. ({issue}`5187`)

https://github.com/prestosql/presto/pull/5187

losipiuk commented 4 years ago
Hive
* Allow specifying what happens when data is inserted into an existing Hive partition.
  This can be done using the ``hive.insert-existing-partitions-behavior`` configuration property. ({issue}`4999`)

https://github.com/prestosql/presto/pull/4999
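To make the choice concrete, here is a toy Python model of the three behaviors. It assumes the accepted values are `APPEND`, `OVERWRITE` and `ERROR`; those names are not stated in the note above, so check them against the Hive connector docs. `insert_rows` and the partition key are hypothetical, and none of this is connector code.

```python
# Assumed property values; not taken from the release note itself.
BEHAVIORS = {"APPEND", "OVERWRITE", "ERROR"}


def insert_rows(partitions, key, rows, behavior="APPEND"):
    """Toy model: `partitions` maps a partition key to its list of rows."""
    if behavior not in BEHAVIORS:
        raise ValueError(f"unknown behavior: {behavior}")
    if key not in partitions:
        # A new partition is simply created, whatever the behavior.
        partitions[key] = list(rows)
    elif behavior == "APPEND":
        partitions[key].extend(rows)
    elif behavior == "OVERWRITE":
        partitions[key] = list(rows)
    else:  # ERROR
        raise ValueError(f"partition {key!r} already exists")
    return partitions


print(insert_rows({"ds=2020-09-01": [1]}, "ds=2020-09-01", [2], "APPEND"))
print(insert_rows({"ds=2020-09-01": [1]}, "ds=2020-09-01", [2], "OVERWRITE"))
```

The behavior only matters when the target partition already exists; inserts into a new partition look the same in all three modes.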

sopel39 commented 4 years ago
General
* Improve join performance when cost-based optimizer has missing or inaccurate stats. ({issue}`5141`)

https://github.com/prestosql/presto/pull/5141

findepi commented 4 years ago
## SQL Server Connector Changes

* Improve performance of aggregation queries by computing aggregations within the SQL Server database.
  Currently, the following aggregate functions are eligible for pushdown:
  ``count``, ``min``, ``max``, ``sum`` and ``avg``. ({issue}`4139`)

https://github.com/prestosql/presto/issues/4139 https://github.com/prestosql/presto/pull/5196

mosabua commented 4 years ago

The SQL Server connector changes from @findepi above should be replaced with a short sentence that links to the docs, like

* Add :ref:`aggregate function pushdown <sqlserver-pushdown>` as performance improvement ({issue}`4139`)

see https://github.com/prestosql/presto/pull/5245

losipiuk commented 4 years ago
Azure
* Add support for ABFS OAuth authentication ({issue}`5052`)

https://github.com/prestosql/presto/pull/5052

losipiuk commented 4 years ago
Kafka
* Drop JSON decoder support for temporal types with nonsensical combinations of input-format-type and data-type.
  The following combinations are no longer supported:
  - ``rfc2822``: ``DATE``, ``TIME``, ``TIME WITH TIME ZONE``
  - ``milliseconds-since-epoch``: ``TIME WITH TIME ZONE``, ``TIMESTAMP WITH TIME ZONE``
  - ``seconds-since-epoch``: ``TIME WITH TIME ZONE``, ``TIMESTAMP WITH TIME ZONE``
  ({issue}`4743`)

https://github.com/prestosql/presto/pull/4743
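For anyone auditing existing topic definitions against this list, the dropped combinations can be written as a small lookup. This is a sketch in Python for checking your own definitions, not how the decoder does it; `DROPPED` and `is_dropped_combination` are hypothetical names.

```python
# Format / type pairs the note above lists as no longer supported.
DROPPED = {
    "rfc2822": {"DATE", "TIME", "TIME WITH TIME ZONE"},
    "milliseconds-since-epoch": {"TIME WITH TIME ZONE", "TIMESTAMP WITH TIME ZONE"},
    "seconds-since-epoch": {"TIME WITH TIME ZONE", "TIMESTAMP WITH TIME ZONE"},
}


def is_dropped_combination(data_format, data_type):
    """Return True if the JSON decoder no longer accepts this pair."""
    return data_type.upper() in DROPPED.get(data_format, set())


print(is_dropped_combination("rfc2822", "DATE"))
print(is_dropped_combination("rfc2822", "TIMESTAMP"))
```

Formats that are not in the table are treated as unaffected, since the note only names these three.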

dain commented 4 years ago
## Hive
* Add support for S3 encrypted files. ({issue}`2536`)
* Improve performance of reading small files in RCFile format. ({issue}`2536`)

2536

sopel39 commented 4 years ago
General
* Reduce latency for queries where a broadcast join is used and the broadcast table is large. ({issue}`5237`)

https://github.com/prestosql/presto/pull/5237

findepi commented 4 years ago
Hive
* Support reading timestamps with microsecond or nanosecond precision. This can be enabled with the
  `hive.timestamp-precision` connector configuration property. ({issue}`4953`)

https://github.com/prestosql/presto/pull/4953 part of https://github.com/prestosql/presto/issues/3977

electrum commented 4 years ago
# Hive Connector Changes

* Improve performance when reading `JSON` and `CSV` file formats. ({issue}`5142`)

5142

electrum commented 4 years ago
# Hive Connector Changes

* Improve planning time for queries with non-equality filters on
  partition columns when using the Glue metastore. ({issue}`5060`)

5060

electrum commented 4 years ago
# Iceberg Connector Changes

* Fix partition transforms for temporal columns for dates before 1970. ({issue}`5273`)

5273
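The note does not say what went wrong, so the following is only an illustration of how pre-1970 values commonly break epoch-based transforms, not a description of the actual fix. It assumes a day transform that counts days since 1970-01-01: integer division that truncates toward zero puts the last microsecond of 1969 into day 0, while floor division gives day -1. Function names are hypothetical.

```python
MICROS_PER_DAY = 86_400_000_000


def day_transform_floor(micros):
    """Days since 1970-01-01, rounding toward negative infinity."""
    return micros // MICROS_PER_DAY


def day_transform_truncating(micros):
    """Same division, but rounding toward zero (C/Java-style '/')."""
    quotient = abs(micros) // MICROS_PER_DAY
    return -quotient if micros < 0 else quotient


# One microsecond before the epoch, i.e. 1969-12-31T23:59:59.999999 UTC.
before_epoch = -1
print(day_transform_floor(before_epoch))
print(day_transform_truncating(before_epoch))
```

For non-negative inputs the two functions agree, which is why this kind of bug only shows up for dates before 1970.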

sopel39 commented 4 years ago
General
* Allow collection of dynamic filters for joins with large build side using the
  `enable-large-dynamic-filters` configuration property or the `enable_large_dynamic_filters`
  session property.
  The existing configuration properties `dynamic-filtering-max-per-driver-row-count`,
  `dynamic-filtering-max-per-driver-size`, `dynamic-filtering-range-row-limit-per-driver`
  and their corresponding session properties are now defunct.
  When large dynamic filters are enabled, limits on size of dynamic filters can be configured
  for each join distribution type using the configuration properties
  `dynamic-filtering.large-broadcast.max-distinct-values-per-driver`,
  `dynamic-filtering.large-broadcast.max-size-per-driver` and
  `dynamic-filtering.large-broadcast.range-row-limit-per-driver` and their equivalents for the partitioned
  join distribution type.
  Similarly, limits for dynamic filters when `enable-large-dynamic-filters` is not enabled
  can be configured using configuration properties like
  `dynamic-filtering.small-partitioned.max-distinct-values-per-driver`. ({issue}`5262`)

https://github.com/prestosql/presto/pull/5262

mosabua commented 4 years ago

This is way too long @sopel39 .. please move this into the docs and then link to it