trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Release notes for 348 #6100

Closed martint closed 3 years ago

martint commented 3 years ago

Dain Sundstrom

kokosing commented 3 years ago
SPI
----
* Change the `SystemAccessControl#filterColumns` and `ConnectorAccessControl#filterColumns` methods to accept a set of column names and return a set of visible column names. ({issue}`6084`)

6084

kokosing commented 3 years ago
JDBC Connectors
----
* Fix caching of table metadata. Previously, the cache did not store whether a table exists, so existence was always verified against the remote database. ({issue}`6081`)

6081

sopel39 commented 3 years ago
UI
--

* Fix invalid operator stats reporting in stage performance view. ({issue}`6114`)

https://github.com/prestosql/presto/pull/6114

sopel39 commented 3 years ago
General
-------

* Fix `EXPLAIN ANALYZE` for certain queries that contain a broadcast join. ({issue}`6115`)

https://github.com/prestosql/presto/pull/6115

findepi commented 3 years ago
CLI
* Fix rendering of `row` values with unnamed fields. Previously they were printed using fake field names like `field0`, `field1`, etc. ({issue}`4587`)

JDBC
* Change representation of a `row` value. `ResultSet.getObject` now returns an instance of `io.prestosql.jdbc.Row` class, which better represents
  the returned value. Previously a `row` value was represented as a `Map` instance, with unnamed fields being named like `field0`, `field1`, etc. 
  You can access the previous behavior by invoking `getObject(column, Map.class)` on the `ResultSet` object. ({issue}`4588`)

4587 #4588

findepi commented 3 years ago
Kafka
* Allow writing `timestamp with time zone` values into columns using `milliseconds-since-epoch` or `seconds-since-epoch` JSON encoders. ({issue}`6074`)

6074 #5955

sopel39 commented 3 years ago
General
-------
* Fix planning failures for queries that contain filtered aggregations and outer joins. ({issue}`6141`)

https://github.com/prestosql/presto/pull/6141

kokosing commented 3 years ago
General
---
* Add support for OAuth2 authorization in Web UI. ({issue}`5355`)

5355

findepi commented 3 years ago
CLI
- Fix query progress reporting. ({issue}`6119`)

6119

sopel39 commented 3 years ago
General:
* Improve query performance by reducing worker-to-worker communication overhead. ({issue}`6126`)

https://github.com/prestosql/presto/pull/6126

sopel39 commented 3 years ago
General:
* Reduce memory pressure and improve performance of queries that contain a join. ({issue}`6176`)

https://github.com/prestosql/presto/pull/6176

sopel39 commented 3 years ago
Hive:
* Reduce scheduling latency for queries where the scanned files are referenced by symlinks
  and are located in the same directory. ({issue}`6158`)

https://github.com/prestosql/presto/pull/6158

findepi commented 3 years ago
JDBC 
* Fix failure when reading a `timestamp` or `timestamp with time zone` value with second fraction greater than or equal to 999999999500 picoseconds. ({issue}`6147`)

6147 https://github.com/prestosql/presto/pull/6149
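
The threshold in this entry is the point where rounding a picosecond fraction to nanoseconds carries into the next second. A minimal sketch of that arithmetic, assuming half-up rounding; the class and method names are hypothetical and this is not the driver's actual code:

```java
final class PicosRounding {
    static final long PICOS_PER_NANO = 1_000L;
    static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Round a fraction of a second, given in picoseconds, to whole nanoseconds (half up).
    static long roundToNanos(long picosOfSecond) {
        return (picosOfSecond + PICOS_PER_NANO / 2) / PICOS_PER_NANO;
    }

    // Returns {epochSecond, nanosOfSecond}, carrying a rounded-up fraction into the seconds.
    // Assumes a non-negative picosecond fraction below one second.
    static long[] normalize(long epochSecond, long picosOfSecond) {
        long nanos = roundToNanos(picosOfSecond);
        return new long[] {epochSecond + nanos / NANOS_PER_SECOND, nanos % NANOS_PER_SECOND};
    }

    public static void main(String[] args) {
        long[] below = normalize(10, 999_999_999_499L);
        long[] atThreshold = normalize(10, 999_999_999_500L);
        System.out.println(below[0] + " s + " + below[1] + " ns");
        System.out.println(atThreshold[0] + " s + " + atThreshold[1] + " ns");
    }
}
```

Without the carry step, the second input would produce a nanosecond field of 1000000000, which is outside the range that types such as `java.sql.Timestamp` accept.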

martint commented 3 years ago
* Fix incorrect results when correlated subquery in join contains aggregation functions such as `array_agg` or `checksum`. ({issue}`6145`)

https://github.com/prestosql/presto/issues/6145

martint commented 3 years ago
* Add support for `DISTINCT` clause in aggregations within correlated subqueries. ({issue}`5904`)

https://github.com/prestosql/presto/issues/5904

phd3 commented 3 years ago
## SPI Changes:

* Expose catalog names corresponding to the splits through the split completion event of the event listener. ({issue}`6006`)
## Hive Connector:

* Add the deserializer class name to Hive connector split information, which is exposed through the event listener. ({issue}`6006`)

6006

sopel39 commented 3 years ago
Hive
* Improve parallelism for queries where scanned files are referenced by symlinks. ({issue}`6213`)

https://github.com/prestosql/presto/pull/6213

findepi commented 3 years ago
JDBC
* Fix element representation in arrays returned from `ResultSet.getArray`, making it consistent with `ResultSet.getObject`.
  Previously the elements were represented using internal client representation (e.g. `String`). ({issue}`6048`)
* Fix `ResultSetMetaData.getColumnType` for `timestamp with time zone`. Previously the type was miscategorized as `java.sql.Types.TIMESTAMP`. ({issue}`6251`)

6251 https://github.com/prestosql/presto/pull/6208 #6048
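
Client code that dispatches on the JDBC type code may be affected by the `getColumnType` fix. A hedged sketch of such a dispatch, assuming the corrected code is `java.sql.Types.TIMESTAMP_WITH_TIMEZONE` (the entry only states that `TIMESTAMP` was wrong); the class and method names are hypothetical:

```java
import java.sql.JDBCType;
import java.sql.Timestamp;
import java.sql.Types;
import java.time.ZonedDateTime;

final class ColumnTypeDispatch {
    // Picks the Java class to request from ResultSet.getObject(int, Class) for a type code.
    static Class<?> preferredClass(int jdbcType) {
        switch (jdbcType) {
            case Types.TIMESTAMP_WITH_TIMEZONE:
                return ZonedDateTime.class;
            case Types.TIMESTAMP:
                return Timestamp.class;
            default:
                return Object.class;
        }
    }

    public static void main(String[] args) {
        int code = Types.TIMESTAMP_WITH_TIMEZONE;
        System.out.println(JDBCType.valueOf(code).getName() + " -> " + preferredClass(code).getName());
    }
}
```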

findepi commented 3 years ago
JDBC
* Allow reading a `timestamp with time zone` value as `ZonedDateTime` using the `ResultSet.getObject(int column, Class<?> type)` method. ({issue}`307`)

307
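
For illustration, the call described above could be wrapped as follows. This is a sketch only: the helper name is hypothetical, and the `ResultSet` is assumed to come from a query whose column at the given index is a `timestamp with time zone`.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.time.ZonedDateTime;

final class ZonedTimestampReader {
    // Requests the column value as ZonedDateTime through the standard JDBC 4.1
    // typed getter; whether the conversion is supported is up to the driver.
    static ZonedDateTime readZoned(ResultSet resultSet, int column) throws SQLException {
        return resultSet.getObject(column, ZonedDateTime.class);
    }
}
```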

findepi commented 3 years ago
CLI
* Fix failure when an instance of the `SphericalGeography` geospatial type is returned to the client. ({issue}`6238`)

JDBC
* Fix failure when an instance of the `SphericalGeography` geospatial type is returned in the `ResultSet`. ({issue}`6240`)

6238 https://github.com/prestosql/presto/pull/6240

losipiuk commented 3 years ago
General
* Fix duplicate call to the event listener on query completion when the query fails early during preparation. ({issue}`6103`)

6103

phd3 commented 3 years ago
## Iceberg Connector

* Optimize predicate pushdown by using filters on non-partition columns during split generation and table scan. ({issue}`4932`)

4932

losipiuk commented 3 years ago
Hive
* Allow fallback to legacy Hive view translation logic via the `hive.legacy-hive-view-translation` config property or the `legacy_hive_view_translation` session property. ({issue}`6195`)

6195 #5977

Question: should we mention CTEs here?

findepi commented 3 years ago
JDBC
* Fix failure when reading a `time` value with second fraction greater than or equal to 999999999500 picoseconds. ({issue}`6204`)

part of #6204, https://github.com/prestosql/presto/pull/6206

findepi commented 3 years ago
JDBC
* Represent `varbinary` value using hex string representation in `ResultSet.getString`. Previously the return value was useless, similar to `"B@2de82bf8"`. ({issue}`6247`)

6247 https://github.com/prestosql/presto/pull/6246
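
The old output quoted above looks like the default `toString()` of a Java `byte[]`, which encodes the array's identity rather than its contents. A minimal sketch of the difference; `toHex` is a hypothetical helper, and the exact format the driver emits (letter case, any prefix) is not stated in this entry:

```java
final class VarbinaryHex {
    // Hex-encodes the bytes using two lowercase digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder builder = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            builder.append(Character.forDigit((b >> 4) & 0xF, 16));
            builder.append(Character.forDigit(b & 0xF, 16));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        byte[] value = {(byte) 0xCA, (byte) 0xFE, 0x01};
        System.out.println(value.toString()); // identity-based text starting with "[B@"
        System.out.println(toHex(value));     // prints "cafe01"
    }
}
```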

kokosing commented 3 years ago
JDBC Connectors
----
* Consider session properties when caching metadata. ({issue}`6167`)
* Fix invalidation of cache entries, so that entries are removed for all users, not only for the user who ran the query. ({issue}`6167`)

6167

electrum commented 3 years ago
# Iceberg Connector Changes

* Add support for Google Cloud Storage and Azure Storage. ({issue}`6186`)

6186

electrum commented 3 years ago
# Hive Connector Changes

* Allow configuring S3 endpoint in security mapping. ({issue}`3869`)

3869

electrum commented 3 years ago
# Hive Connector Changes

* Verify that data is in the correct bucket file when reading bucketed tables.
  This is enabled by default, as incorrect bucketing can cause incorrect query results,
  but can be disabled using the `hive.validate-bucketing` configuration property
  or the `validate_bucketing` session property. ({issue}`6012`)

6012

electrum commented 3 years ago
# Hive Connector Changes

* Reduce load on metastore when metastore caching is enabled with background refresh
  by removing background refresh for partition metadata and statistics. ({issue}`6101`)

6101

electrum commented 3 years ago
# Hive Connector Changes

* Decrease default refresh thread count for metastore cache. ({issue}`6156`)

6156

electrum commented 3 years ago
# Iceberg Connector Changes

* Remove extra file system stat call when opening files. ({issue}`6174`)

6174

electrum commented 3 years ago
# Hive Connector Changes

* Add support for S3 streaming uploads. Data is uploaded to S3 as it is written, rather
  than staging to a local temporary file. This feature is disabled by default and can be enabled
  using the `hive.s3.streaming.enabled` configuration property. ({issue}`3712`, {issue}`6201`)

3712, #6201

findepi commented 3 years ago
JDBC
* Fix the value of the `DATA_TYPE` column for `time(p)` and `time(p) with time zone` in the result set returned from `DatabaseMetaData#getColumns`. ({issue}`6307`)
* Report the precision of `time(p)`, `time(p) with time zone`, `timestamp(p)` and `timestamp(p) with time zone` in the `DECIMAL_DIGITS` column
  in the result set returned from `DatabaseMetaData#getColumns`. ({issue}`6307`)

https://github.com/prestosql/presto/pull/6307

sopel39 commented 3 years ago
# General
* Fix query failure when views are accessed and the current session does not
  specify a default schema and catalog. ({issue}`6294`)

6294 #6296

findepi commented 3 years ago
JDBC
* Extend `PreparedStatement.setObject(int, Object, int)` to allow setting `time` and `timestamp` values with precision higher than nanoseconds.
  To set a `time` or `timestamp` value, provide a `String` containing the correct SQL literal representation of the value, along with the matching `java.sql.Types`
  type constant. ({issue}`6300`)

6300

dain commented 3 years ago
# General
* Improve performance of top-n queries. ({issue}`6072`)

6072

findepi commented 3 years ago
JDBC
* Accept `java.time.LocalDate` in `PreparedStatement.setObject(int, Object)`. ({issue}`6299`)

6299 #6301

findepi commented 3 years ago
General
* Fix incorrect query results when using `timestamp with time zone` constants with precision higher than 3 that describe the same point in time but in different zones. ({issue}`6318`)

6318 https://github.com/prestosql/presto/pull/6310
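
To make "same point in time but in different zones" concrete, here is a small sketch using `java.time` types with microsecond precision. It illustrates the concept only and says nothing about how the engine represents these constants; the class and method names are hypothetical.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class SameInstantDifferentZones {
    // True when both values denote the same point on the timeline, whatever their zones.
    static boolean sameInstant(ZonedDateTime a, ZonedDateTime b) {
        return a.toInstant().equals(b.toInstant());
    }

    public static void main(String[] args) {
        ZonedDateTime warsaw = ZonedDateTime.of(2020, 12, 14, 13, 0, 0, 123_456_000, ZoneId.of("Europe/Warsaw"));
        ZonedDateTime utc = warsaw.withZoneSameInstant(ZoneId.of("UTC"));
        System.out.println(sameInstant(warsaw, utc)); // true: same instant
        System.out.println(warsaw.equals(utc));       // false: equals() also compares the zone
    }
}
```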

kokosing commented 3 years ago
General
* Support arbitrary query in `SHOW STATS`. ({issue}`3109`)
* Change the output of `SHOW STATS` ({issue}`3109`):
   * The null fraction is normalized to `1` for tables with row count `0`, regardless of the stats returned by the connector.
   * All relation columns are listed, even if duplicated.

3109