Teradata / presto

Teradata Distribution of Presto -- A Distributed SQL Query Engine for Big Data
http://www.teradata.com/presto
Apache License 2.0

Release notes for 0.179-t #693

Open cawallin opened 7 years ago

cawallin commented 7 years ago

Generated by `git log 0.179..release-0.179-t --no-decorate --format="- [ ] %an (committed by %cn) %H %s" | sort` + manually dividing up the sections

Akshat Nair

Alan Post

Amruta Gokhale

Andrii Rosa

Andrzej Fiedukowicz

Anton Petrov

Anu Sudarsan

Artur Gajowy

Brian Rickman

Christina Wallin

Grzegorz Kokosinski

Karol Sobczak

Lukasz Osipiuk

Maciej Grzybek

Piotr Findeisen

Piotr Nowojski

Rebecca Schlussel

Sanjay Sharma

Szymon Matejczyk

Wojciech Biela

cawallin commented 7 years ago
Security
--------
* File-based system access control plugin that allows you to specify Kerberos principal matching rules.
* Secure internal cluster communication over HTTPS.
* ``ROLE`` support for the Hive connector, including ``CREATE ROLE``,
  ``DROP ROLE``, ``GRANT ROLE``, ``REVOKE ROLE``, ``SET ROLE``, ``SHOW CURRENT ROLES``,
  ``SHOW ROLES`` and ``SHOW ROLE GRANTS`` commands.

Miscellaneous
-------------
* Support prepared statements that are longer than 4K bytes.

Bug Fixes
---------
* Fix query failure for ``CHAR`` functions :func:`trim`, :func:`rtrim`, and :func:`substr` when the return value would have trailing spaces under ``VARCHAR`` semantics.
losipiuk commented 7 years ago
General Changes
---------------
* ``SHOW STATS`` shows the low and high values for each table column.

Not sure if we want to list that. I am not convinced that the SHOW STATS semantics should stay as they are now, and I would not want users to get too used to them. cc:@findepi

losipiuk commented 7 years ago
General Changes
-----------------
* Improve the performance of joins with only non-equality conditions by using
  a nested loops join instead of a hash join.
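
To illustrate why a nested loops join suits a join with only non-equality conditions, here is a minimal Python sketch. It is not Presto's implementation: the function name and sample data are made up for illustration. A hash join needs an equality key to build its lookup table, so a condition such as `a < b` leaves it nothing to hash on, while a nested loops join simply tests every pair of rows.

```python
def nested_loops_join(left, right, predicate):
    """Pair every left row with every right row; keep pairs that pass the predicate."""
    return [(l, r) for l in left for r in right if predicate(l, r)]


# Hypothetical sample data standing in for two single-column tables.
amounts = [10, 25, 40]
limits = [20, 30]

# Roughly: SELECT ... FROM amounts a JOIN limits l ON a.value < l.value
matches = nested_loops_join(amounts, limits, lambda a, l: a < l)
print(matches)
```

The cost is proportional to the product of the two input sizes, which is the price of not having an equality key to partition on.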

Hive Changes
------------
* Allow partitions without files for bucketed tables (via ``hive.empty-bucketed-partitions.enabled``).
* Allow multiple files per bucket for bucketed tables (via ``hive.multi-file-bucketing.enabled``). Each bucket must have the same number of files, and file names must match the Hive naming convention.

Bug Fixes
--------------
* Fix incorrect results when performing comparisons between values of approximate
  data types (``REAL``, ``DOUBLE``) and columns of certain exact numeric types
  (``INTEGER``, ``BIGINT``, ``DECIMAL``).
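
The note above does not say what caused the incorrect results. As a hedged illustration of one common pitfall in this area (an assumption, not a description of the actual Presto bug), the Python sketch below shows how converting a large exact integer to a double before comparing can make two different values look equal.

```python
from decimal import Decimal

# 2**53 + 1 fits in a 64-bit integer but has no exact double representation.
big = 2**53 + 1
as_double = float(2**53)

# Naive comparison: convert the exact value to a double first (the conversion rounds).
naive_equal = float(big) == as_double

# Exact comparison: compare both values without rounding the integer.
exact_equal = Decimal(big) == Decimal(as_double)

print(naive_equal, exact_equal)
```

The naive comparison reports the values as equal, while the exact comparison reports them as different.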
maciejgrzybek commented 7 years ago
Bug Fixes
---------
* Fix ``EXPLAIN`` plan for tables partitioned on a timestamp column.
* Fix execution of several window functions on array and map types.
  Some window functions taking array or map types (e.g. ``approx_percentile``) did not execute before this patch.

Instrumentation
---------------
* Add ``EXPLAIN ANALYZE VERBOSE`` mode to display low-level information about window function execution.
* Add information about row distribution to ``EXPLAIN ANALYZE``.
rschlussel-zz commented 7 years ago
General Changes
-----------------
* Enable more join predicates to be pushed down to the source tables

Cost-Based Optimizer
----------------------
* V1 of cost-based join reordering. See :doc:`../optimizer/reorder-joins`
* Replace the ``distributed_joins`` property with the ``join_distribution_type`` session property or ``join-distribution-type`` config property. Options are ``AUTOMATIC``, ``REPARTITIONED``, and ``REPLICATED``.
* Replace the ``reorder_joins`` property with the ``join_reordering_strategy`` session property or ``optimizer.join-reordering-strategy`` config property. Options are ``NONE``, ``ELIMINATE_CROSS_JOINS``, and ``COST_BASED``.

Connectors
------------
* Add a TPC-DS connector for generating TPC-DS data on the fly
akshatnair commented 7 years ago
Cost-Based Optimizer
----------------------
* Determine join distribution type based on statistics
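
As a rough illustration of what choosing a join distribution type from statistics could look like, here is a toy Python heuristic. It is an assumption for illustration only and not Presto's actual rule: the function name, the size estimate, and the broadcast limit are all hypothetical. Only the option names `REPLICATED` and `REPARTITIONED` come from this thread.

```python
def choose_join_distribution(build_side_bytes, broadcast_limit_bytes):
    """Toy heuristic (hypothetical, not Presto's rule).

    Replicate the build side to every worker when its estimated size is small,
    otherwise repartition both sides on the join key.
    """
    if build_side_bytes is None:
        # No statistics available: fall back to the conservative choice.
        return "REPARTITIONED"
    if build_side_bytes <= broadcast_limit_bytes:
        return "REPLICATED"
    return "REPARTITIONED"


limit = 100 * 1024 * 1024  # hypothetical 100 MiB broadcast limit
small = choose_join_distribution(5 * 1024 * 1024, limit)
large = choose_join_distribution(10 * 1024 * 1024 * 1024, limit)
unknown = choose_join_distribution(None, limit)
print(small, large, unknown)
```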

Documentation
--------------------
* Query Optimizer
* CLI options
petroav commented 7 years ago
Hive Changes
--------------
* Fix potential native memory leak when writing tables using RCFile.
findepi commented 7 years ago
General Changes
-----------------
* Improve spill support in aggregations
* Support spill in join
rschlussel-zz commented 7 years ago
Bug Fixes
---------
* Fix incorrect empty results for tables filtered on char(x), decimal, date, or timestamp partition columns.
amrutagokhale commented 7 years ago
Hive Changes
------------
* Add a configuration option ``hive.create-non-managed-table-enabled`` that can be used to disable the creation of external Hive tables (default value is ``true``).

General Changes
---------------
* Avoid potentially expensive computation on coordinator by offloading certain plan fragments to worker nodes
alandpost commented 6 years ago
Bug Fixes
---------
* Fix query failure when computing statistics on an unpartitioned table in CDH 5.11
cawallin commented 6 years ago

@arhimondr @sopel39 @kokosing @anusudarsan @ilfrin Please add release notes by tomorrow. There are a couple of things un-checked from both Artur and Piotr N, make sure the user-visible changes (including config name changes) are documented.

anusudarsan commented 6 years ago
Security
---------
* Support for Kerberos secured internal communication

Bug Fixes
-----------
* Fix incorrect results when `optimizer.optimize-metadata-queries` is enabled for queries involving aggregation over `TopN` and `Filter`.
kokosing commented 6 years ago
General
----------
* Make sure that spilled data is not corrupted after unspill.

TPCH connector
--------------
* Expose data column statistics.

TPCDS connector
---------------
* Expose data (row and column) statistics.
cawallin commented 6 years ago

@kokosing in what circumstances was spilled data corrupted? @findepi are there any particular bug fixes or performance differences or anything a user would see between the 167-t version of aggregation spill and 179-t version, or is it mostly refactoring to use revokable memory? @rschlussel and @akshatnair -- please give the actual config parameters and options

arhimondr commented 6 years ago
Security
-----------
* Support LDAP authentication for internal communication
* Support Kerberos authentication for internal communication
* Introduce role management syntax

General
-----------
* Implement distributed sort
* Treat fixed-point literals as ``DECIMAL`` type by default
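
To show why the type given to a fixed-point literal such as `0.1` matters, here is a small Python sketch that uses Python's `float` and `decimal.Decimal` as stand-ins for `DOUBLE` and `DECIMAL`. It illustrates the general difference between binary floating point and exact decimal arithmetic, and is not a statement about Presto internals.

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 are only approximated.
double_sum = 0.1 + 0.2
double_matches = double_sum == 0.3

# Exact decimal arithmetic: the literals keep their written value.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_matches = decimal_sum == Decimal("0.3")

print(double_matches, decimal_matches)
```

With exact decimals the sum compares equal to `0.3`, while the floating-point sum does not.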

Hive Connector
---------------------
* Support role management for Hive connector
kokosing commented 6 years ago

@kokosing in what circumstances was spilled data corrupted?

This is just a safeguard, for example against spilled data being modified by some external process (outside of Presto).

fiedukow commented 6 years ago
Cost-Based Optimizer
----------------------
* Statistics calculated by Presto for query plan stages are shown in `EXPLAIN` query results.
szymonm commented 6 years ago
General
--------
* Remove the `experimental.operator-memory-limit-before-spill` config
  property and the `operator_memory_limit_before_spill` session property.
* Allow configuring the amount of memory that can be used for merging
  spilled aggregation data from disk using the `experimental.aggregation-operator-unspill-memory-limit` config property
  or the `aggregation_operator_unspill_memory_limit` session property.
findepi commented 6 years ago

@findepi are there any particular bug fixes or performance differences or anything a user would see between the 167-t version of aggregation spill and 179-t version, or is it mostly refactoring to use revokable memory?

@cawallin, bugfixes - no. Improvements? I guess it can now perform better, but I don't think we have any numbers to back that.

petroav commented 6 years ago
Bug fixes
---------
* Skip unknown costs in EXPLAIN output.
* Fix query failure when ORDER BY expressions reference columns that are used in the GROUP BY clause by their fully-qualified name.

CLI changes
------------
* Fix an issue that would sometimes prevent queries from being cancelled when exiting from the pager.

SPI changes
------------
* Fix regression that broke serialization of SchemaTableName.
petroav commented 6 years ago
Bug fixes
-----------
* Handle GROUPING when aggregation expressions require implicit coercions.

Hive changes
-----------------
* Ignore partition bucketing if table is not bucketed. This allows dropping the bucketing from table metadata but leaving it for old partitions.
qin7972 commented 6 years ago

Hi, regarding the commit "9d83483 : Consider local exchange in cost calculators": could you tell me why you think so?