datahub-project / datahub

The Metadata Platform for your Data and AI Stack
https://datahubproject.io
Apache License 2.0

Materialized View is checked on 2.x Hive in integration tests #11894

Open Irillit opened 2 days ago

Irillit commented 2 days ago

Describe the bug

The SQL used in the integration tests contains the following line:

CREATE MATERIALIZED VIEW db1.struct_test_view_materialized as select * from db1.struct_test;

I believe it doesn't actually create a materialized view: the docker-compose file uses Hive 2.x, and according to the Hive documentation, materialized views were introduced in Hive 3.0.0.

In Hive 2.x the word "Materialized" is probably ignored. In 3.x or 4.x the statement will cause the container to fail, because materialized views there require transactional tables stored in a specific format.

To Reproduce

In 2.x the tests pass even though the functionality being tested hadn't been introduced yet.

Steps to reproduce the behavior in 3.x:

  1. Run a Hive container with version 3.x or later.
  2. Run the integration tests' SQL file on that instance.
  3. You'll get the error: Automatic rewriting for materialized view cannot be enabled if the materialized view uses non-transactional tables.

Expected behavior

All the SQL statements should complete successfully on version 3.x+. On version 2.x the SQL shouldn't contain the word "Materialized", as that feature hadn't been introduced yet.

To solve that problem

  1. In 2.x, the word "Materialized" should be removed.
  2. For 3.x+ you have two choices:
     2.1. Do not check the materialized view.
     2.2. Create a database with transactional tables and create a materialized view based on them.
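
A rough sketch of what options 1 and 2.2 could look like in the setup SQL. The table name db1.struct_test_txn is hypothetical, and the 3.x+ part assumes the Hive instance has ACID transactions enabled and that the transactional copy is stored as ORC; I haven't run this against the test containers.

```sql
-- Option 1 (Hive 2.x): a plain view, with the MATERIALIZED keyword removed.
CREATE VIEW db1.struct_test_view_materialized AS
SELECT * FROM db1.struct_test;

-- Option 2.2 (Hive 3.x+): build the materialized view on a transactional table.
-- db1.struct_test_txn is a hypothetical name, not something in the current test SQL.
-- Assumes ACID is enabled on the server (e.g. hive.support.concurrency=true and
-- hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager).
CREATE TABLE db1.struct_test_txn
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true')
AS SELECT * FROM db1.struct_test;

CREATE MATERIALIZED VIEW db1.struct_test_view_materialized AS
SELECT * FROM db1.struct_test_txn;
```

Only one of the two variants would be used, depending on the Hive version in docker-compose. Since the error is specifically about automatic rewriting, adding DISABLE REWRITE to the CREATE MATERIALIZED VIEW statement may also avoid it on 3.x+, but that is an assumption on my side.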