criccomini / pymetastore

A Python Client for Hive Metastore
MIT License
9 stars 1 forks source link

Table Type seems to be ignore on ttypes.Table #37

Closed bandianconde closed 10 months ago

bandianconde commented 10 months ago

Hello,

I tried to use the object ttypes.Table as seen in tests/integration/test_metastore.py When I create a table using ttypes.Table and set tableType to be 'EXTERNAL_TABLE' , and then use get_table the tableType is returned as MANAGED_TABLE. ` with HMS.create(host='localhost', port=9083) as hm: table = ttypes.Tables( tableName='table_name', dbName='db_name', tableType='EXTERNAL_TABLE', ..... ) hm.client.create(table) new_table = hm.client.get_table('db_name', 'table_name') assert new_table.tableType == 'MANAGED_TABLE'

`

bandianconde commented 10 months ago

Hello @criccomini @agrueneberg , Do yo need more details ?

criccomini commented 10 months ago

Lemme take a look right now. :)

criccomini commented 10 months ago

@bandianconde, one point of clarification: are you setting tableType to EXTERNAL_TYPE or EXTERNAL_TABLE?

criccomini commented 10 months ago

Looks like this is a known issue:

https://github.com/recap-build/pymetastore/blob/main/tests/integration/test_metastore.py#L429

criccomini commented 10 months ago

So, some digging. The call stack is as follows:

test_metastore.py line 414:
    table = hms.get_table("test_db", "test_table")

metastore.py line 390:
        table: Table = self.client.get_table(database_name, table_name)

# At this point, we're in code-gen land and have no control over what's happening.
ThriftHiveMetastore.py line: 3271:
        self.send_get_table(dbname, tbl_name)

Later in get_table, starting at line 591, we have:

        if table.tableType is not None:
            if isinstance(table.tableType, str):
                table_type = table.tableType
            else:
                raise TypeError(
                    f"Expected tableType to be str, got {type(table.tableType)}"
                )
        else:
            raise TypeError(
                f"Expected tableType to be str, got {type(table.tableType)}"
            )

I have to believe that tableType is coming back as either None or EXTERNAL_TABLE. This seems like an issue with the metastore AFAICT.

Let me do a little more digging.

criccomini commented 10 months ago

@bandianconde, also, what metastore are you using? Hive? Glue?

And how was the table created? Via pymetastore? Or via some other means?

criccomini commented 10 months ago

This looks relevant: https://github.com/apache/hive/pull/1537

Unfortunately, it's not fixed and closed as stale.

bandianconde commented 10 months ago

@bandianconde, one point of clarification: are you setting tableType to EXTERNAL_TYPE or EXTERNAL_TABLE?

EXTERNAL_TABLE it was a typo sorry

criccomini commented 10 months ago

@bandianconde, also, what metastore are you using? Hive? Glue?

And how was the table created? Via pymetastore? Or via some other means?

@bandianconde ^

bandianconde commented 10 months ago

@bandianconde, also, what metastore are you using? Hive? Glue?

And how was the table created? Via pymetastore? Or via some other means?

I use hive. I bypass pymetastore and use directly ttypes.Table as show in the tests

criccomini commented 10 months ago

What version of HMS?

bandianconde commented 10 months ago

Version 3.0.0

bandianconde commented 10 months ago

@criccomini I have seen here https://github.com/quintoandar/hive-metastore-client/blob/main/hive_metastore_client/hive_metastore_client.py#L204 that they had the same problem and fix it with by adding table table.parameters = {"EXTERNAL": "TRUE"} before creating the table. I tested it and it's working fine

criccomini commented 10 months ago

Nice. That's a good find.

I'm first upgrading the tests to use Hive 3.1.3. I want to see if HMS fixes the issue in a future version. HMS 3.0 is quite old.

criccomini commented 10 months ago

OK, just verified the issue exists even for HMS 3.1.3 Let's move forward with the table.paramaters patch you suggest.

criccomini commented 10 months ago

Working on fix: https://github.com/recap-build/pymetastore/pull/38

bandianconde commented 10 months ago

Thanks, I have seen your PR, it should be fine. However it looks like on some tests you need to add this new parameter as expected

criccomini commented 10 months ago

Yep workin on it =)

criccomini commented 10 months ago

Merged! No new release is needed since this is just a change to the way users should invoke create_table. LMK if you want a release anyway.

bandianconde commented 10 months ago

On my side I don't use a new release as use directly the ttypes.Table. Thanks !!