Python SQLAlchemy Inspector Error ( MySQL Dialect ) with greptime - TypeError: NullType() takes no arguments

atul-r commented 2 months ago

What type of bug is this?

Incorrect result

What subsystems are affected?

Standalone mode

Minimal reproduce step

I am encountering an issue when using SQLAlchemy's inspector to retrieve column details from a Greptime database using the MySQL dialect. Below is the code snippet I am using:

from sqlalchemy import create_engine, inspect

# GreptimeDB's MySQL-protocol endpoint (port 4002 in this setup)
conn_string = "mysql://root:password@127.0.0.1:4002/somedb"

engine = create_engine(conn_string)

# Reflect the columns of table 'app_logs' in schema 'somedb'
inspector = inspect(engine)
columns = inspector.get_columns('app_logs', 'somedb')
for column in columns:
    print(column)

When I run this code, I receive the following error:

TypeError: NullType() takes no arguments

Analysis: During reflection, SQLAlchemy issues a SHOW CREATE TABLE query to retrieve the table schema. It then parses the returned string using regex to extract the table name and column details.

Greptime returns the following CREATE TABLE statement for the table:

CREATE TABLE IF NOT EXISTS `app_logs` (
  `ts` TIMESTAMP(3) NOT NULL,
  `host` STRING NULL,
  `api_path` STRING NULL FULLTEXT WITH(analyzer = 'English', case_sensitive = 'false'),
  `log_level` STRING NULL,
  `log` STRING NULL FULLTEXT WITH(analyzer = 'English', case_sensitive = 'false'),
  TIME INDEX (`ts`),
  PRIMARY KEY (`host`, `log_level`)
)
ENGINE=mito
WITH(
  append_mode = 'true'
)
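To make the parsing step concrete, here is a minimal sketch of how a column line from the statement above could be split into a name, a type name, and type arguments. The regex and the `split_column` helper are illustrative assumptions; they are much simpler than the patterns SQLAlchemy's MySQL reflection parser actually uses.

```python
import re

# Simplified, hypothetical pattern: a backtick-quoted column name, a type
# name, and optional parenthesized arguments. Not SQLAlchemy's real regex.
COLUMN_LINE = re.compile(
    r"^\s*`(?P<name>[^`]+)`\s+(?P<type>\w+)(?:\((?P<args>[^)]*)\))?"
)


def split_column(line):
    """Return (name, type, args) for a column line, or None for other lines."""
    match = COLUMN_LINE.match(line)
    if match is None:
        return None
    return match.group("name"), match.group("type"), match.group("args")


# Two column lines and one non-column line taken from the DDL above.
for ddl_line in (
    "  `ts` TIMESTAMP(3) NOT NULL,",
    "  `host` STRING NULL,",
    "  TIME INDEX (`ts`),",
):
    print(split_column(ddl_line))
```

The type name is extracted exactly as Greptime printed it, so whatever casing the server uses is what the later type lookup receives.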

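The issue seems to stem from two things. First, the data types are returned in uppercase (e.g., STRING, TIMESTAMP). Second, some data types do not exist in MySQL at all; for example, MySQL has no STRING data type. The MySQL dialect in SQLAlchemy expects lowercase type names and does not recognize the uppercase ones (see the "Did not recognize type 'TIMESTAMP'" warning in the log below). It then appears to fall back to NullType and pass it the type arguments, such as the 3 in TIMESTAMP(3), which raises the TypeError.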

I manually changed the data types to lowercase during debugging, and the reflection worked without any issues. This suggests that the data type casing is the root cause of the problem.
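The following is a simplified model of that failure, not SQLAlchemy's actual code. It assumes a registry keyed by lowercase type names and a fallback class that accepts no constructor arguments, which is what the warning and the last traceback frame (`col_type(*type_args, **type_kw)`) suggest. `TYPE_REGISTRY`, `build_type`, and the two classes are hypothetical names used only for illustration.

```python
class NullType:
    """Stand-in fallback type; it defines no constructor arguments."""


class Timestamp:
    """Stand-in for a timestamp type with fractional-seconds precision."""

    def __init__(self, fsp=None):
        self.fsp = fsp


# Hypothetical registry keyed by lowercase type names.
TYPE_REGISTRY = {"timestamp": Timestamp}


def build_type(type_name, args, normalize_case=False):
    """Look up a type by name and instantiate it with the parsed arguments."""
    key = type_name.lower() if normalize_case else type_name
    col_type = TYPE_REGISTRY.get(key, NullType)  # unknown names fall back
    return col_type(*args)


# Uppercase name misses the registry, so the fallback receives the argument 3.
try:
    build_type("TIMESTAMP", [3])
except TypeError as exc:
    print(f"lookup without normalization failed: {exc}")

# Lowercasing the name first finds the real type.
ts = build_type("TIMESTAMP", [3], normalize_case=True)
print(type(ts).__name__, ts.fsp)
```

In this model, lowercasing fixes TIMESTAMP but would not help STRING, since no lowercase `string` entry exists in the registry either; that column would still need a type name the registry knows.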

Request: Is there a way to modify Greptime so that it returns the data types in lowercase, or provide an output consistent with what the MySQL dialect in SQLAlchemy expects? This would enhance compatibility and prevent errors during schema reflection.

What did you expect to see?

{'name': 'ts', 'type': TIMESTAMP(fsp=3), 'default': None, 'comment': None, 'nullable': False}
{'name': 'host', 'type': VARCHAR(), 'default': None, 'comment': None, 'nullable': True}
{'name': 'api_path', 'type': VARCHAR(), 'default': None, 'comment': None, 'nullable': True}
{'name': 'log_level', 'type': VARCHAR(), 'default': None, 'comment': None, 'nullable': True}
{'name': 'log', 'type': VARCHAR(), 'default': None, 'comment': None, 'nullable': True}

What did you see instead?

TypeError: NullType() takes no arguments

What operating system did you use?

greptime/greptimedb:v0.9.1

What version of GreptimeDB did you use?

0.9.1

Relevant log output and stack trace

c:\Users\user\Documents\project\others\alchemy-greptime\app.py:43: SAWarning: Did not recognize type 'TIMESTAMP' of column 'ts'
  columns = inspector.get_columns('app_logs', 'supersetexample')
Traceback (most recent call last):
  File "c:\Users\user\Documents\project\others\alchemy-greptime\app.py", line 43, in <module>
    columns = inspector.get_columns('app_logs', 'supersetexample')
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\engine\reflection.py", line 859, in get_columns
    col_defs = self.dialect.get_columns(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 2, in get_columns
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\engine\reflection.py", line 97, in cache
    ret = fn(self, con, *args, **kw)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\dialects\mysql\base.py", line 2966, in get_columns
    parsed_state = self._parsed_state_or_create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\dialects\mysql\base.py", line 3226, in _parsed_state_or_create
    return self._setup_parser(
           ^^^^^^^^^^^^^^^^^^^
  File "<string>", line 2, in _setup_parser
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\engine\reflection.py", line 97, in cache
    ret = fn(self, con, *args, **kw)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\dialects\mysql\base.py", line 3262, in _setup_parser
    return parser.parse(sql, charset)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\dialects\mysql\reflection.py", line 48, in parse
    self._parse_column(line, state)
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\dialects\mysql\reflection.py", line 284, in _parse_column
    type_instance = col_type(*type_args, **type_kw)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: NullType() takes no arguments

evenyag commented 2 months ago

Maybe we can create a PR to address this in SQLAlchemy. I found that the PostgreSQL dialect lowers the string before getting the schema. https://github.com/sqlalchemy/sqlalchemy/blob/6cf5e2a188fc5e337d22a098a5fe9a9fe10cc7e7/lib/sqlalchemy/dialects/postgresql/base.py#L3756

I'm also considering if we can provide a way to display the data type in lowercase.

atul-r commented 2 months ago

> Maybe we can create a PR to address this in SQLAlchemy. I found that the PostgreSQL dialect lowers the string before getting the schema. https://github.com/sqlalchemy/sqlalchemy/blob/6cf5e2a188fc5e337d22a098a5fe9a9fe10cc7e7/lib/sqlalchemy/dialects/postgresql/base.py#L3756
>
> I'm also considering if we can provide a way to display the data type in lowercase.

@evenyag But there are also non-matching data types. For example, there is no STRING data type in MySQL; the closest match is possibly VARCHAR. This needs to come from Greptime.

v0y4g3r commented 2 months ago

> > Maybe we can create a PR to address this in SQLAlchemy. I found that the PostgreSQL dialect lowers the string before getting the schema. https://github.com/sqlalchemy/sqlalchemy/blob/6cf5e2a188fc5e337d22a098a5fe9a9fe10cc7e7/lib/sqlalchemy/dialects/postgresql/base.py#L3756
> >
> > I'm also considering if we can provide a way to display the data type in lowercase.
>
> But there are non matching data types as well. For example STRING, there is no data type of STRING in MySQL. possibly VARCHAR. This needs to come from greptime

Since PostgreSQL does not support SHOW statements, we can format the output to match MySQL.

evenyag commented 2 months ago

Can we add a new method display_mysql_dialect() to the CreateTable and Column? https://github.com/GreptimeTeam/greptimedb/blob/27d9aa0f3be492e4f6e420876193b29f937f6157/src/sql/src/statements/create.rs#L235-L246

We need to map some data types to MySQL's types in Column. https://github.com/GreptimeTeam/greptimedb/blob/27d9aa0f3be492e4f6e420876193b29f937f6157/src/sql/src/statements/create.rs#L135-L148

Another way is to map the data type while creating the Column struct. https://github.com/GreptimeTeam/greptimedb/blob/27d9aa0f3be492e4f6e420876193b29f937f6157/src/query/src/sql/show_create_table.rs#L116-L119
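As a rough illustration of the type mapping discussed in this thread, the sketch below rewrites the type name in a single column line to a lowercase, MySQL-style name. It is written in Python only to show the idea; the real change would live in GreptimeDB's Rust code. The `GREPTIME_TO_MYSQL` table and `mysql_style_column` helper are hypothetical, and only the STRING to varchar entry is suggested by the expected output in the report.

```python
import re

# Assumed mapping; only STRING -> varchar is suggested by this thread.
GREPTIME_TO_MYSQL = {"string": "varchar"}

# Capture the leading "`name` " part and the type name that follows it.
_COLUMN_TYPE = re.compile(r"^(\s*`[^`]+`\s+)(\w+)")


def mysql_style_column(line):
    """Lowercase the column's type name and map it to a MySQL-style name."""

    def replace_type(match):
        type_name = match.group(2).lower()
        return match.group(1) + GREPTIME_TO_MYSQL.get(type_name, type_name)

    # Lines that are not column definitions do not match and pass through.
    return _COLUMN_TYPE.sub(replace_type, line, count=1)


for ddl_line in (
    "  `ts` TIMESTAMP(3) NOT NULL,",
    "  `host` STRING NULL,",
    "  TIME INDEX (`ts`),",
):
    print(mysql_style_column(ddl_line))
```

Whether the mapping is applied when the statement is displayed or when the Column struct is built, the idea is the same: the client should only ever see type names it can look up.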