xzkostyan / clickhouse-sqlalchemy

ClickHouse dialect for SQLAlchemy
https://clickhouse-sqlalchemy.readthedocs.io

Nested maps, tuples, enums don't work #328

Open FraterCRC opened 3 months ago

FraterCRC commented 3 months ago

Describe the bug: Nesting types, for example Tuple(Tuple(...)) or Map(Enum(...), ...), results in an error.

To Reproduce: Create the table below, then try to compile the column type.

    CREATE TABLE color_map (
        id UInt32,
        colors Map(Enum('hello' = 1, 'world' = 2), String)
    ) ENGINE = Memory;

Expected behavior: The type should be Map(Enum, String); instead we get an error.

Versions: 0.2 (the code is still wrong in newer versions), Python 3.10.

FraterCRC commented 3 months ago

I see the problem here:

        elif spec.startswith('Tuple'):
            inner = spec[6:-1]
            coltype = self.ischema_names['_tuple']
            inner_types = [
                self._get_column_type(name, t.strip())
                for t in inner.split(',')
            ]
            return coltype(*inner_types)

        elif spec.startswith('Map'):
            inner = spec[4:-1]
            coltype = self.ischema_names['_map']
            inner_types = [
                self._get_column_type(name, t.strip())
                for t in inner.split(',', 1)
            ]
            return coltype(*inner_types)

This does not work when the type is Tuple(Map(Type, Type), Type), because the inner spec is split on every ',' into ['Map(Type', 'Type)', 'Type'].
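The wrong split can be reproduced with plain string operations, without ClickHouse or the dialect. This is a minimal sketch that mirrors the slicing and splitting in the snippet above; the variable names are illustrative and not taken from the library.

```python
# Inner part of the Tuple spec, as computed by spec[6:-1].
tuple_inner = "Tuple(Map(Type, Type), Type)"[6:-1]
tuple_parts = [t.strip() for t in tuple_inner.split(',')]
print(tuple_parts)  # ['Map(Type', 'Type)', 'Type']

# Inner part of the Map column from the report, as computed by spec[4:-1].
map_inner = "Map(Enum('hello' = 1, 'world' = 2), String)"[4:-1]
map_parts = [t.strip() for t in map_inner.split(',', 1)]
print(map_parts)  # ["Enum('hello' = 1", "'world' = 2), String"]
```

In both cases the first comma belongs to a nested type, so the pieces are not valid type specs on their own.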

FraterCRC commented 3 months ago

I wrote a function to split it and we use it in production.

    @staticmethod
    def _split_inner(inner: str) -> list[str]:
        """Split a type's inner spec on top-level commas only."""
        start = 0
        result_split = []
        num_in_brackets = 0  # current parenthesis nesting depth
        for idx, char in enumerate(inner):
            if char == ',' and not num_in_brackets:
                # A comma outside all brackets separates two inner types.
                result_split.append(inner[start:idx].strip())
                start = idx + 1
            elif char == ')' and num_in_brackets:
                num_in_brackets -= 1
            elif char == '(':
                num_in_brackets += 1
        # Assumes there is something after the last comma.
        result_split.append(inner[start:].strip())
        return result_split
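For anyone who wants to try the idea outside the dialect class, here is a standalone sketch checked against the types from this issue. The name `split_top_level` is hypothetical and not part of the library. The quote handling is an addition that the function above does not have: it assumes Enum member names may contain commas or parentheses, and it does not handle backslash-escaped quotes.

```python
from typing import List


def split_top_level(inner: str) -> List[str]:
    """Split `inner` on commas that are outside parentheses and quotes.

    Hypothetical standalone variant of `_split_inner`; not library code.
    """
    parts = []
    start = 0
    depth = 0          # current parenthesis nesting depth
    in_quote = False   # True while inside a '...' literal (escapes ignored)
    for idx, char in enumerate(inner):
        if char == "'":
            in_quote = not in_quote
        elif in_quote:
            continue  # brackets and commas inside quotes are plain text
        elif char == '(':
            depth += 1
        elif char == ')' and depth:
            depth -= 1
        elif char == ',' and depth == 0:
            parts.append(inner[start:idx].strip())
            start = idx + 1
    # Whatever follows the last top-level comma is the final part.
    parts.append(inner[start:].strip())
    return parts


print(split_top_level("Map(Type, Type), Type"))
# ['Map(Type, Type)', 'Type']
print(split_top_level("Enum('hello' = 1, 'world' = 2), String"))
# ["Enum('hello' = 1, 'world' = 2)", 'String']
```

Each returned part is a complete type spec, so it can be passed on for recursive parsing the same way the original list comprehensions do.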