Infinidat / infi.clickhouse_orm

A Python library for working with the ClickHouse database (https://clickhouse.yandex/)
BSD 3-Clause "New" or "Revised" License

We need to talk about select #109

Open noisywiz opened 5 years ago

noisywiz commented 5 years ago

Can we add the ability to retrieve dictionaries instead of objects? I am fully aware that we are talking about an ORM. But when we work with a large amount of data, we often need to get it quickly. What do you think about a dict factory as an option?

M1ha-Shvn commented 5 years ago

I've also seen this problem with large selects and inserts in https://github.com/carrotquest/django-clickhouse. I've implemented very efficient selects and inserts based on namedtuples. Maybe that could be useful here?

noisywiz commented 5 years ago

Namedtuples are really fast. But a namedtuple is still not a dict)

ishirav commented 5 years ago

While it could be possible to send a query and return the results as a list of dicts, I'm not sure how much faster it would be - you'd still need to parse the TSV-formatted query result, and then convert strings to the appropriate data type for each column (int, float, datetime, etc.).
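To make the two steps mentioned above concrete (parsing the TSV text, then converting each string to its column type), here is a minimal sketch. It assumes the result arrives as tab-separated text with a header row of column names; the `CONVERTERS` mapping and the `rows_as_dicts` helper are hypothetical and not part of infi.clickhouse_orm. It also ignores TSV escape sequences, which a real parser would have to decode.

```python
import csv
import io
from datetime import datetime

# Hypothetical mapping: column name -> function turning a string into a Python value.
CONVERTERS = {
    "id": int,
    "score": float,
    "created": lambda s: datetime.strptime(s, "%Y-%m-%d %H:%M:%S"),
}


def rows_as_dicts(tsv_text, converters):
    """Parse tab-separated text (first line = column names) into a list of dicts."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    # Resolve the converter for each column once, not once per row.
    funcs = [converters.get(name, str) for name in header]
    return [
        {name: func(value) for name, func, value in zip(header, funcs, row)}
        for row in reader
    ]


sample = (
    "id\tscore\tcreated\n"
    "1\t0.5\t2019-02-19 10:00:00\n"
    "2\t1.25\t2019-02-19 10:05:00\n"
)
rows = rows_as_dicts(sample, CONVERTERS)
```

Even in this dict-only form, the per-value conversion still runs for every cell, which is the cost referred to above.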

M1ha-Shvn commented 5 years ago

It would be. The first optimization I did was a multi init for infi models. I got the same result (complete model instances) 5 times faster for 100k records, just by replacing the single-instance init with an optimized multi init. I also parsed CSV and called to_python of the field classes.

M1ha-Shvn commented 5 years ago

https://github.com/Infinidat/infi.clickhouse_orm/issues/109#issuecomment-465164873 namedtuple has a native _asdict() method if you need it. But to my mind tuples are more usable, faster, and more memory-efficient
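For reference, the standard-library method that converts a namedtuple to a dict is `_asdict()`. A small sketch, where the `Row` type and its field names are made up for illustration:

```python
from collections import namedtuple

# Hypothetical row type; in practice the field names would come from the query columns.
Row = namedtuple("Row", ["id", "name"])

row = Row(id=1, name="alice")
as_dict = row._asdict()  # a mapping of field name -> value

# Fields stay accessible by attribute or by position on the tuple itself.
first_field = row[0]
```

The conversion is per row, so it only costs something for the rows where a dict is actually needed.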

ishirav commented 5 years ago

@M1hacka - what do you mean by "multi init for infi models"?

M1ha-Shvn commented 5 years ago

It's not the latest working version, as I don't use it now (it was replaced by tuples). But it may show you what I mean. A single init is slow if it is called too many times. https://github.com/carrotquest/django-clickhouse/commit/6db5f2b5bb97ef4fa645fac62bab8d2d410fdfb8#diff-cc4c4048029ee67adc7b0eaf57cd170bR69
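Without reproducing the linked commit, the general idea of batching initialization can be sketched roughly as follows. This is an assumption about the technique, not the django-clickhouse or infi.clickhouse_orm code: the `Model` class, its `fields` mapping, and `init_many` are all hypothetical. The point is that name validation and converter lookup happen once per batch instead of once per instance.

```python
class Model:
    # Hypothetical field definitions: column name -> converter from string.
    fields = {"id": int, "name": str}

    def __init__(self, **kwargs):
        # Single init: every call validates names and looks up converters again.
        for name, value in kwargs.items():
            if name not in self.fields:
                raise AttributeError(f"unknown field {name!r}")
            setattr(self, name, self.fields[name](value))

    @classmethod
    def init_many(cls, names, rows):
        # Multi init: validate names and resolve converters once for the whole batch.
        converters = []
        for name in names:
            if name not in cls.fields:
                raise AttributeError(f"unknown field {name!r}")
            converters.append(cls.fields[name])

        instances = []
        for row in rows:
            obj = cls.__new__(cls)  # create the instance without running __init__
            obj.__dict__.update(
                (name, conv(value))
                for name, conv, value in zip(names, converters, row)
            )
            instances.append(obj)
        return instances


objs = Model.init_many(["id", "name"], [("1", "a"), ("2", "b")])
single = Model(id="1", name="a")
```

Both paths produce instances with the same attributes; how much time the batch path saves depends on how much work the real `__init__` does per call.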