How to speed up the loop

OpenInformix / IfxPy

Informix native Python driver for scalable enterprise and IoT solutions.

Other

46 stars 22 forks source link

How to speed up the loop #8

Closed JonathanVandal closed 6 years ago

JonathanVandal commented 6 years ago

Hello,

I'm using your snippet :

rc = 0
while dictionary != False:
    rc = rc + 1
    print ("--  Record {0} --".format(rc))
    dictionary = IfxPy.fetch_both(stmt)

It takes exactly 14 seconds to loop on 36137 items. And If I remove dictionary = IfxPy.fetch_both(stmt) the loop is over just after 2 seconds.

Do you have an idea to speed up the loop ? Or this is the best I can get ?

jsagrera commented 6 years ago

Umm....not sure I'm get it. If you remove the fetch_both() from the code, the driver is not going to fetch anything at all from the database.

JonathanVandal commented 6 years ago

Hmm... Yeah but if I remove fetch both it will not working anymore. The difference with your snippet is than I use dictionary = IfxPy.fetch_assoc(stmt) it's ok ?

jsagrera commented 6 years ago

that's was my point. " ...And If I remove dictionary = IfxPy.fetch_both(stmt) the loop is over just after 2 seconds. ..." it will be faster as it doesn't do anything at all.

there is no much difference between fetch_both and fetch_assoc. only the key used to index the resultset.

JonathanVandal commented 6 years ago

So 14 seconds for 36137 items is normal ? I just want to know if I do something wrong

jsagrera commented 6 years ago

well, don't want to be picky but all depends on the table schema, data you have and more important the network (assuming it's going through the network). There are some parameters (driver side) like fetch buffer size which can be used to tune the data transfer: https://www.ibm.com/developerworks/community/blogs/idsteam/entry/fbs?lang=en

JonathanVandal commented 6 years ago

Testing download speed........................................
Download: 65.41 Mbit/s
Testing upload speed..................................................
Upload: 49.64 Mbit/s

I have a good network. And I can't tell you what is wrong with the schema because this informix database is owned by our customers and I'm just on Read Only :/ I can't see the database they give me a SQL query and I execute it

jsagrera commented 6 years ago

I didn't mean that there was something wrong in the schema, what I meant was that a "select first_colum ..." will be a lot faster than selecting from a table with 1000s of columns. The time depends on lots of factors so, without knowing the table and data distribution. we can't really say if 14 seconds is a good or a bad time when getting 32k rows as we don't know how big those rows are.

On the client side you can get some tracing (look at SQLIDEBUG here: https://books.google.co.uk/books?isbn=0738434701 ) which will give you some information in terms of performance.

JonathanVandal commented 6 years ago

Ok I understand. Yeah the SELECTis not really beautiful. There is a TRIMED subquery with lot of table jointure. There are 16 fields in the where condition to join the tables and I'm pretty sure that lot of fields don't have indexes (I don't know if indexes exist in Informix btw). So thank you for you answer I'll keep it like that :)

jsagrera commented 6 years ago

dbaccess is a simple query tool we include with CSDK, you may already have it in your $INFORMIXDIR. You can use that to run SQL and to get information about the database schema (table/indexes/etc). I mentioned SQLIDEBUG before because it has a nice feature ('stmt_stat') which can show who is taking most amount of time, server processing the query, network transferring or client processing. For what you see, maybe the query is not fully optimized (missing indexes) which can be why it takes longer. but 'longer' is quite a vague term ;)