yhat / db.py

db.py is an easier way to interact with your databases
BSD 2-Clause "Simplified" License

Add capability to save and load cache of metadata #81

Closed. rothnic closed this issue 9 years ago.

rothnic commented 9 years ago

This isn't quite ready; I'll let you know when it is. I'm just creating the pull request now to discuss any potential issues with it.

The reasoning is that refreshing the schema can take quite some time on large databases. This adds the option of saving the metadata (it is stored with the credentials, so it is tied to a profile). Then, when you load from a profile, you can tell it to use the cache, and the cached metadata will be used for the first load. You can still refresh the schema manually if desired.

db.save_metadata('my_db')
db = DB(profile='my_db', cache=True)

The cache is created by building dict representations of each of the objects. These are stored as JSON and used to re-instantiate the objects when the cache is enabled.
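
A minimal sketch of that idea, assuming a hypothetical TableMeta class and cache path (the names to_dict, from_dict, save_metadata, and load_metadata here are illustrative, not db.py's actual internals):

    import json
    import os

    def _cache_path(profile):
        # Illustrative cache location tied to a profile name; db.py's real
        # file layout for profiles differs.
        return os.path.join(os.path.expanduser("~"), ".%s.dbpy_cache.json" % profile)

    class TableMeta(object):
        # Simplified stand-in for a table's metadata object; db.py's actual
        # Table object carries more state than this.
        def __init__(self, name, columns, foreign_keys):
            self.name = name
            self.columns = columns            # e.g. [{"name": "id", "type": "int"}, ...]
            self.foreign_keys = foreign_keys  # e.g. [["order_id", "orders", "id"], ...]

        def to_dict(self):
            # Plain-dict representation that json.dump can serialize directly.
            return {"name": self.name,
                    "columns": self.columns,
                    "foreign_keys": self.foreign_keys}

        @classmethod
        def from_dict(cls, d):
            return cls(d["name"], d["columns"], d["foreign_keys"])

    def save_metadata(profile, tables):
        # Dump the dict representation of every table to a JSON file for the profile.
        with open(_cache_path(profile), "w") as f:
            json.dump([t.to_dict() for t in tables], f)

    def load_metadata(profile):
        # Re-instantiate the metadata objects from the cached JSON, skipping
        # the slow schema queries on the first load.
        with open(_cache_path(profile)) as f:
            return [TableMeta.from_dict(d) for d in json.load(f)]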

rothnic commented 9 years ago

After working on this, I'm noticing that the big time driver when refreshing the schema metadata is the foreign/ref key queries, which are performed individually for each table. This takes much longer than running the query a single time (still about 5 s per query on a large db I'm working with) and then filtering the result down to what is relevant to each table. I'll keep working on this issue, but I'm not sure whether it would affect other database types, since I'm basing this on MySQL.
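
A sketch of the single-query approach, assuming MySQL's information_schema (the query text and the fetch_all_foreign_keys helper are illustrative, not db.py's code):

    from collections import defaultdict

    # One query that pulls every foreign key in the schema, rather than
    # hitting information_schema once per table.
    ALL_FOREIGN_KEYS_SQL = """
        SELECT table_name, column_name, referenced_table_name, referenced_column_name
        FROM information_schema.key_column_usage
        WHERE table_schema = %s AND referenced_table_name IS NOT NULL
    """

    def fetch_all_foreign_keys(cursor, schema):
        # Single round trip for the whole schema; group the rows by table so
        # each table can later pick out only the keys that belong to it.
        cursor.execute(ALL_FOREIGN_KEYS_SQL, (schema,))
        keys_by_table = defaultdict(list)
        for table, column, ref_table, ref_column in cursor.fetchall():
            keys_by_table[table].append((column, ref_table, ref_column))
        return keys_by_table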

rothnic commented 9 years ago

OK, I've now added a one-time foreign/ref key query during schema refresh, whose result is then filtered and passed into Table. This reduced the refresh time for my large database from 10 minutes to 10 seconds.
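
A rough sketch of how the refresh loop could consume that single query's result, reusing the fetch_all_foreign_keys helper sketched above (the Table stand-in and its foreign_keys argument are illustrative; db.py's real Table signature differs):

    class Table(object):
        # Minimal stand-in for db.py's Table; the real class also discovers
        # columns and builds the query helpers.
        def __init__(self, name, foreign_keys=None):
            self.name = name
            self.foreign_keys = foreign_keys or []

    def refresh_schema(cursor, schema, table_names):
        # Run the foreign/ref key query once for the whole schema, then hand
        # each table only the rows that belong to it.
        keys_by_table = fetch_all_foreign_keys(cursor, schema)
        return [Table(name, foreign_keys=keys_by_table.get(name, []))
                for name in table_names]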