perseas / Pyrseas

Provides utilities for Postgres database schema versioning.
https://perseas.github.io/
BSD 3-Clause "New" or "Revised" License

Deptrack performance #130

Closed jmafc closed 7 years ago

jmafc commented 9 years ago

The deptrack branch is over 90% complete and only four tests are failing (three functional, and one from dbobject). However, there has been a significant performance impact. Looking at the Travis-CI logs, a typical test run takes about four minutes, two for Python 2.7 and a bit over two for Python 3.4, but the deptrack tests are taking about 24 minutes or more, also split more or less evenly between 2.7 and 3.4. While deptrack should improve the quality of Pyrseas, this roughly sixfold slowdown will need to be investigated and, hopefully, addressed.

One potential culprit for the performance degradation appears to be the split_schema_obj function in pyrseas/dbobject/__init__.py, which is being called multiple times for very simple types like integer and text. What follows is part of the calling sequence that leads to split_schema_obj being called from TypeDict.find in pyrseas/dbobject/dbtype.py.

pyrseas/database.py:483: in to_map
    dbmap.update(self.db.schemas.to_map(self.db, opts))
pyrseas/dbobject/schema.py:324: in to_map
    schemas.update(self[sch].to_map(db, self, opts))
pyrseas/dbobject/schema.py:63: in to_map
    schobjs.append((obj, obj.to_map(db, dbschemas, opts)))
pyrseas/dbobject/table.py:242: in to_map
    tbl = self._base_map(db, opts.no_owner, opts.no_privs)
pyrseas/dbobject/__init__.py:342: in _base_map
    deps -= self.get_implied_deps(db)
pyrseas/dbobject/table.py:499: in get_implied_deps
    type = db.find_type(col.type)
pyrseas/database.py:186: in find_type
    rv = self.types.find(name)
pyrseas/dbobject/dbtype.py:444: in find
    schema, name = split_schema_obj(obj)
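
One way to check how often a function like split_schema_obj is actually hit is to run the calling code under cProfile and read the call count from the collected statistics. The sketch below is only an illustration: split_schema_obj here is a simplified stand-in (not the Pyrseas implementation), and find_types and count_calls are hypothetical helpers invented for the example.

```python
import cProfile
import pstats


def split_schema_obj(obj, sch=None):
    """Simplified stand-in: split 'schema.name' into (schema, name)."""
    if "." in obj:
        schema, name = obj.split(".", 1)
        return schema, name
    # Assumption for this sketch: unqualified names fall back to 'public'
    return sch or "public", obj


def find_types(type_names):
    """Hypothetical caller that resolves every column type by name."""
    return [split_schema_obj(name) for name in type_names]


def count_calls(func_name, target, *args):
    """Run target(*args) under cProfile; return total calls to func_name."""
    profiler = cProfile.Profile()
    profiler.runcall(target, *args)
    stats = pstats.Stats(profiler)
    # stats.stats maps (filename, lineno, funcname) -> (cc, nc, tt, ct, callers)
    return sum(entry[1] for key, entry in stats.stats.items()
               if key[2] == func_name)


column_types = ["integer", "text", "integer", "s1.mytype", "text"]
calls = count_calls("split_schema_obj", find_types, column_types)
print("split_schema_obj calls:", calls)
```

Run against a real to_map invocation instead of the toy find_types, the same count would show whether simple types such as integer and text dominate the calls.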

A possible improvement may be to short-circuit evaluation, perhaps in Table.get_implied_deps, by checking against a list of standard types, e.g., integer, smallint, text, etc. Since it is possible to define types with those names in another schema, it seems find_type and find should also be passed the table's schema as an argument so they can pass it on to split_schema_obj (if the lookup is allowed to get that far).
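
A minimal sketch of that idea, under stated assumptions: STANDARD_TYPES is a hypothetical and deliberately incomplete set, TypeDict and split_schema_obj are toy stand-ins rather than the Pyrseas classes, and the extra sch parameter on find_type and find is the proposed change, not an existing signature. How a same-named type in another schema should really be resolved is left open; here only schema-qualified names bypass the short-circuit.

```python
# Hypothetical, incomplete set of unqualified built-in type names
STANDARD_TYPES = frozenset([
    "integer", "smallint", "bigint", "text", "boolean", "date",
])


def split_schema_obj(obj, sch=None):
    """Simplified stand-in: return (schema, name) for a type name."""
    if "." in obj:
        schema, name = obj.split(".", 1)
        return schema, name
    # Use the caller-supplied schema for unqualified names
    return sch or "public", obj


class TypeDict(dict):
    """Toy type catalog keyed by (schema, name)."""

    def find(self, obj, sch=None):
        schema, name = split_schema_obj(obj, sch)
        return self.get((schema, name))


def find_type(types, name, sch=None):
    """Look up a type, skipping the catalog for unqualified standard types."""
    if name in STANDARD_TYPES:
        return None  # built-in type: nothing to resolve or depend on
    return types.find(name, sch)


types = TypeDict({
    ("s1", "mytype"): "composite type s1.mytype",
    ("s1", "integer"): "domain s1.integer",
})
print(find_type(types, "integer", "s1"))     # short-circuited
print(find_type(types, "mytype", "s1"))      # resolved in the table's schema
print(find_type(types, "s1.integer", "s1"))  # qualified name is still looked up
```

The set membership test is a constant-time check, so the common case of integer or text columns never reaches split_schema_obj at all.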

jmafc commented 9 years ago

@dvarrazzo, Daniele, please look at this and comment as needed. Thanks.

jmafc commented 7 years ago

This has mostly been addressed by commit b124dc17bea39ee51f344ba49ce072ae95070d67 (see the commit summary for timing details).