Closed rutsky closed 8 years ago
Thank you for the suggestion. I'm very interested in your idea. Indeed, running initdb for every test case is wasteful. It is polite but unwelcome behavior.
My short investigation: the following code speeds up the test cases by about 2.3 times:
import testing.postgresql

pgsql = None

def setUpModule():
    global pgsql
    # Run initdb once, without starting the server
    pgsql = testing.postgresql.Postgresql(auto_start=0)
    pgsql.setup()
    testing.postgresql.DEFAULT_SETTINGS['copy_data_from'] = pgsql.base_dir + '/data'  # cache empty database

def tearDownModule():
    testing.postgresql.DEFAULT_SETTINGS['copy_data_from'] = None  # reset
    pgsql.stop()
Result:
(before)$ nosetests
....................
----------------------------------------------------------------------
Ran 20 tests in 59.478s
OK
(after)$ nosetests
....................
----------------------------------------------------------------------
Ran 20 tests in 26.017s
OK
I will add this feature in the next version. Let me think about the API for it.
If you have any ideas for the API, please let me know.
About the API: I suggest adding the following keyword arguments to testing.postgresql.Postgresql: use_initdb_cache=None and initdb_cache_dir=None.
If use_initdb_cache is specified and copy_data_from is not specified, then enable caching of the initdb result.
If initdb_cache_dir is specified, then use it as the cache location. Otherwise use something like the result of appdirs.user_cache_dir('testing.postgresql', 'Takeshi KOMIYA') as the cache directory.
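The precedence described above can be sketched as a small function. This is only an illustration: resolve_initdb_cache is a hypothetical helper, not part of testing.postgresql, and the fallback path is an assumed stand-in for the appdirs call so that the sketch has no third-party dependency.

```python
import os


def resolve_initdb_cache(use_initdb_cache=None, initdb_cache_dir=None,
                         copy_data_from=None):
    """Return the cache directory to use, or None when caching is off.

    Hypothetical helper illustrating the proposed option precedence.
    """
    # An explicit copy_data_from takes priority, so caching stays disabled.
    if not use_initdb_cache or copy_data_from is not None:
        return None
    # An explicit cache location is used as given.
    if initdb_cache_dir is not None:
        return initdb_cache_dir
    # Assumed default location, standing in for appdirs.user_cache_dir(...).
    return os.path.join(os.path.expanduser("~"), ".cache", "testing.postgresql")
```

With this rule, existing callers that already pass copy_data_from see no change in behavior.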
In response to your proposal, I'll add a factory class named testing.postgresql.PostgresqlFactory.
For example:
import unittest
import testing.postgresql

# Generate Postgresql class which caches the generated database
Postgresql = testing.postgresql.PostgresqlFactory(use_initdb_cache=True)

def tearDownModule():
    # clear cached database at end of tests
    Postgresql.clear_cache()

class MyTestCase(unittest.TestCase):
    def setUp(self):
        # Use the generated Postgresql class instead of testing.postgresql.Postgresql
        self.postgresql = Postgresql()

    def tearDown(self):
        self.postgresql.stop()
It makes test cases more efficient.
The factory class supports all options of testing.postgresql.Postgresql.
So you can use the copy_data_from option:
Postgresql = testing.postgresql.PostgresqlFactory(copy_data_from='/path/to/your/appdir')
with Postgresql() as pgsql:
    # ...
This is probably not a replacement for the initdb_cache_dir option you asked for, but I do not want to add any dependencies or rules.
Does this help you?
Yes, your solution will be useful, but can it be used without Postgresql.clear_cache()?
If the cache is cleared on each test suite run, tests will still run a few seconds slower than they could (if the initdb result were cached "permanently" for a specific version of testing.postgresql/PostgreSQL).
> Yes, your solution will be useful, but can it be used without Postgresql.clear_cache()?
The answer is yes and no.
If you use the use_initdb_cache=True option, calling clear_cache() is required (otherwise the cache is cleared automatically by GC).
On the other hand, you do not have to call it if you use other options, because no cache is generated in that case.
I think a cache generated by a library should be removed by the library. It's a basic principle, so I do not want to keep the initdb cache after the script has ended.
If you want to keep the cache beyond a single script run, you should generate the cache database manually and then use the copy_data_from option in the test script.
Fortunately, PostgresqlFactory takes the copy_data_from option and passes it through to each testing.postgresql.Postgresql object.
It might help you when refactoring test cases.
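A rough sketch of that manual approach follows. It assumes an initdb binary is available on PATH and that the cache directory is one you choose yourself; initdb_command, build_persistent_cache and make_factory are hypothetical helper names, and the -U/-A values are only one possible choice for a throwaway test cluster.

```python
import subprocess


def initdb_command(data_dir, initdb="initdb"):
    """Build the initdb invocation for a reusable, empty cluster."""
    # -D sets the data directory; -U and -A are assumed test-friendly defaults.
    return [initdb, "-D", data_dir, "-U", "postgres", "-A", "trust"]


def build_persistent_cache(data_dir):
    """Run initdb once; data_dir is kept between test runs."""
    subprocess.check_call(initdb_command(data_dir))


def make_factory(data_dir):
    """Return a factory whose instances start from a copy of data_dir."""
    # Imported lazily so the helpers above work without the package installed.
    import testing.postgresql
    return testing.postgresql.PostgresqlFactory(copy_data_from=data_dir)
```

The cache directory would be built once, outside the test run, and the test module would then only call make_factory() with the same path. Deleting that directory after a PostgreSQL upgrade stays the user's responsibility.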
Thanks,
Finally, I renamed the options to cache_initialized_db and on_initialized.
The parameter names should also be usable in the other testing.* packages to keep compatibility, and I think initdb is specific to PostgreSQL.
The feature will be released soon. Thank you for the great suggestion. :-)
On Ubuntu 14.04 with PostgreSQL 9.3, database initialization with initdb is the slowest part of testing.postgresql: it takes around 2.5 seconds to create a data directory with default contents.
While it's possible to create a PG data directory outside of testing.postgresql, cache it, copy it for each test, and pass it to testing.postgresql.Postgresql with the copy_data_from argument, that is a lot of work and requires reimplementing some of the testing.postgresql functionality (e.g. the search for the initdb utility).
I propose to implement caching of the initdb result inside testing.postgresql and use it by default.
This can be done in a fairly straightforward way, and I can prepare a PR for this issue if you think my approach is satisfactory:
1. Key the cache by initdb version (initdb -V) and by testing.postgresql version.
2. Add an option to testing.postgresql.Postgresql to disable the cache, e.g. cache=False.
3. In testing.postgresql.Postgresql, if the cache is enabled: check whether a cache for the current version of initdb+testing.postgresql exists; if not, create it and fill it with initdb; copy the cached directory content to a temporary PG data directory.
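The steps above can be sketched with the standard library only. This is an illustration under stated assumptions, not the implementation that was eventually released: cache_key, cached_data_dir and fresh_data_dir are hypothetical names, the run_initdb callback stands in for the real initdb call, and concurrent test runs are not handled.

```python
import hashlib
import os
import shutil
import tempfile


def cache_key(initdb_version, library_version):
    """Derive a directory name from the `initdb -V` output and library version."""
    raw = "{}|{}".format(initdb_version.strip(), library_version)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


def cached_data_dir(cache_root, key, run_initdb):
    """Return the cached data directory, creating it on first use."""
    path = os.path.join(cache_root, key)
    if not os.path.isdir(path):
        os.makedirs(cache_root, exist_ok=True)
        # Fill a staging directory first so a failed initdb leaves no
        # half-written cache entry under the final name.
        staging = tempfile.mkdtemp(dir=cache_root)
        run_initdb(staging)
        os.rename(staging, path)
    return path


def fresh_data_dir(cache_root, key, run_initdb):
    """Copy the cached directory into a new temporary PG data directory."""
    target = os.path.join(tempfile.mkdtemp(), "data")
    shutil.copytree(cached_data_dir(cache_root, key, run_initdb), target)
    return target
```

Because the key includes both versions, upgrading either PostgreSQL or the library produces a new cache entry instead of reusing a stale one.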