etianen / django-watson

Full-text multi-table search application for Django. Easy to install and use, with good performance.
BSD 3-Clause "New" or "Revised" License
1.21k stars 129 forks source link

buildwatson does not add in batches #225

Closed dustinrb closed 6 years ago

dustinrb commented 7 years ago

I am trying to make a large table (>1 million objects) searchable; however, the buildwatson management command times out and no SearchEntries are created.

It looks like Watson is configured to add entries in batches of 100 and _build_save_search_entries is iterating that way, but Postgres tries to add these in one über transaction. If I open a Django shell and run rebuild_index_for_model manually, it works as expected adding entries over time.

I'm suspecting this is an issue with Django forcing management commands to use a single DB transaction, but I'm not sure how to override this.

I'm using django-watson 1.4.2 on Django 1.8.18. I'm running on macOS High Sierra and Moby Linux

dustinrb commented 7 years ago

I missed the transaction.atomic() decorator in buildwatson.py.

Is there a particular reason why this command is setup this way or can the decorator be removed? Would it be beneficial to add an option to enable/disable a single atomic transaction?

etianen commented 7 years ago

How do you mean, "the buildwatson management command times out"?

Database commands do not time out, normall.

On 28 November 2017 at 19:36, Dustin Broderick notifications@github.com wrote:

I missed the transaction.atomic() decorator in buildwatson.py.

Is there a particular reason why this command is setup this way or can the decorator be removed? Would it be beneficial to add an option to enable/disable a single atomic transaction?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/etianen/django-watson/issues/225#issuecomment-347639019, or mute the thread https://github.com/notifications/unsubscribe-auth/AAJFCCCnkKfsEDz5_mcceZ5SFXtPPh8Mks5s7GDSgaJpZM4Qtvql .

dustinrb commented 7 years ago

I cannot remember the exact error, but I tried to build the index twice and Django threw an exception after a few hours of work. I can recreate the exception, but it will take me a few hours tomorrow.

I would expect if items are being processed in batches, they would show up continuously throughout the indexing process.

dustinrb commented 7 years ago

Removing the transaction.atomic() makes buildwatson behave as I would expect, though I'm not finished testing it (I can tell you tomorrow how it went).

https://github.com/dustinrb/django-watson/tree/non-atomic

etianen commented 7 years ago

I don't think that removing the transaction is a good idea.

If the command aborts half way through, you're going to have to run it again anyway. I don't think a transaction makes the command any more likely to fail half way through.

On 28 November 2017 at 20:50, Dustin Broderick notifications@github.com wrote:

Removing the transaction.atomic() makes buildwatson behave as I would expect, though I'm not finished testing it (I can tell you tomorrow how it went).

https://github.com/dustinrb/django-watson/tree/non-atomic http://url

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/etianen/django-watson/issues/225#issuecomment-347659151, or mute the thread https://github.com/notifications/unsubscribe-auth/AAJFCEbHfCbbXMDsQvoeHgLnlxlo0thVks5s7HILgaJpZM4Qtvql .

dustinrb commented 7 years ago

I was not intending to use a partially indexed database for search. I tried indexing the my database with and without the transaction and it failed with a OperationalError: SSL SYSCALL error if I had the transaction enabled (trace included below). It successfully indexed once I disable the transaction.

Error trace with the transaction:

Traceback (most recent call last):
  File "./manage.py", line 14, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/lib/python2.7/site-packages/django/core/management/__init__.py", line 354, in execute_from_command_line
    utility.execute()
  File "/usr/lib/python2.7/site-packages/django/core/management/__init__.py", line 346, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/lib/python2.7/site-packages/django/core/management/base.py", line 394, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/lib/python2.7/site-packages/django/core/management/base.py", line 445, in execute
    output = self.handle(*args, **options)
  File "/usr/lib/python2.7/site-packages/django/utils/decorators.py", line 145, in inner
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/watson/management/commands/buildwatson.py", line 142, in handle
    refreshed_model_count += rebuild_index_for_model(model, engine_slug, verbosity)
  File "/usr/lib/python2.7/site-packages/watson/management/commands/buildwatson.py", line 61, in rebuild_index_for_model
    _bulk_save_search_entries(iter_search_entries())
  File "/usr/lib/python2.7/site-packages/watson/search.py", line 206, in _bulk_save_search_entries
    search_entry_batch = list(islice(search_entries, 0, batch_size))
  File "/usr/lib/python2.7/site-packages/watson/management/commands/buildwatson.py", line 40, in iter_search_entries
    for search_entry in search_engine_._update_obj_index_iter(obj):
  File "/usr/lib/python2.7/site-packages/watson/search.py", line 491, in _update_obj_index_iter
    update_count = search_entries.update(**search_entry_data)
  File "/usr/lib/python2.7/site-packages/django/db/models/query.py", line 563, in update
    rows = query.get_compiler(self.db).execute_sql(CURSOR)
  File "/usr/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 1062, in execute_sql
    cursor = super(SQLUpdateCompiler, self).execute_sql(result_type)
  File "/usr/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 840, in execute_sql
    cursor.execute(sql, params)
  File "/usr/lib/python2.7/site-packages/django/db/backends/utils.py", line 79, in execute
    return super(CursorDebugWrapper, self).execute(sql, params)
  File "/usr/lib/python2.7/site-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python2.7/site-packages/django/db/utils.py", line 98, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/usr/lib/python2.7/site-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
django.db.utils.OperationalError: SSL SYSCALL error: EOF detected
etianen commented 7 years ago

I think you should check your postgres settings. Look for any of the following being set:

statement_timeout idle_in_transaction_session_timeout lock_timeout

Are you using anything like pgbouncer?

Postgres is normally capable of keeping transactions open for months at a time.

On 30 November 2017 at 22:38, Dustin Broderick notifications@github.com wrote:

I was not intending to use a partially indexed database for search. I tried indexing the my database with and without the transaction and it failed with a OperationalError: SSL SYSCALL error if I had the transaction enabled (trace included below). It successfully indexed once I disable the transaction.

Error trace with the transaction:

Traceback (most recent call last): File "./manage.py", line 14, in execute_from_command_line(sys.argv) File "/usr/lib/python2.7/site-packages/django/core/management/init.py", line 354, in execute_from_command_line utility.execute() File "/usr/lib/python2.7/site-packages/django/core/management/init.py", line 346, in execute self.fetch_command(subcommand).run_from_argv(self.argv) File "/usr/lib/python2.7/site-packages/django/core/management/base.py", line 394, in run_from_argv self.execute(*args, cmd_options) File "/usr/lib/python2.7/site-packages/django/core/management/base.py", line 445, in execute output = self.handle(*args, *options) File "/usr/lib/python2.7/site-packages/django/utils/decorators.py", line 145, in inner return func(args, kwargs) File "/usr/lib/python2.7/site-packages/watson/management/commands/buildwatson.py", line 142, in handle refreshed_model_count += rebuild_index_for_model(model, engine_slug, verbosity) File "/usr/lib/python2.7/site-packages/watson/management/commands/buildwatson.py", line 61, in rebuild_index_for_model _bulk_save_search_entries(iter_search_entries()) File "/usr/lib/python2.7/site-packages/watson/search.py", line 206, in _bulk_save_search_entries search_entry_batch = list(islice(search_entries, 0, batch_size)) File "/usr/lib/python2.7/site-packages/watson/management/commands/buildwatson.py", line 40, in iter_search_entries for search_entry in searchengine._update_obj_index_iter(obj): File "/usr/lib/python2.7/site-packages/watson/search.py", line 491, in _update_obj_index_iter update_count = search_entries.update(**search_entry_data) File "/usr/lib/python2.7/site-packages/django/db/models/query.py", line 563, in update rows = query.get_compiler(self.db).execute_sql(CURSOR) File "/usr/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 1062, in execute_sql cursor = super(SQLUpdateCompiler, self).execute_sql(result_type) File "/usr/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 840, in execute_sql cursor.execute(sql, params) File "/usr/lib/python2.7/site-packages/django/db/backends/utils.py", line 79, in execute return super(CursorDebugWrapper, self).execute(sql, params) File "/usr/lib/python2.7/site-packages/django/db/backends/utils.py", line 64, in execute return self.cursor.execute(sql, params) File "/usr/lib/python2.7/site-packages/django/db/utils.py", line 98, in exit six.reraise(dj_exc_type, dj_exc_value, traceback) File "/usr/lib/python2.7/site-packages/django/db/backends/utils.py", line 64, in execute return self.cursor.execute(sql, params) django.db.utils.OperationalError: SSL SYSCALL error: EOF detected

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/etianen/django-watson/issues/225#issuecomment-348343980, or mute the thread https://github.com/notifications/unsubscribe-auth/AAJFCK3rVOgKTwl4I8T7Acy8h38m78eKks5s7y5cgaJpZM4Qtvql .

dustinrb commented 6 years ago

I looked at my PG configuration and statement_timeout, idle_in_transaction_session_timeout, and lock_timeout are all set to 0. I was using a direct connection to the database.

I looked at the usage statistic for the indexing processes while indexing and the RAM usage was steadily ticking up. I didn't get a chance to run until failure, but I suspect that the DB server might be running out of memory and killing the connection.

I created PR #229 which adds the option for a non-atomic buildwatson. I see the value of indexing atomically, but I was only able to get my database to successfully finish after disabling atomicity.

etianen commented 6 years ago

The postgres docs seem to indicate that this doesn't happen...

...however, it seems to be happening to you, as well as various other people on the internet! 😛

...and I've observed similar behavior myself in the past, but assumed I was doing something dumb.

So I've merged your pull request. Thanks for taking the time to argue your position, and convince me!