Hyperledger Aries Verifiable Credential Registry (VCR) is a set of application level software components designed to accelerate the adoption of trustworthy entity to entity communications.
This issue occurred yesterday to two pods (api-indy-cat-7-t87kh and api-indy-cat-7-z8bxq) in devex-von-bc-tob-prod during a window of platform maintenance. It appears the solrqueue instance lost its connection to the database while in the process of indexing credentials and the connection was never re-opened. This triggered an infinite loop where the items where re-queued for later processing, and when later processed the connection error would occur again.
The issue appears to have started around 10:00am. Platform services was alerted to the issue due to the issue around 7:00pm due to the size of the log files. Steven Barre killed the affected pods around 9:00pm, ending the issue. Other API pods have since processed and indexed credentials without issue.
Two things need to occur to address the issue:
Add alerting when items are re-queued due to unexpected exceptions.
Investigate the issue and see if there is a permanent solution.
The following log is an example of the looping error (Log growth for one of the pods was 8.6G in the last 90 minutes):
DEBUG 2020-05-27 02:20:00,041 index 1 140247316465408 Initializing TxnAwareSearchIndex ...
INFO 2020-05-27 02:20:00,041 solrqueue 1 140247316465408 Updating indexes for 1 row(s) from Solr queue: {3108637}
ERROR 2020-05-27 02:20:00,043 solrqueue 1 140247316465408 An unexpected exception was encountered while processing items from the Solr queue.
Traceback (most recent call last):
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/base/base.py", line 235, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/postgresql/base.py", line 223, in create_cursor
cursor = self.connection.cursor()
psycopg2.InterfaceError: connection already closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/indy/vcr_server/utils/solrqueue.py", line 121, in _drain
self.update(last_index, last_using, last_ids)
File "/home/indy/vcr_server/utils/solrqueue.py", line 148, in update
backend.update(index, rows)
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/haystack/backends/solr_backend.py", line 58, in update
for obj in iterable:
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/models/query.py", line 274, in __iter__
self._fetch_all()
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/models/query.py", line 1242, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/models/query.py", line 55, in __iter__
results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1138, in execute_sql
cursor = self.connection.cursor()
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/base/base.py", line 256, in cursor
return self._cursor()
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/base/base.py", line 235, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/base/base.py", line 235, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/postgresql/base.py", line 223, in create_cursor
cursor = self.connection.cursor()
django.db.utils.InterfaceError: connection already closed
INFO 2020-05-27 02:20:00,043 solrqueue 1 140247316465408 Requeueing items for later processing ...
DEBUG 2020-05-27 02:20:00,043 index 1 140247316465408 Initializing TxnAwareSearchIndex ...
INFO 2020-05-27 02:20:00,043 solrqueue 1 140247316465408 Updating indexes for 1 row(s) from Solr queue: {3272093}
ERROR 2020-05-27 02:20:00,046 solrqueue 1 140247316465408 An unexpected exception was encountered while processing items from the Solr queue.
Traceback (most recent call last):
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/base/base.py", line 235, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/postgresql/base.py", line 223, in create_cursor
cursor = self.connection.cursor()
psycopg2.InterfaceError: connection already closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/indy/vcr_server/utils/solrqueue.py", line 121, in _drain
self.update(last_index, last_using, last_ids)
File "/home/indy/vcr_server/utils/solrqueue.py", line 148, in update
backend.update(index, rows)
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/haystack/backends/solr_backend.py", line 58, in update
for obj in iterable:
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/models/query.py", line 274, in __iter__
self._fetch_all()
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/models/query.py", line 1242, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/models/query.py", line 55, in __iter__
results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1138, in execute_sql
cursor = self.connection.cursor()
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/base/base.py", line 256, in cursor
return self._cursor()
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/base/base.py", line 235, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/base/base.py", line 235, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/home/indy/.pyenv/versions/3.6.8/lib/python3.6/site-packages/django/db/backends/postgresql/base.py", line 223, in create_cursor
cursor = self.connection.cursor()
django.db.utils.InterfaceError: connection already closed
INFO 2020-05-27 02:20:00,046 solrqueue 1 140247316465408 Requeueing items for later processing ...
This issue occurred yesterday to two pods (
api-indy-cat-7-t87kh
andapi-indy-cat-7-z8bxq
) indevex-von-bc-tob-prod
during a window of platform maintenance. It appears the solrqueue instance lost its connection to the database while in the process of indexing credentials and the connection was never re-opened. This triggered an infinite loop where the items where re-queued for later processing, and when later processed the connection error would occur again.The issue appears to have started around 10:00am. Platform services was alerted to the issue due to the issue around 7:00pm due to the size of the log files. Steven Barre killed the affected pods around 9:00pm, ending the issue. Other API pods have since processed and indexed credentials without issue.
Two things need to occur to address the issue:
The following log is an example of the looping error (Log growth for one of the pods was 8.6G in the last 90 minutes):