vmware-archive / validation-app-engine

**validation-app-engine** is an agent-based distributed workload resource validation and monitoring engine that lets quality and reliability engineering teams validate their products at large scale.

At scale writing to sqlite db is failing #1

Open vishalaga opened 5 years ago

vishalaga commented 5 years ago

I have around 3K Axon endpoints. Each endpoint is acting as 2 Servers + 750 clients.

I am using a SQLite DB for this. When all 750 clients write to the DB, it fails with the exception below:

May 12 10:00:13 ubuntu-1604 python[21376]: [parameters: ('4ba1263a-b1d8-4b9e-9582-75882b1a76d7', '197.0.85.23', '197.0.51.27', 9876, 9448.629, '(sqlite3.OperationalError) database is locked\n[SQL: INSERT INTO trafficrecord (id, src, dst, port, l', 0, 'UDP', 1557678143.7504, 1)]
May 12 10:00:13 ubuntu-1604 python[21376]: (Background on this error at: http://sqlalche.me/e/e3q8)
May 12 10:00:13 ubuntu-1604 python[21376]: Exception in thread Thread-17516:
May 12 10:00:13 ubuntu-1604 python[21376]: Traceback (most recent call last):
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
May 12 10:00:13 ubuntu-1604 python[21376]: self.run()
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/lib/python2.7/threading.py", line 754, in run
May 12 10:00:13 ubuntu-1604 python[21376]: self.__target(*self.__args, **self.__kwargs)
May 12 10:00:13 ubuntu-1604 python[21376]: File "/opt/axon/traffic/clients/clients.py", line 152, in ping
May 12 10:00:13 ubuntu-1604 python[21376]: self.record(success=False, error=str(e))
May 12 10:00:13 ubuntu-1604 python[21376]: File "/opt/axon/traffic/clients/clients.py", line 140, in record
May 12 10:00:13 ubuntu-1604 python[21376]: self._recorder.record_traffic(record)
May 12 10:00:13 ubuntu-1604 python[21376]: File "/opt/axon/traffic/recorder.py", line 60, in record_traffic
May 12 10:00:13 ubuntu-1604 python[21376]: self._repositery.create_record(_session, record.as_dict())
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
May 12 10:00:13 ubuntu-1604 python[21376]: self.gen.next()
May 12 10:00:13 ubuntu-1604 python[21376]: File "/opt/axon/db/local/__init__.py", line 43, in session_scope
May 12 10:00:13 ubuntu-1604 python[21376]: session.commit()
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/scoping.py", line 162, in do
May 12 10:00:13 ubuntu-1604 python[21376]: return getattr(self.registry(), name)(*args, **kwargs)
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1026, in commit
May 12 10:00:13 ubuntu-1604 python[21376]: self.transaction.commit()
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 493, in commit
May 12 10:00:13 ubuntu-1604 python[21376]: self._prepare_impl()
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 472, in _prepare_impl
May 12 10:00:13 ubuntu-1604 python[21376]: self.session.flush()
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 2451, in flush
May 12 10:00:13 ubuntu-1604 python[21376]: self._flush(objects)
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 2589, in _flush
May 12 10:00:13 ubuntu-1604 python[21376]: transaction.rollback(_capture_exception=True)
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
May 12 10:00:13 ubuntu-1604 python[21376]: compat.reraise(exc_type, exc_value, exc_tb)
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 2549, in _flush
May 12 10:00:13 ubuntu-1604 python[21376]: flush_context.execute()
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
May 12 10:00:13 ubuntu-1604 python[21376]: rec.execute(self)
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
May 12 10:00:13 ubuntu-1604 python[21376]: uow,
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
May 12 10:00:13 ubuntu-1604 python[21376]: insert,
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 1066, in _emit_insert_statements
May 12 10:00:13 ubuntu-1604 python[21376]: c = cached_connections[connection].execute(statement, multiparams)
May 12 10:00:13 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 988, in execute
May 12 10:00:13 ubuntu-1604 python[21376]: return meth(self, multiparams, params)
May 12 10:00:14 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
May 12 10:00:14 ubuntu-1604 python[21376]: return connection._execute_clauseelement(self, multiparams, params)
May 12 10:00:14 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
May 12 10:00:14 ubuntu-1604 python[21376]: distilled_params,
May 12 10:00:14 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
May 12 10:00:14 ubuntu-1604 python[21376]: e, statement, parameters, cursor, context
May 12 10:00:14 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
May 12 10:00:14 ubuntu-1604 python[21376]: util.raise_from_cause(sqlalchemy_exception, exc_info)
May 12 10:00:14 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause
May 12 10:00:14 ubuntu-1604 python[21376]: reraise(type(exception), exception, tb=exc_tb, cause=cause)
May 12 10:00:14 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
May 12 10:00:14 ubuntu-1604 python[21376]: cursor, statement, parameters, context
May 12 10:00:14 ubuntu-1604 python[21376]: File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 552, in do_execute
May 12 10:00:14 ubuntu-1604 python[21376]: cursor.execute(statement, parameters)
May 12 10:00:14 ubuntu-1604 python[21376]: OperationalError: (sqlite3.OperationalError) database is locked
May 12 10:00:14 ubuntu-1604 python[21376]: [SQL: INSERT INTO trafficrecord (id, src, dst, port, latency, error, success, type, created, connected) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
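
SQLite allows only one writer at a time, so many threads committing concurrently can exhaust the busy timeout and raise `database is locked`. One common way around this is to funnel all inserts through a single writer thread fed by a queue. The sketch below illustrates that pattern with the standard-library `sqlite3` module in Python 3. It is not the Axon recorder code: `writer_loop`, `client`, `run_demo`, the batch size, the 30-second timeout, and the trimmed-down `trafficrecord` columns are all assumptions made for illustration.

```python
import os
import queue
import sqlite3
import tempfile
import threading

_STOP = object()  # sentinel that tells the writer thread to exit


def writer_loop(db_path, records, max_batch=500):
    """Own the only write connection and insert queued rows in batches."""
    conn = sqlite3.connect(db_path, timeout=30)  # wait up to 30 s on a lock
    # Hypothetical, simplified schema (the real table has more columns).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trafficrecord "
        "(src TEXT, dst TEXT, port INTEGER, success INTEGER)"
    )
    conn.commit()
    running = True
    while running:
        item = records.get()  # block until a client queues something
        if item is _STOP:
            break
        batch = [item]
        # Group whatever else is already queued into the same transaction.
        while len(batch) < max_batch:
            try:
                nxt = records.get_nowait()
            except queue.Empty:
                break
            if nxt is _STOP:
                running = False
                break
            batch.append(nxt)
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO trafficrecord VALUES (?, ?, ?, ?)", batch
            )
    conn.close()


def client(records, count):
    """Stand-in for a traffic client: it only queues rows, never touches the DB."""
    for _ in range(count):
        records.put(("197.0.85.23", "197.0.51.27", 9876, 1))


def run_demo(num_clients=50, per_client=20):
    """Run many client threads against one writer and return the row count."""
    db_path = os.path.join(tempfile.mkdtemp(), "traffic.db")
    records = queue.Queue()
    writer = threading.Thread(target=writer_loop, args=(db_path, records))
    writer.start()
    clients = [
        threading.Thread(target=client, args=(records, per_client))
        for _ in range(num_clients)
    ]
    for t in clients:
        t.start()
    for t in clients:
        t.join()
    records.put(_STOP)  # all clients are done, so this is the last item
    writer.join()
    conn = sqlite3.connect(db_path)
    (count,) = conn.execute("SELECT COUNT(*) FROM trafficrecord").fetchone()
    conn.close()
    return count
```

Because only the writer thread ever opens a write transaction, client threads never compete for the database lock, and batching keeps the number of commits low. Whether this fits the Axon recorder depends on how `session_scope` is used there, which the traceback alone does not show.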

gitvipin commented 4 years ago

Are we running 2 traffic servers and 750 clients on each endpoint? Can you give a bit more detail on the topology?

Are the endpoints inside namespaces? How many VIFs does each VM have?