RIPE-NCC / rpki-validator-3

RIPE NCC RPKI Validator 3
Other
63 stars 27 forks source link

Database space usage grows up unexpectly #79

Closed d4nys3k closed 5 years ago

d4nys3k commented 5 years ago

After upgrading to version 3.0-397, there's some problem causing to grow rpki-validator.h2.mv.db unexpectly and very quickly after few hours running; where database was purged before. I noted that in #61 already, but I think this is somewhat different issue. Database file has 13 GB in total (from wiki/documentation; 10GB should be sufficient for running RPKI validator).

df-pinpoint=1553241622,1553267122

Before disk space is filled-up, only similar errors are logged repeatedly:

Mar 22 14:30:07 rpki rpki-validator-3.sh: 2019-03-22 14:30:07,650 #011 WARN  #011 quartzScheduler_Worker-4 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 SQL Error: 23505, SQLState: 23505
Mar 22 14:30:07 rpki rpki-validator-3.sh: 2019-03-22 14:30:07,650 #011 ERROR #011 quartzScheduler_Worker-4 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 Unique index or primary key violation: "RPKI_OBJECT__SHA256_IDX ON PUBLIC.RPKI_OBJECT(SHA256) VALUES (X'fd46859469
cdc7d40827f8331e44a823b542392e5ab085d56d660efaa6f0beef', 186429)"; SQL statement:
Mar 22 14:30:07 rpki rpki-validator-3.sh: /* insert net.ripe.rpki.validator3.domain.RpkiObject */ insert into rpki_object (created_at, updated_at, version, authority_key_identifier, last_marked_reachable_at, serial_number, sha256, signing_time, type, id) values (?, ?, ?, 
?, ?, ?, ?, ?, ?, ?) [23505-197]
Mar 22 14:30:07 rpki rpki-validator-3.sh: 2019-03-22 14:30:07,650 #011 ERROR #011 quartzScheduler_Worker-4 #011#011#011 o.h.i.ExceptionMapperStandardImpl #011#011 HHH000346: Error during managed flush [org.hibernate.exception.ConstraintViolationException: could not execut
e statement]

These errors were referenced in #77 as "normal".

After threre's no free space, application crashes.

Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,854 #011 WARN  #011 quartzScheduler_Worker-3 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 SQL Error: 50000, SQLState: HY000
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,854 #011 ERROR #011 quartzScheduler_Worker-3 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 General error: "java.lang.IllegalStateException: This store is closed [1.4.197/4]" [50000-197]
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,860 #011 WARN  #011 quartzScheduler_Worker-7 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 SQL Error: 50000, SQLState: HY000
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,860 #011 ERROR #011 quartzScheduler_Worker-7 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 General error: "java.lang.IllegalStateException: This store is closed [1.4.197/4]"; SQL statement:
Mar 22 14:30:43 rpki rpki-validator-3.sh: /* select rpkiObject
Mar 22 14:30:43 rpki rpki-validator-3.sh: from RpkiObject rpkiObject
Mar 22 14:30:43 rpki rpki-validator-3.sh: where rpkiObject.sha256 = (?1) */ select rpkiobject0_.id as id1_5_, rpkiobject0_.created_at as created_2_5_, rpkiobject0_.updated_at as updated_3_5_, rpkiobject0_.version as version4_5_, rpkiobject0_.authority_key_identifier as authorit5_5_, rpkiobject0_.last_marked_reachable_at as last_mar6_5_, rpkiobject0_.serial_number as serial_n7_5_, rpkiobject0_.sha256 as sha8_5_, rpkiobject0_.signing_time as signing_9_5_, rpkiobject0_.type as type10_5_ from rpki_object rpkiobject0_ where rpkiobject0_.sha256=? [50000-197]
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,861 #011 WARN  #011 quartzScheduler_Worker-10 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 SQL Error: 90030, SQLState: 90030
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,861 #011 ERROR #011 quartzScheduler_Worker-10 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 File corrupted while reading record: "nio:/var/lib/rpki-validator-3/db/rpki-validator.h2.mv.db". Possible solution: use the recovery tool; SQL statement:
Mar 22 14:30:43 rpki rpki-validator-3.sh: /* insert collection row net.ripe.rpki.validator3.domain.CertificateTreeValidationRun.validatedObjects */ insert into validation_run_validated_objects (validation_run_id, rpki_object_id) values (?, ?) [90030-197]
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,862 #011 ERROR #011 quartzScheduler_Worker-3 #011#011#011 o.h.i.ExceptionMapperStandardImpl #011#011 HHH000346: Error during managed flush [org.hibernate.exception.GenericJDBCException: could not execute statement]
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,863 #011 ERROR #011 quartzScheduler_Worker-10 #011#011#011 o.h.i.ExceptionMapperStandardImpl #011#011 HHH000346: Error during managed flush [org.hibernate.exception.GenericJDBCException: could not execute statement]
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,863 #011 WARN  #011 quartzScheduler_Worker-1 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 SQL Error: 50000, SQLState: HY000
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,863 #011 ERROR #011 quartzScheduler_Worker-1 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 General error: "java.lang.IllegalStateException: This store is closed [1.4.197/4]" [50000-197]
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,873 #011 WARN  #011 quartzScheduler_Worker-3 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 SQL Error: 50000, SQLState: HY000
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,873 #011 ERROR #011 quartzScheduler_Worker-3 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 General error: "java.lang.IllegalStateException: This store is closed [1.4.197/4]"; SQL statement:
Mar 22 14:30:43 rpki rpki-validator-3.sh: ROLLBACK [50000-197]
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,878 #011 WARN  #011 quartzScheduler_Worker-7 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 SQL Error: 90098, SQLState: 90098
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,878 #011 ERROR #011 quartzScheduler_Worker-7 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 The database has been closed [90098-197]
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,894 #011 WARN  #011 quartzScheduler_Worker-10 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 SQL Error: 90098, SQLState: 90098
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,895 #011 ERROR #011 quartzScheduler_Worker-10 #011#011#011 o.h.e.j.s.SqlExceptionHelper #011#011 The database has been closed [90098-197]
Mar 22 14:30:43 rpki rpki-validator-3.sh: 2019-03-22 14:30:43,898 #011 ERROR #011 quartzScheduler_Worker-1 #011#011#011 o.s.t.i.TransactionInterceptor #011#011 Application exception overridden by rollback exception
Mar 22 14:30:43 rpki rpki-validator-3.sh: org.springframework.orm.jpa.JpaSystemException: could not execute statement; nested exception is org.hibernate.exception.GenericJDBCException: could not execute statement
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.orm.jpa.vendor.HibernateJpaDialect.convertHibernateAccessException(HibernateJpaDialect.java:312)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.orm.jpa.vendor.HibernateJpaDialect.translateExceptionIfPossible(HibernateJpaDialect.java:223)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.orm.jpa.AbstractEntityManagerFactoryBean.translateExceptionIfPossible(AbstractEntityManagerFactoryBean.java:527)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.dao.support.ChainedPersistenceExceptionTranslator.translateExceptionIfPossible(ChainedPersistenceExceptionTranslator.java:61)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.dao.support.DataAccessUtils.translateIfNecessary(DataAccessUtils.java:242)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:153)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:294)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at net.ripe.rpki.validator3.adapter.jpa.JPAValidationRuns$$EnhancerBySpringCGLIB$$13351c5f.removeOldValidationRuns()
Mar 22 14:30:43 rpki rpki-validator-3.sh: at net.ripe.rpki.validator3.domain.cleanup.ValidationRunCleanupService.cleanupValidationRuns(ValidationRunCleanupService.java:64)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at net.ripe.rpki.validator3.domain.cleanup.ValidationRunCleanupService$$FastClassBySpringCGLIB$$1e07a3db.invoke()
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:746)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:294)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at net.ripe.rpki.validator3.domain.cleanup.ValidationRunCleanupService$$EnhancerBySpringCGLIB$$745f69d6.cleanupValidationRuns()
Mar 22 14:30:43 rpki rpki-validator-3.sh: at net.ripe.rpki.validator3.background.ValidationRunCleanupJob.execute(ValidationRunCleanupJob.java:47)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Mar 22 14:30:43 rpki rpki-validator-3.sh: Caused by: org.hibernate.exception.GenericJDBCException: could not execute statement
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:47)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:111)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:97)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.executeUpdate(ResultSetReturnImpl.java:178)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.hibernate.hql.internal.ast.exec.BasicExecutor.doExecute(BasicExecutor.java:100)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.hibernate.hql.internal.ast.exec.DeleteExecutor.execute(DeleteExecutor.java:106)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.hibernate.hql.internal.ast.QueryTranslatorImpl.executeUpdate(QueryTranslatorImpl.java:450)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.hibernate.engine.query.spi.HQLQueryPlan.performExecuteUpdate(HQLQueryPlan.java:374)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.hibernate.internal.SessionImpl.executeUpdate(SessionImpl.java:1514)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.hibernate.query.internal.AbstractProducedQuery.doExecuteUpdate(AbstractProducedQuery.java:1526)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.hibernate.query.internal.AbstractProducedQuery.executeUpdate(AbstractProducedQuery.java:1504)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at com.querydsl.jpa.impl.JPADeleteClause.execute(JPADeleteClause.java:70)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at net.ripe.rpki.validator3.adapter.jpa.JPAValidationRuns.removeOldCertificateTreeValidationRuns(JPAValidationRuns.java:255)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at net.ripe.rpki.validator3.adapter.jpa.JPAValidationRuns.removeOldValidationRuns(JPAValidationRuns.java:149)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at net.ripe.rpki.validator3.adapter.jpa.JPAValidationRuns$$FastClassBySpringCGLIB$$69d492be.invoke()
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:746)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
Mar 22 14:30:43 rpki rpki-validator-3.sh: at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139)
Mar 22 14:30:43 rpki rpki-validator-3.sh: ... 19 common frames omitted

But this crash is probably caused by lack of disk space.

lolepezy commented 5 years ago

Hi,

This is definitely not normal, it looks like some other kind of error of the underlying database. I would recommend you to stop the validator, erase all the database files and start it again. If you added ARIN TAL to it, just put the TAL file to 'preconfigured-tals' directory, so it will pick it up on the start. If you made any configuration for white-lists or ignore filters, you can export them to a SLURM file and then import back after restart.

We are working on replacing the storage sub-system, so that we don't have these problems anymore, but for now restarting the validator with erasing the database should help.

wibisono commented 5 years ago

Hi,

This latest release 407 should no longer have the issue you mentioned tar ball | centos rpm

We were improving the DB stability and we had tested it as release candidate for a week in our environment. Loading time should also be improved with this version (change log ).

Cheers.

wibisono commented 5 years ago

We finally released version 3.1 which should no longer suffer from database space issue, since we already replace the underlying persistence layer. Feel free to give it a try:

Docker image: https://hub.docker.com/r/ripencc/rpki-validator-3-docker RPM: https://ftp.ripe.net/tools/rpki/validator3/prod/centos7/repo/ Tarball: https://ftp.ripe.net/tools/rpki/validator3/prod/generic/

Change logs:

d4nys3k commented 5 years ago

Tried upgrade to 3.1-2019.07.04_06.41.46; so far no issues with it. We'll monitor this for couple of days.

d4nys3k commented 5 years ago

I believe this is fixed in the 3.1 version. So far no issues with new DB backend.

DuhBatista commented 4 years ago

good afternoon, friends You have this problem of high memory consumption. Installed version 3.1 ..[

usr/bin/java -Xms512m -Xmx1536m -XX:+ExitOnOutOfMemoryError -Dspring.config.location=classpath:/application.properties,file:/etc/rpki-validator-3/