Closed tomikpik closed 1 year ago
Hi, Thank you for reporting this. Could you provide a bit more context on the issue with the following information?
Unfortunately, I am not able to provide exact reproduction steps since it happened on a remotely running production device. However I did explore the logs I have in more depth and it seems that this problem is related to the available disk space.
Please see the attached stacktraces.
2023-09-09T07:13:04.593Z [ERROR] (AwsEventLoop 1) com.aws.greengrass.shadowmanager.ipc.GetThingShadowRequestHandler: handle-get-thing-shadow. Could not process GetThingShadow Request due to internal service error. {thing name=XX, shadow name=DeviceInfo}com.aws.greengrass.shadowmanager.exception.ShadowManagerDataException: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90098 not found) [90098-200] at
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.getShadowThing(ShadowManagerDAOImpl.java:75) at
com.aws.greengrass.shadowmanager.ipc.GetThingShadowRequestHandler.lambda$handleRequest$0(GetThingShadowRequestHandler.java:89) at
com.aws.greengrass.ipc.common.ExceptionUtil.translateExceptions(ExceptionUtil.java:33) at
com.aws.greengrass.shadowmanager.ipc.GetThingShadowRequestHandler.handleRequest(GetThingShadowRequestHandler.java:75) at
com.aws.greengrass.shadowmanager.ipc.GetThingShadowIPCHandler.handleRequest(GetThingShadowIPCHandler.java:82) at
com.aws.greengrass.shadowmanager.ipc.GetThingShadowIPCHandler.handleRequest(GetThingShadowIPCHandler.java:28) at
software.amazon.awssdk.eventstreamrpc.OperationContinuationHandler.onContinuationMessage(OperationContinuationHandler.java:291) at
software.amazon.awssdk.crt.eventstream.ServerConnectionContinuationHandler.onContinuationMessageShim(ServerConnectionContinuationHandler.java:53)
Caused by: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90098 not found) [90098-200] at
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.getJdbcSQLException(DbException.java:622) at
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.getJdbcSQLException(DbException.java:429) at
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.get(DbException.java:194) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Session.getTransaction(Session.java:1792) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Session.getStatementSavepoint(Session.java:1804) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Session.setSavepoint(Session.java:915) at
com.aws.greengrass.shadowmanager.lib.org.h2.command.Command.executeUpdate(Command.java:244) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcConnection.rollbackInternal(JdbcConnection.java:1530) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcConnection.rollback(JdbcConnection.java:561) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcXAConnection.getConnection(JdbcXAConnection.java:106) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcConnectionPool.getConnectionNow(JdbcConnectionPool.java:236) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcConnectionPool.getConnection(JdbcConnectionPool.java:199) at
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.getShadowThing(ShadowManagerDAOImpl.java:62) ... 7 more
Caused by: java.lang.IllegalStateException: Writing to nio:/greengrass/v2/work/aws.greengrass.ShadowManager/shadow.mv.db failed; length 4096 at 131072 [1.4.200/2] at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:950) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.DataUtils.writeFully(DataUtils.java:479) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.FileStore.writeFully(FileStore.java:113) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.write(MVStore.java:1111) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.moveChunk(MVStore.java:1935) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.compactMoveChunks(MVStore.java:1839) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.compactMoveChunks(MVStore.java:1772) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.doMaintenance(MVStore.java:2848) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.writeInBackground(MVStore.java:2788) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore$BackgroundWriterThread.run(MVStore.java:3290)
Caused by: java.io.IOException: No space left on device at
java.base/sun.nio.ch.FileDispatcherImpl.pwrite0(Native Method) at
java.base/sun.nio.ch.FileDispatcherImpl.pwrite(FileDispatcherImpl.java:68) at
java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:130) at
java.base/sun.nio.ch.IOUtil.write(IOUtil.java:97) at
java.base/sun.nio.ch.IOUtil.write(IOUtil.java:67) at
java.base/sun.nio.ch.FileChannelImpl.writeInternal(FileChannelImpl.java:878) at
java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:864) at
com.aws.greengrass.shadowmanager.lib.org.h2.store.fs.FileNio.write(FilePathNio.java:79) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.DataUtils.writeFully(DataUtils.java:475) ... 8 more
2023-09-11T08:46:33.971Z [ERROR] (AwsEventLoop 1) com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowRequestHandler: handle-update-thing-shadow. Could not process UpdateThingShadow Request due to internal service error. {thing name=XX, shadow name=StreamInfo_9}com.aws.greengrass.shadowmanager.exception.ShadowManagerDataException: java.util.concurrent.ExecutionException: com.aws.greengrass.shadowmanager.exception.ShadowManagerDataException: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90020 not found) [90020-200] at
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.executeWriteOperation(ShadowManagerDAOImpl.java:389) at
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.updateShadowThing(ShadowManagerDAOImpl.java:137) at
com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowRequestHandler.lambda$handleRequest$2(UpdateThingShadowRequestHandler.java:215) at
com.aws.greengrass.ipc.common.ExceptionUtil.translateExceptions(ExceptionUtil.java:33) at
com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowRequestHandler.handleRequest(UpdateThingShadowRequestHandler.java:94) at
com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowIPCHandler.handleRequest(UpdateThingShadowIPCHandler.java:83) at
com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowIPCHandler.handleRequest(UpdateThingShadowIPCHandler.java:29) at
software.amazon.awssdk.eventstreamrpc.OperationContinuationHandler.onContinuationMessage(OperationContinuationHandler.java:291) at
software.amazon.awssdk.crt.eventstream.ServerConnectionContinuationHandler.onContinuationMessageShim(ServerConnectionContinuationHandler.java:53)
Caused by: java.util.concurrent.ExecutionException: com.aws.greengrass.shadowmanager.exception.ShadowManagerDataException: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90020 not found) [90020-200] at
java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122) at
java.base/java.util.concurrent.FutureTask.get(FutureTask.java:205) at
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.executeWriteOperation(ShadowManagerDAOImpl.java:383) ... 8 more
Caused by: com.aws.greengrass.shadowmanager.exception.ShadowManagerDataException: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90020 not found) [90020-200] at
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.execute(ShadowManagerDAOImpl.java:376) at
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.lambda$executeWriteOperation$10(ShadowManagerDAOImpl.java:383) at
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at
java.base/java.lang.Thread.run(Thread.java:833)
Caused by: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90020 not found) [90020-200] at
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.getJdbcSQLException(DbException.java:622) at
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.getJdbcSQLException(DbException.java:429) at
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.get(DbException.java:194) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine$Store.convertIllegalStateException(MVTableEngine.java:206) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine$Store.open(MVTableEngine.java:182) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine.init(MVTableEngine.java:103) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.getPageStore(Database.java:2659) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.open(Database.java:675) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.openDatabase(Database.java:307) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.<init>(Database.java:301) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.openSession(Engine.java:74) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.openSession(Engine.java:192) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.createSessionAndValidate(Engine.java:171) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.createSession(Engine.java:166) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.createSession(Engine.java:29) at
com.aws.greengrass.shadowmanager.lib.org.h2.engine.SessionRemote.connectEmbeddedOrServer(SessionRemote.java:340) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:173) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:152) at
com.aws.greengrass.shadowmanager.lib.org.h2.Driver.connect(Driver.java:69) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcDataSource.getJdbcConnection(JdbcDataSource.java:189) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcDataSource.getXAConnection(JdbcDataSource.java:352) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcDataSource.getPooledConnection(JdbcDataSource.java:384) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcConnectionPool.getConnectionNow(JdbcConnectionPool.java:234) at
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcConnectionPool.getConnection(JdbcConnectionPool.java:199) at
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.execute(ShadowManagerDAOImpl.java:371) ... 5 more
Caused by: java.lang.IllegalStateException: The file is locked: nio:/greengrass/v2/work/aws.greengrass.ShadowManager/shadow.mv.db [1.4.200/7] at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:950) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.FileStore.open(FileStore.java:166) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.<init>(MVStore.java:381) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore$Builder.open(MVStore.java:3579) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine$Store.open(MVTableEngine.java:170) ... 25 more
Caused by: java.nio.channels.OverlappingFileLockException at
java.base/sun.nio.ch.FileLockTable.checkList(FileLockTable.java:229) at
java.base/sun.nio.ch.FileLockTable.add(FileLockTable.java:123) at
java.base/sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1321) at
com.aws.greengrass.shadowmanager.lib.org.h2.store.fs.FileNio.tryLock(FilePathNio.java:121) at
java.base/java.nio.channels.FileChannel.tryLock(FileChannel.java:1191) at
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.FileStore.open(FileStore.java:163) ... 28 more
It seems that it is also related to new deployments when new component artifacts are downloaded:
2023-09-08T10:03:34.728Z [ERROR] (pool-2-thread-10) com.aws.greengrass.componentmanager.ComponentManager: Failed to prepare package aws.greengrass.Nucleus-v2.11.2. {}com.aws.greengrass.componentmanager.exceptions.PackageDownloadException: Failed to download artifact name: 'greengrass:a0vWXgWTiRMrB5_oPpp5W163Zz1u1cP_2P9vcBuuU0Y=/aws.greengrass.nucleus.zip' for component aws.greengrass.Nucleus-2.11.2, reason: Error writing artifact at
com.aws.greengrass.componentmanager.builtins.ArtifactDownloader.download(ArtifactDownloader.java:181) at
com.aws.greengrass.componentmanager.builtins.GreengrassRepositoryDownloader.lambda$download$1(GreengrassRepositoryDownloader.java:160) at
com.aws.greengrass.util.RetryUtils.runWithRetry(RetryUtils.java:50) at
com.aws.greengrass.componentmanager.builtins.GreengrassRepositoryDownloader.download(GreengrassRepositoryDownloader.java:145) at
com.aws.greengrass.componentmanager.builtins.ArtifactDownloader.lambda$download$0(ArtifactDownloader.java:134) at
com.aws.greengrass.util.RetryUtils.runWithRetry(RetryUtils.java:50) at
com.aws.greengrass.componentmanager.builtins.ArtifactDownloader.download(ArtifactDownloader.java:132) at
com.aws.greengrass.componentmanager.ComponentManager.prepareArtifacts(ComponentManager.java:453) at
com.aws.greengrass.componentmanager.ComponentManager.preparePackage(ComponentManager.java:399) at
com.aws.greengrass.componentmanager.ComponentManager.lambda$preparePackages$1(ComponentManager.java:360) at
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at
java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.nio.channels.ClosedByInterruptException at
java.base/java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:199) at
java.base/sun.nio.ch.FileChannelImpl.endBlocking(FileChannelImpl.java:171) at
java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:293) at
java.base/java.nio.channels.Channels.writeFullyImpl(Channels.java:74) at
java.base/java.nio.channels.Channels.writeFully(Channels.java:96) at
java.base/java.nio.channels.Channels$1.write(Channels.java:171) at
com.aws.greengrass.componentmanager.builtins.ArtifactDownloader.download(ArtifactDownloader.java:179) ... 13 more
Everything seems to be running fine after docker container recreation. I understand this issue is probably not a bug, but I'd like to know what is the correct recovery mechanism in such case. I understand we should do as much as possible to secure that the device does not run out of disk space, but it can happen and I'd like to be prepared for it.
Questions:
Thank you very much Tomas
Hi,
If there's no space on the disk, shadow manager will fail to update the DB. From the logs you shared, it's unclear how shadow manager got into OverlappingFileLockException
state from no space left IOException
as the logs are missing.
If the shadow manager is in the OverlappingFileLockException
state, it cannot recover itself unless Greengrass is restarted. So, shadow manager needs a fix to prevent itself from entering this state.
Although, deleting the shadow manager DB file will clear up some space, it also removes the lock shadow state. If you still want to delete the db file, you should stop Greengrass, delete the file and then restart the Greengrass. This will allow the shadow manager to recreate the DB.
No response for 2 months so I'm going to close this issue. Please reopen if necessary.
Describe the bug We're getting this error in our application using the Stream manager
This exception is raised in greengrass.log file
it seems that there is some issue with the access to the database file. This exception is repeatedly thrown almost every second. Both files shadow.mv.db and shadow.trace.db are present in the work directory, I can provide them if you're interested.
To Reproduce I don't know how to reproduce since I don't know the root cause.
Expected behavior To work without issues or to have some error recovery mechanism.
Environment
Thank you in advance Tomas