aws-greengrass / aws-greengrass-shadow-manager

A GreengrassV2 Component that provides offline device shadow documents and optional synchronization to the IoT device shadow service.
Apache License 2.0
9 stars 5 forks source link

java.nio.channels.OverlappingFileLockException #190

Closed tomikpik closed 7 months ago

tomikpik commented 9 months ago

Describe the bug We're getting this error in our application using the Stream manager

  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/root/.local/lib/python3.6/site-packages/awsiot/eventstreamrpc.py", line 723, in _on_continuation_message
    raise shape
awsiot.greengrasscoreipc.model.ServiceError

This exception is raised in greengrass.log file

2023-09-11T08:22:28.756Z [ERROR] (AwsEventLoop 1) com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowRequestHandler: handle-update-thing-shadow. Could not process UpdateThingShadow Request due to internal service error. {thing name=XXXX, shadow name=XXXX}
com.aws.greengrass.shadowmanager.exception.ShadowManagerDataException: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90020 not found) [90020-200]
    at com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.getShadowThing(ShadowManagerDAOImpl.java:75)
    at com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowRequestHandler.lambda$handleRequest$2(UpdateThingShadowRequestHandler.java:121)
    at com.aws.greengrass.ipc.common.ExceptionUtil.translateExceptions(ExceptionUtil.java:33)
    at com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowRequestHandler.handleRequest(UpdateThingShadowRequestHandler.java:94)
    at com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowIPCHandler.handleRequest(UpdateThingShadowIPCHandler.java:83)
    at com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowIPCHandler.handleRequest(UpdateThingShadowIPCHandler.java:29)
    at software.amazon.awssdk.eventstreamrpc.OperationContinuationHandler.onContinuationMessage(OperationContinuationHandler.java:291)
    at software.amazon.awssdk.crt.eventstream.ServerConnectionContinuationHandler.onContinuationMessageShim(ServerConnectionContinuationHandler.java:53)
Caused by: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90020 not found) [90020-200]
    at com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.getJdbcSQLException(DbException.java:622)
    at com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.getJdbcSQLException(DbException.java:429)
    at com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.get(DbException.java:194)
    at com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine$Store.convertIllegalStateException(MVTableEngine.java:206)
    at com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine$Store.open(MVTableEngine.java:182)
    at com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine.init(MVTableEngine.java:103)
    at com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.getPageStore(Database.java:2659)
    at com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.open(Database.java:675)
    at com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.openDatabase(Database.java:307)
    at com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.<init>(Database.java:301)
    at com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.openSession(Engine.java:74)
    at com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.openSession(Engine.java:192)
    at com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.createSessionAndValidate(Engine.java:171)
    at com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.createSession(Engine.java:166)
    at com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.createSession(Engine.java:29)
    at com.aws.greengrass.shadowmanager.lib.org.h2.engine.SessionRemote.connectEmbeddedOrServer(SessionRemote.java:340)
    at com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:173)
    at com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:152)
    at com.aws.greengrass.shadowmanager.lib.org.h2.Driver.connect(Driver.java:69)
    at com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcDataSource.getJdbcConnection(JdbcDataSource.java:189)
    at com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcDataSource.getXAConnection(JdbcDataSource.java:352)
    at com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcDataSource.getPooledConnection(JdbcDataSource.java:384)
    at com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcConnectionPool.getConnectionNow(JdbcConnectionPool.java:234)
    at com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcConnectionPool.getConnection(JdbcConnectionPool.java:199)
    at com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.getShadowThing(ShadowManagerDAOImpl.java:62)
    ... 7 more
Caused by: java.lang.IllegalStateException: The file is locked: nio:/greengrass/v2/work/aws.greengrass.ShadowManager/shadow.mv.db [1.4.200/7]
    at com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:950)
    at com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.FileStore.open(FileStore.java:166)
    at com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.<init>(MVStore.java:381)
    at com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore$Builder.open(MVStore.java:3579)
    at com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine$Store.open(MVTableEngine.java:170)
    ... 27 more
Caused by: java.nio.channels.OverlappingFileLockException
    at java.base/sun.nio.ch.FileLockTable.checkList(FileLockTable.java:229)
    at java.base/sun.nio.ch.FileLockTable.add(FileLockTable.java:123)
    at java.base/sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1321)
    at com.aws.greengrass.shadowmanager.lib.org.h2.store.fs.FileNio.tryLock(FilePathNio.java:121)
    at java.base/java.nio.channels.FileChannel.tryLock(FileChannel.java:1191)
    at com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.FileStore.open(FileStore.java:163)
    ... 30 more

it seems that there is some issue with the access to the database file. This exception is repeatedly thrown almost every second. Both files shadow.mv.db and shadow.trace.db are present in the work directory, I can provide them if you're interested.

To Reproduce I don't know how to reproduce since I don't know the root cause.

Expected behavior To work without issues or to have some error recovery mechanism.

Environment

Thank you in advance Tomas

saranyailla commented 9 months ago

Hi, Thank you for reporting this. Could you provide a bit more context on the issue with the following information?

tomikpik commented 9 months ago

Unfortunately, I am not able to provide exact reproduction steps since it happened on a remotely running production device. However I did explore the logs I have in more depth and it seems that this problem is related to the available disk space.

Please see the attached stacktraces.

2023-09-09T07:13:04.593Z [ERROR] (AwsEventLoop 1) com.aws.greengrass.shadowmanager.ipc.GetThingShadowRequestHandler: handle-get-thing-shadow. Could not process GetThingShadow Request due to internal service error. {thing name=XX, shadow name=DeviceInfo}com.aws.greengrass.shadowmanager.exception.ShadowManagerDataException: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90098 not found) [90098-200]  at 
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.getShadowThing(ShadowManagerDAOImpl.java:75)  at 
com.aws.greengrass.shadowmanager.ipc.GetThingShadowRequestHandler.lambda$handleRequest$0(GetThingShadowRequestHandler.java:89)  at 
com.aws.greengrass.ipc.common.ExceptionUtil.translateExceptions(ExceptionUtil.java:33)  at 
com.aws.greengrass.shadowmanager.ipc.GetThingShadowRequestHandler.handleRequest(GetThingShadowRequestHandler.java:75)   at 
com.aws.greengrass.shadowmanager.ipc.GetThingShadowIPCHandler.handleRequest(GetThingShadowIPCHandler.java:82)   at 
com.aws.greengrass.shadowmanager.ipc.GetThingShadowIPCHandler.handleRequest(GetThingShadowIPCHandler.java:28)   at 
software.amazon.awssdk.eventstreamrpc.OperationContinuationHandler.onContinuationMessage(OperationContinuationHandler.java:291) at 
software.amazon.awssdk.crt.eventstream.ServerConnectionContinuationHandler.onContinuationMessageShim(ServerConnectionContinuationHandler.java:53)
Caused by: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90098 not found) [90098-200]   at 
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.getJdbcSQLException(DbException.java:622)   at 
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.getJdbcSQLException(DbException.java:429)   at 
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.get(DbException.java:194)   at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Session.getTransaction(Session.java:1792)    at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Session.getStatementSavepoint(Session.java:1804) at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Session.setSavepoint(Session.java:915)   at 
com.aws.greengrass.shadowmanager.lib.org.h2.command.Command.executeUpdate(Command.java:244) at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcConnection.rollbackInternal(JdbcConnection.java:1530)  at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcConnection.rollback(JdbcConnection.java:561)   at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcXAConnection.getConnection(JdbcXAConnection.java:106) at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcConnectionPool.getConnectionNow(JdbcConnectionPool.java:236)  at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcConnectionPool.getConnection(JdbcConnectionPool.java:199) at 
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.getShadowThing(ShadowManagerDAOImpl.java:62)  ... 7 more
Caused by: java.lang.IllegalStateException: Writing to nio:/greengrass/v2/work/aws.greengrass.ShadowManager/shadow.mv.db failed; length 4096 at 131072 [1.4.200/2]  at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:950)  at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.DataUtils.writeFully(DataUtils.java:479)    at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.FileStore.writeFully(FileStore.java:113)    at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.write(MVStore.java:1111)    at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.moveChunk(MVStore.java:1935)    at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.compactMoveChunks(MVStore.java:1839)    at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.compactMoveChunks(MVStore.java:1772)    at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.doMaintenance(MVStore.java:2848)    at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.writeInBackground(MVStore.java:2788)    at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore$BackgroundWriterThread.run(MVStore.java:3290)
Caused by: java.io.IOException: No space left on device at 
java.base/sun.nio.ch.FileDispatcherImpl.pwrite0(Native Method)  at 
java.base/sun.nio.ch.FileDispatcherImpl.pwrite(FileDispatcherImpl.java:68)  at 
java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:130)  at 
java.base/sun.nio.ch.IOUtil.write(IOUtil.java:97)   at 
java.base/sun.nio.ch.IOUtil.write(IOUtil.java:67)   at 
java.base/sun.nio.ch.FileChannelImpl.writeInternal(FileChannelImpl.java:878)    at 
java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:864)    at 
com.aws.greengrass.shadowmanager.lib.org.h2.store.fs.FileNio.write(FilePathNio.java:79) at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.DataUtils.writeFully(DataUtils.java:475)    ... 8 more
2023-09-11T08:46:33.971Z [ERROR] (AwsEventLoop 1) com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowRequestHandler: handle-update-thing-shadow. Could not process UpdateThingShadow Request due to internal service error. {thing name=XX, shadow name=StreamInfo_9}com.aws.greengrass.shadowmanager.exception.ShadowManagerDataException: java.util.concurrent.ExecutionException: com.aws.greengrass.shadowmanager.exception.ShadowManagerDataException: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90020 not found) [90020-200]        at 
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.executeWriteOperation(ShadowManagerDAOImpl.java:389)        at 
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.updateShadowThing(ShadowManagerDAOImpl.java:137)        at 
com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowRequestHandler.lambda$handleRequest$2(UpdateThingShadowRequestHandler.java:215)        at 
com.aws.greengrass.ipc.common.ExceptionUtil.translateExceptions(ExceptionUtil.java:33)        at 
com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowRequestHandler.handleRequest(UpdateThingShadowRequestHandler.java:94)        at 
com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowIPCHandler.handleRequest(UpdateThingShadowIPCHandler.java:83)        at 
com.aws.greengrass.shadowmanager.ipc.UpdateThingShadowIPCHandler.handleRequest(UpdateThingShadowIPCHandler.java:29)        at 
software.amazon.awssdk.eventstreamrpc.OperationContinuationHandler.onContinuationMessage(OperationContinuationHandler.java:291)        at 
software.amazon.awssdk.crt.eventstream.ServerConnectionContinuationHandler.onContinuationMessageShim(ServerConnectionContinuationHandler.java:53)
Caused by: java.util.concurrent.ExecutionException: com.aws.greengrass.shadowmanager.exception.ShadowManagerDataException: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90020 not found) [90020-200]        at 
java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)        at 
java.base/java.util.concurrent.FutureTask.get(FutureTask.java:205)        at 
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.executeWriteOperation(ShadowManagerDAOImpl.java:383)        ... 8 more
Caused by: com.aws.greengrass.shadowmanager.exception.ShadowManagerDataException: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90020 not found) [90020-200]        at 
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.execute(ShadowManagerDAOImpl.java:376)        at 
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.lambda$executeWriteOperation$10(ShadowManagerDAOImpl.java:383)        at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)        at 
java.base/java.lang.Thread.run(Thread.java:833)
Caused by: com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcSQLNonTransientConnectionException: (Message 90020 not found) [90020-200]        at 
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.getJdbcSQLException(DbException.java:622)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.getJdbcSQLException(DbException.java:429)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.message.DbException.get(DbException.java:194)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine$Store.convertIllegalStateException(MVTableEngine.java:206)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine$Store.open(MVTableEngine.java:182)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine.init(MVTableEngine.java:103)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.getPageStore(Database.java:2659)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.open(Database.java:675)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.openDatabase(Database.java:307)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Database.<init>(Database.java:301)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.openSession(Engine.java:74)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.openSession(Engine.java:192)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.createSessionAndValidate(Engine.java:171)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.createSession(Engine.java:166)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.Engine.createSession(Engine.java:29)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.engine.SessionRemote.connectEmbeddedOrServer(SessionRemote.java:340)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:173)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:152)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.Driver.connect(Driver.java:69)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcDataSource.getJdbcConnection(JdbcDataSource.java:189)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcDataSource.getXAConnection(JdbcDataSource.java:352)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcDataSource.getPooledConnection(JdbcDataSource.java:384)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcConnectionPool.getConnectionNow(JdbcConnectionPool.java:234)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.jdbcx.JdbcConnectionPool.getConnection(JdbcConnectionPool.java:199)        at 
com.aws.greengrass.shadowmanager.ShadowManagerDAOImpl.execute(ShadowManagerDAOImpl.java:371)        ... 5 more
Caused by: java.lang.IllegalStateException: The file is locked: nio:/greengrass/v2/work/aws.greengrass.ShadowManager/shadow.mv.db [1.4.200/7]        at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:950)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.FileStore.open(FileStore.java:166)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore.<init>(MVStore.java:381)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.MVStore$Builder.open(MVStore.java:3579)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.db.MVTableEngine$Store.open(MVTableEngine.java:170)        ... 25 more
Caused by: java.nio.channels.OverlappingFileLockException        at 
java.base/sun.nio.ch.FileLockTable.checkList(FileLockTable.java:229)        at 
java.base/sun.nio.ch.FileLockTable.add(FileLockTable.java:123)        at 
java.base/sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1321)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.store.fs.FileNio.tryLock(FilePathNio.java:121)        at 
java.base/java.nio.channels.FileChannel.tryLock(FileChannel.java:1191)        at 
com.aws.greengrass.shadowmanager.lib.org.h2.mvstore.FileStore.open(FileStore.java:163)        ... 28 more

It seems that it is also related to new deployments when new component artifacts are downloaded:

2023-09-08T10:03:34.728Z [ERROR] (pool-2-thread-10) com.aws.greengrass.componentmanager.ComponentManager: Failed to prepare package aws.greengrass.Nucleus-v2.11.2. {}com.aws.greengrass.componentmanager.exceptions.PackageDownloadException: Failed to download artifact name: 'greengrass:a0vWXgWTiRMrB5_oPpp5W163Zz1u1cP_2P9vcBuuU0Y=/aws.greengrass.nucleus.zip' for component aws.greengrass.Nucleus-2.11.2, reason: Error writing artifact   at 
com.aws.greengrass.componentmanager.builtins.ArtifactDownloader.download(ArtifactDownloader.java:181)   at 
com.aws.greengrass.componentmanager.builtins.GreengrassRepositoryDownloader.lambda$download$1(GreengrassRepositoryDownloader.java:160)  at 
com.aws.greengrass.util.RetryUtils.runWithRetry(RetryUtils.java:50) at 
com.aws.greengrass.componentmanager.builtins.GreengrassRepositoryDownloader.download(GreengrassRepositoryDownloader.java:145)   at 
com.aws.greengrass.componentmanager.builtins.ArtifactDownloader.lambda$download$0(ArtifactDownloader.java:134)  at 
com.aws.greengrass.util.RetryUtils.runWithRetry(RetryUtils.java:50) at 
com.aws.greengrass.componentmanager.builtins.ArtifactDownloader.download(ArtifactDownloader.java:132)   at 
com.aws.greengrass.componentmanager.ComponentManager.prepareArtifacts(ComponentManager.java:453)    at 
com.aws.greengrass.componentmanager.ComponentManager.preparePackage(ComponentManager.java:399)  at 
com.aws.greengrass.componentmanager.ComponentManager.lambda$preparePackages$1(ComponentManager.java:360)    at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)  at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)   at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)   at 
java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.nio.channels.ClosedByInterruptException at 
java.base/java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:199) at 
java.base/sun.nio.ch.FileChannelImpl.endBlocking(FileChannelImpl.java:171)  at 
java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:293)    at 
java.base/java.nio.channels.Channels.writeFullyImpl(Channels.java:74)   at 
java.base/java.nio.channels.Channels.writeFully(Channels.java:96)   at 
java.base/java.nio.channels.Channels$1.write(Channels.java:171) at 
com.aws.greengrass.componentmanager.builtins.ArtifactDownloader.download(ArtifactDownloader.java:179)   ... 13 more

Everything seems to be running fine after docker container recreation. I understand this issue is probably not a bug, but I'd like to know what is the correct recovery mechanism in such case. I understand we should do as much as possible to secure that the device does not run out of disk space, but it can happen and I'd like to be prepared for it.

Questions:

Thank you very much Tomas

saranyailla commented 9 months ago

Hi, If there's no space on the disk, shadow manager will fail to update the DB. From the logs you shared, it's unclear how shadow manager got into OverlappingFileLockException state from no space left IOException as the logs are missing.

If the shadow manager is in the OverlappingFileLockException state, it cannot recover itself unless Greengrass is restarted. So, shadow manager needs a fix to prevent itself from entering this state.

Although, deleting the shadow manager DB file will clear up some space, it also removes the lock shadow state. If you still want to delete the db file, you should stop Greengrass, delete the file and then restart the Greengrass. This will allow the shadow manager to recreate the DB.

MikeDombo commented 7 months ago

No response for 2 months so I'm going to close this issue. Please reopen if necessary.