Description
When the index rollover happens (after 1 year or 75M objects), we end up with more than one index. It appears that once the rollover has taken place, a file can be re-uploaded and created in the second index, resulting in a duplicate.
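To illustrate why a rollover makes this possible, here is a minimal in-memory sketch, not the actual OpenCTI or Elasticsearch code. It assumes the usual alias behaviour where writes go only to the newest index while reads fan out to all of them. The class `RolloverAlias` and the index names `files-000001` / `files-000002` are hypothetical.

```python
class RolloverAlias:
    """Toy model of an alias backed by several indices (hypothetical)."""

    def __init__(self, base_name):
        self.base_name = base_name
        self.indices = [{}]  # one dict per backing index: doc id -> document

    def rollover(self):
        # A rollover adds a fresh, empty write index; older indices stay searchable.
        self.indices.append({})

    def index(self, doc_id, doc):
        # Writes always land in the current write index, so id uniqueness is
        # only enforced inside that one index, not across the whole alias.
        self.indices[-1][doc_id] = doc

    def search_by_id(self, doc_id):
        # Reads through the alias look at every backing index.
        return [
            (f"{self.base_name}-{n:06d}", index[doc_id])
            for n, index in enumerate(self.indices, start=1)
            if doc_id in index
        ]


alias = RolloverAlias("files")
alias.index("import/pending/test.txt.json", {"name": "test.txt.json"})
alias.rollover()
# Re-uploading after the rollover creates a second document with the same id.
alias.index("import/pending/test.txt.json", {"name": "test.txt.json"})
hits = alias.search_by_id("import/pending/test.txt.json")
print(len(hits))  # 2
```

Before the rollover, the second `index` call would simply overwrite the first document; after it, both copies coexist.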
Additional information
Upon importing a file and creating an InternalFile object in the database (file-storage.js upload()), we check whether the file already exists only if the option errorOnExisting is set.
Through the UI the option is correctly set to true, but apparently not in the python-client.
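The effect of the optional check can be sketched as follows. This is a simplified Python model under stated assumptions, not the real `upload()` from file-storage.js: the function names, the `error_on_existing` parameter (a snake_case stand-in for `errorOnExisting`), the exception type, and the list-of-dicts store are all hypothetical.

```python
class FileAlreadyExistsError(Exception):
    """Raised when an upload targets an id that is already stored."""


def exists(indices, file_id):
    # Look through every index, not only the current write index.
    return any(file_id in index for index in indices)


def upload(indices, file_id, content, error_on_existing=False):
    # The existence check runs only when the caller opts in; without the
    # flag the write goes straight to the newest index.
    if error_on_existing and exists(indices, file_id):
        raise FileAlreadyExistsError(file_id)
    indices[-1][file_id] = content
    return file_id


# Two indices after a rollover; the file already lives in the older one.
indices = [{"import/pending/test.txt.json": b"v1"}, {}]

try:
    upload(indices, "import/pending/test.txt.json", b"v2", error_on_existing=True)
    rejected = False
except FileAlreadyExistsError:
    rejected = True

# Same upload with the flag left unset, as the python-client appears to do.
upload(indices, "import/pending/test.txt.json", b"v2")
copies = sum("import/pending/test.txt.json" in index for index in indices)
```

In this model the flagged call is rejected, while the unflagged call silently leaves one copy in each index.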
If you access the file that is duplicated, you get an error:
{
  "category": "APP",
  "errors": [
    {
      "attributes": {
        "genre": "TECHNICAL",
        "hits": 2,
        "http_status": 500,
        "id": "import/pending/test.txt.json"
      },
      "message": "Id loading expect only one response",
      "name": "DATABASE_ERROR",
      "stack": "GraphQLError: Id loading expect only one response\n at error (/opt/opencti/build/src/config/errors.js:7:10)\n at DatabaseError (/opt/opencti/build/src/config/errors.js:57:48)\n at elLoadById (/opt/opencti/build/src/database/engine.js:1416:11)\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\n at internalLoadById (/opt/opencti/build/src/database/middleware-loader.ts:583:10)\n at storeLoadById (/opt/opencti/build/src/database/middleware-loader.ts:590:16)\n at loadFile (/opt/opencti/build/src/database/file-storage.js:258:33)\n at deleteFile (/opt/opencti/build/src/database/file-storage.js:116:14)\n at deleteElement (/opt/opencti/build/src/manager/retentionManager.ts:31:5)\n at executeProcessing (/opt/opencti/build/src/manager/retentionManager.ts:83:7)\n at Object.retentionHandler [as handler] (/opt/opencti/build/src/manager/retentionManager.ts:107:7)\n at cronHandler (/opt/opencti/build/src/manager/managerModule.ts:71:11)\n at /opt/opencti/build/src/manager/managerModule.ts:129:11\n at Ilt.#runHandlerAndScheduleTimeout (/opt/opencti/build/node_modules/set-interval-async/dist/set-interval-async-timer.cjs:36:13)\n at Timeout._onTimeout (/opt/opencti/build/node_modules/set-interval-async/dist/set-interval-async-timer.cjs:29:13)"
    }
  ],
  "id": "import/pending/test.txt.json",
  "level": "error",
  "manager": "RETENTION_MANAGER",
  "message": "Id loading expect only one response",
  "source": "backend",
  "timestamp": "2024-08-30T15:37:08.005Z",
  "version": "6.2.17"
}
Looking at the Elastic DB, we find both versions of the same file, with the same _id.
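One way to spot such duplicates from search results is to group hits by `_id` and keep the ids that appear in more than one index. The helper below is a small sketch that assumes hits are dicts carrying the standard `_index` and `_id` fields; the function name `find_duplicate_ids`, the index names, and the sample data are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_ids(hits):
    """Return {doc id: sorted index names} for ids present in several indices."""
    indices_by_id = defaultdict(set)
    for hit in hits:
        indices_by_id[hit["_id"]].add(hit["_index"])
    return {
        doc_id: sorted(names)
        for doc_id, names in indices_by_id.items()
        if len(names) > 1
    }


# Hypothetical hits, shaped like the entries of a search response.
hits = [
    {"_index": "files-000001", "_id": "import/pending/test.txt.json"},
    {"_index": "files-000002", "_id": "import/pending/test.txt.json"},
    {"_index": "files-000002", "_id": "import/pending/other.txt.json"},
]
duplicates = find_duplicate_ids(hits)
```

Here only `import/pending/test.txt.json` is reported, since it is the only id found in both indices.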