Joystream / joystream

Joystream Monorepo
http://www.joystream.org
GNU General Public License v3.0
[Bug] Object upload requests to the storage-node fail even though the call to `accept_pending_data_objects` in the same request succeeds #4968

Closed zeeshanakram3 closed 11 months ago

zeeshanakram3 commented 11 months ago

Context

        "trace": [
          {
            "file": "/joystream/node_modules/promise-timeout/index.js",
            "method": "TimeoutError",
            "native": false,
            "line": 44,
            "function": "new module.exports.TimeoutError",
            "column": 16
          },
          {
            "file": "/joystream/node_modules/promise-timeout/index.js",
            "method": "timeout",
            "native": false,
            "line": 20,
            "function": "module.exports.timeout",
            "column": 15
          },
          {
            "file": "/joystream/storage-node/lib/services/runtime/extrinsics.js",
            "method": null,
            "native": false,
            "line": 207,
            "function": "extrinsicWrapper",
            "column": 45
          },
          {
            "file": "/joystream/storage-node/lib/services/runtime/extrinsics.js",
            "method": null,
            "native": false,
            "line": 115,
            "function": "acceptPendingDataObjects",
            "column": 18
          },
          {
            "file": "/joystream/storage-node/lib/services/webApi/controllers/filesApi.js",
            "method": null,
            "native": false,
            "line": 99,
            "function": "uploadFile",
            "column": 57
          }
        ],

As the trace above shows, some data object upload requests (endpoint: POST /api/v1/files) to the storage node fail with Error 500: Timeout even though the call to accept_pending_data_objects made in the same request succeeds. The cause is extrinsicWrapper, which is hardcoded with a 25s default timeout. It sends accept_pending_data_objects once the upload to the storage node has completed and the node has verified that the object is valid (e.g. its size and hash match the runtime values).

_The problem is that after sending the accept_pending_data_objects tx, extrinsicWrapper times out after 25s, even though the tx may still be included in a block after the timeout and succeed. The side effect in the runtime has then happened without the storage-node being aware of it; in effect, the storage-node incorrectly assumes that every timed-out tx will eventually fail._
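
The race described above can be reproduced in isolation. The following is a minimal sketch, not the storage-node code: `withTimeout`, `sendFakeExtrinsic`, and `demo` are hypothetical names, the local `TimeoutError` class only stands in for the one thrown by promise-timeout, and millisecond delays replace the 25s timeout and the block inclusion time. It shows that a promise-level timeout rejects the caller without cancelling the underlying operation, so the side effect can still land afterwards.

```javascript
// Hypothetical stand-in for the timeout error seen in the trace.
class TimeoutError extends Error {}

// Rejects after `ms`, but does NOT cancel the wrapped promise.
function withTimeout(promise, ms) {
  let timer
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(`Timed out after ${ms} ms`)), ms)
  })
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer))
}

// Hypothetical transaction that is "included in a block" after `inclusionMs`.
function sendFakeExtrinsic(chainState, inclusionMs) {
  return new Promise((resolve) => {
    setTimeout(() => {
      chainState.accepted = true // the runtime side effect
      resolve('included')
    }, inclusionMs)
  })
}

async function demo(timeoutMs = 20, inclusionMs = 60) {
  const chainState = { accepted: false }
  let callerSawTimeout = false
  try {
    await withTimeout(sendFakeExtrinsic(chainState, inclusionMs), timeoutMs)
  } catch (err) {
    callerSawTimeout = err instanceof TimeoutError
  }
  // Wait until well past the inclusion time, then inspect the "chain".
  await new Promise((resolve) => setTimeout(resolve, inclusionMs * 2))
  return { callerSawTimeout, accepted: chainState.accepted }
}
```

With the default arguments, `demo()` resolves with both `callerSawTimeout` and `accepted` set to `true`: the caller has already handled a failure, yet the state change happened anyway.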

Implications

After acceptPendingDataObjects throws, the catch block of the calling function (uploadFile in the trace above) performs the file cleanup. The major implication of this bug is that if the object was in fact accepted in the runtime, the cleanup step leads to permanent loss of the data object, because the storage-node by design does not allow re-uploading an accepted object, as a measure against denial-of-service attacks.
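
One possible way to avoid the data loss is to treat a timeout as "unknown" rather than "failed" and to check the runtime state before cleaning up. The sketch below is only an illustration under assumptions and not necessarily the fix adopted in the repository: `finalizeUpload` and the injected `acceptOnChain`, `isAcceptedOnChain`, and `deleteLocalFile` callbacks are hypothetical names, and it assumes the timeout error can be recognized by its `name`.

```javascript
// Hypothetical stand-in for the timeout error; recognized by its name below.
class TimeoutError extends Error {
  constructor(message) {
    super(message)
    this.name = 'TimeoutError'
  }
}

// All three callbacks are hypothetical and injected by the caller:
//   acceptOnChain()     sends the accept tx (may reject with a timeout)
//   isAcceptedOnChain() queries whether the runtime marks the object accepted
//   deleteLocalFile()   removes the uploaded file from local storage
async function finalizeUpload({ acceptOnChain, isAcceptedOnChain, deleteLocalFile }) {
  try {
    await acceptOnChain()
    return 'accepted'
  } catch (err) {
    // A timeout does not prove the tx failed, so consult the runtime
    // before deleting anything.
    if (err.name === 'TimeoutError' && (await isAcceptedOnChain())) {
      return 'accepted-after-timeout' // keep the file
    }
    await deleteLocalFile()
    throw err
  }
}
```

A single check right after the timeout can still race with a tx that is pending and gets included later, so a real fix would also have to wait until the tx can no longer be included before deleting the file.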