owncloud / ocis

:atom_symbol: ownCloud Infinite Scale Stack
https://doc.owncloud.com/ocis/next/
Apache License 2.0
1.41k stars 184 forks source link

async propagation logs errors #10522

Open butonic opened 2 weeks ago

butonic commented 2 weeks ago

when the async propagation tries to propagate to a node that was deleted it might log sth like this:

    2024-11-07 15: 06: 03.980   
{
        "level": "error",
        "pid": 1,
        "pid": 1,
        "method": "async.Propagate",
        "spaceid": "535c46b8-aeac-4868-976e-09dd1ad10798",
        "nodeid": "435c16a0-4d70-4a99-b3ce-e2d20c04ad7c",
        "parentid": "",
        "sizediff": 0,
        "error": "node.XattrsWithReader /var/lib/ocis/storage/users/spaces/53/5c46b8-aeac-4868-976e-09dd1ad10798/nodes: no data available",
        "time": "2024-11-07t14:06:03Z",
        "caller": "runtime/asm_amd64.s:1700",
        "message": "Propagation failed. Could not read node."
    }

2024-11-07 15: 06: 00.041   
{
        "level": "error",
        "pid": 1,
        "pid": 1,
        "request-id": "c04e0583a154d766ebf8466f483fdc0e",
        "method": "async.Propagate",
        "spaceid": "535c46b8-aeac-4868-976e-09dd1ad10798",
        "nodeid": "3353ccfd-5605-47c5-843d-74fe881bc442",
        "parentid": "435c16a0-4d70-4a99-b3ce-e2d20c04ad7c",
        "sizediff": 20000000,
        "error": "internal error: Missing parent ID on node",
        "time": "2024-11-07t14:06:00Z",
        "caller": "runtime/asm_amd64.s:1700",
        "message": "Propagation failed. Could not read node."
    }

2024-11-07 15: 06: 00.041   
{
        "level": "error",
        "pid": 1,
        "pid": 1,
        "request-id": "c04e0583a154d766ebf8466f483fdc0e",
        "path": "/var/lib/ocis/storage/users/spaces/53/5c46b8-aeac-4868-976e-09dd1ad10798/nodes/43/5c/16/a0/-4d70-4a99-b3ce-e2d20c04ad7c",
        "nodeid": "435c16a0-4d70-4a99-b3ce-e2d20c04ad7c",
        "attrs": {
            "user.ocis.tmtime": "MjAyNC0xMS0wN1QxNDowNTo1OC43MDk5MjkwNDJa"
        },
        "messagepack": "\ufffd\ufffduser.ocis.tmtime\ufffd\u001e2024-11-07t14:05:58.709929042Z",
        "time": "2024-11-07t14:06:00Z",
        "caller": "github.com/cs3org/reva/v2@v2.26.3/pkg/storage/utils/decomposedfs/tree/propagator/async.go:219",
        "message": "missing parent id"
    }

2024-11-07 15: 05: 57.255   
{
        "level": "info",
        "service": "storage-users",
        "pkg": "rgrpc",
        "traceid": "7aeae87a766637febdaf515cd3bc07a5",
        "method": "async.Propagate",
        "spaceid": "535c46b8-aeac-4868-976e-09dd1ad10798",
        "nodeid": "435c16a0-4d70-4a99-b3ce-e2d20c04ad7c",
        "parentid": "535c46b8-aeac-4868-976e-09dd1ad10798",
        "sizediff": 0,
        "time": "2024-11-07t14:05:57Z",
        "line": "github.com/cs3org/reva/v2@v2.26.3/pkg/storage/utils/decomposedfs/tree/propagator/async.go:414",
        "message": "Propagation done. cleaning up"
    }

It will also write an mpk file that only contains the tmtime:

storageusers-85989ddf75-5kmhm:~$ strings /var/lib/ocis/storage/users/spaces/53/5c46b8-aeac-4868-976e-09dd1ad10798/nodes/43/5c/16/a0/-4d70-4a99-b3ce-e2d20c04ad7c.mpk 
user.ocis.tmtime
2024-11-07T14:05:58.709929042Z

We do lock the node metadata when propagationg, but the delete codepath does not seem to care and just deletes the node. This causes the SetMultiple() implementation to just recreate the file:

    msgBytes, err = os.ReadFile(metaPath)
    subspan.End()
    attribs := map[string][]byte{}
    switch {
    case err != nil:
        if !errors.Is(err, fs.ErrNotExist) {
            return err
        }
    case len(msgBytes) == 0:
        // ugh. an empty file? bail out
        return errors.New("encountered empty metadata file")
    default:
        // only unmarshal if we read data
        err = msgpack.Unmarshal(msgBytes, &attribs)
        if err != nil {
            return err
        }
    }

we need to find a way to either not delete the mpk file if it is locked, or detect that tha file was deleted in the propagation...

The filesystem metadata is not corrupt, but we leave unused data, that accumulates.

cc @aduffeck