owncloud / ocis

⚛️ ownCloud Infinite Scale Stack
https://doc.owncloud.com/ocis/next/
Apache License 2.0

Fail gracefully if the storage is slow #6975

Open butonic opened 1 year ago

butonic commented 1 year ago

When decomposedfs interacts with network filesystems such as NFS, CephFS or GlusterFS, it may run into timeouts or network disconnects. To handle these gracefully, the filesystem calls need a timeout of their own. I propose a dedicated package that extends the signature of os.Stat() and other calls with a context and a timeout parameter. One example would be:

package os

import (
    "context"
    "os"
    "time"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/codes"
    "go.opentelemetry.io/otel/trace"
)

var tracer trace.Tracer

func init() {
    tracer = otel.Tracer("github.com/cs3org/reva/pkg/os")
}

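// Stat returns a FileInfo describing the named file.
// If os.Stat fails, the error will be of type *PathError. If ctx is cancelled
// or the timeout expires first, ctx.Err() is returned instead.
// A timeout <= 0 adds no deadline beyond whatever ctx already carries.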
func Stat(ctx context.Context, timeout time.Duration, name string) (os.FileInfo, error) {
    _, span := tracer.Start(ctx, "os.Stat")
    defer span.End()
    span.SetAttributes(attribute.String("path", name))

    // Channel used to receive the result from os.Stat function
    ch := make(chan os.FileInfo, 1)
    errCh := make(chan error, 1)

    var cancel context.CancelFunc
    if timeout > 0 {
        ctx, cancel = context.WithTimeout(ctx, timeout)
        defer cancel()
    }

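    // Run os.Stat in its own goroutine so the select below can give up on it.
    // Both channels are buffered, so the goroutine never blocks on send, even
    // if the caller has already returned. The channels are deliberately not
    // closed: a receive from a closed channel succeeds immediately with the
    // zero value, which would let the select pick the wrong case.
    go func(name string) {
        fi, err := os.Stat(name)
        if err == nil {
            ch <- fi
        } else {
            errCh <- err
        }
    }(name)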

    select {
    case <-ctx.Done():
        span.SetStatus(codes.Error, ctx.Err().Error())
        return nil, ctx.Err()
    case err := <-errCh:
        span.SetStatus(codes.Error, err.Error())
        return nil, err
    case result := <-ch:
        span.SetStatus(codes.Ok, "")
        return result, nil
    }
}

[... other functions like os.ReadFile, os.Open, os.Mkdir ...]

The wrapper adds tracing and a timeout mechanism. I kept an explicit timeout parameter because the wrapper applies the timeout itself, on top of any deadline the passed-in context already carries.

Related: https://github.com/golang/go/issues/20280

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.