containerd / overlaybd

Overlaybd: a block based remote image format. The storage backend of containerd/accelerated-container-image.
Apache License 2.0

Modify DADI's dadi-p2proxy to support FaaSNet #316

Open qi0523 opened 9 months ago

qi0523 commented 9 months ago

My understanding is as follows.

DADI's p2proxy supports falling back to the source (the image registry). For example, when vm2 sends a layer block request to vm1 and vm1 does not have the block, vm1 can ultimately get the block from the registry.

FaaSNet's P2P works differently. Assume a pipeline registry->vm0->vm1->vm2. When vm2 sends a layer block request to vm1, vm1 responds only after it has downloaded the block from vm0.

So, to support FaaSNet, I used Go's condition variables (sync.Cond) to build a small sync library.

import (
    "sync"
)

type Synclib interface {
    Wait(key string)
    Broadcast(key string)
}

type syCache struct {
    lock         sync.Mutex
    waitMap      map[string]*sync.Cond
    broadcastMap map[string]struct{}
}

// NewSyCache returns an empty Synclib with no waiters and no completed keys.
func NewSyCache() Synclib {
    return &syCache{
        waitMap:      make(map[string]*sync.Cond),
        broadcastMap: make(map[string]struct{}),
    }
}

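// Wait blocks until Broadcast has been called for key.
// key := filepath.Join(r.path, strconv.FormatInt(offset, 10))
func (sc *syCache) Wait(key string) {
    sc.lock.Lock()
    defer sc.lock.Unlock()

    // Re-check the condition after every wake-up, as the sync.Cond docs recommend.
    for {
        if _, ok := sc.broadcastMap[key]; ok {
            return
        }

        cond, ok := sc.waitMap[key]
        if !ok {
            // First waiter for this key: create the shared condition variable.
            cond = sync.NewCond(&sc.lock)
            sc.waitMap[key] = cond
        }

        cond.Wait()
    }
}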

func (sc *syCache) Broadcast(key string) {
    sc.lock.Lock()
    defer sc.lock.Unlock()

    sc.broadcastMap[key] = struct{}{}

    val, ok := sc.waitMap[key]

    if !ok {
        return
    }

    val.Broadcast()

    delete(sc.waitMap, key)
}

When vm1's p2proxy server receives vm2's request and vm1 does not have the layer block, goroutine A (vm1's p2pHandler) calls Wait() and blocks until vm1's fileCacheItem.Fill(fetch) has fetched the block from vm0. After the layer block has been fetched, goroutine B on vm1 calls Broadcast() to wake up goroutine A.

My problem is that creating a container takes too long. Assume the following pipeline:

                      ----> vm2
registry----->vm1
                      ----> vm3

vm1 took 5191 ms to create a container (it commonly takes 2200 ms from the registry), and vm2 and vm3 each took 14520 ms to create the same container. I measured this with: time sudo nerdctl run --snapshotter=overlaybd xxx

I checked /var/log/overlaybd.log and found that opening the device on vm2 and vm3 took ~1700 ms (according to the log output). I have no idea what causes the problem.

qi0523 commented 9 months ago

/etc/overlaybd/overlaybd.json:

{
    "logConfig": {
        "logLevel": 1,
        "logPath": "/var/log/overlaybd.log"
    },
    "cacheConfig": {
        "cacheType": "file",
        "cacheDir": "/opt/overlaybd/registry_cache",
        "cacheSizeGB": 4
    },
    "gzipCacheConfig": {
        "enable": true,
        "cacheDir": "/opt/overlaybd/gzip_cache",
        "cacheSizeGB": 4
    },
    "credentialConfig": {
        "mode": "file",
        "path": "/opt/overlaybd/cred.json"
    },
    "ioEngine": 0,
    "download": {
        "enable": false,
        "delay": 600,
        "delayExtra": 30,
        "maxMBps": 100
    },
    "p2pConfig": {
        "enable": true,
        "address": "localhost:30000/dadip2p"
    },
    "exporterConfig": {
        "enable": false,
        "uriPrefix": "/metrics",
        "port": 9863,
        "updateInterval": 60000000
    },
    "enableAudit": true,
    "auditPath": "/var/log/overlaybd-audit.log"
}
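To double-check which P2P settings a given config file actually carries, the p2pConfig section can be decoded on its own. This is a minimal sketch using only encoding/json; the struct and function names are hypothetical, and only the two fields shown in the config above are modelled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// p2pSettings mirrors only the p2pConfig object from the config above.
type p2pSettings struct {
	Enable  bool   `json:"enable"`
	Address string `json:"address"`
}

// overlaybdConfig models just the part of the file needed here;
// all other keys are ignored by json.Unmarshal.
type overlaybdConfig struct {
	P2P p2pSettings `json:"p2pConfig"`
}

// parseP2P extracts the p2pConfig section from a config document.
func parseP2P(raw []byte) (p2pSettings, error) {
	var cfg overlaybdConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return p2pSettings{}, err
	}
	return cfg.P2P, nil
}

func main() {
	// Inline sample taken from the config above; a real check would
	// read the file contents instead.
	sample := []byte(`{"p2pConfig": {"enable": true, "address": "localhost:30000/dadip2p"}}`)

	p2p, err := parseP2P(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("p2p enabled=%v address=%s\n", p2p.Enable, p2p.Address)
	// prints: p2p enabled=true address=localhost:30000/dadip2p
}
```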