livekit / egress

Export and record WebRTC sessions and tracks
https://blog.livekit.io/livekit-universal-egress-launch/
Apache License 2.0

[BUG] Backup fails across mount points #548

Open j1elo opened 11 months ago

j1elo commented 11 months ago

Describe the bug

The backup feature (backup_storage) fails if the destination path is in a different mount point than the files (e.g. a different Docker bind mount or volume).

This happens because os.Rename, as used here, is just a call to the renameat syscall, which cannot move a file across filesystems (and therefore across mount points).

The error is EXDEV, which Go translates into "invalid cross-device link". See os.Rename between different partitions:

You can't "move" a file on different partitions, because you have to copy bytes around. If you want to move or rename a file, you are talking about a file system just pointing to the same bytes, just from a different location. Thus if you get that error you should just copy and delete the file.

So I guess the error could be caught and a different move method attempted, something like:

// os.Rename returns an *os.LinkError; errors.Is unwraps it to the errno.
if errors.Is(err, syscall.EXDEV) {
    // Retry with a different way of moving, e.g. copy and then delete.
}

But as it is right now, such a move will cause the error posted below, regardless of having correct write permissions on the destination path.

Egress Version: v1.8.0 (https://hub.docker.com/r/livekit/egress/)

Egress Request: N/A

Additional context: /backup_storage/ is a Docker bind mount to the host system, where recording backups should be stored.

Logs

rename /home/egress/tmp/<file.webm> /backup_storage/<file.webm>: invalid cross-device link

a-marchenko commented 9 months ago

We ran into a similar problem.

It seems worth checking the backup folder at application startup, the same way it is already done for the CPU cost settings.

2024-01-26T06:04:44.211Z    DEBUG   egress  sink/file.go:72 removing temporary directory    {"nodeID": "NE_cuiBEQs2wyZS", "handlerID": "EGH_KkLSksr5NuNW", "clusterID": "", "egressID": "EG_Fn6z3ZbmtRYf", "path": "/home/egress/tmp/EG_Fn6z3ZbmtRYf/"}

2024-01-26T06:04:44.276Z    WARN    egress  service/handler.go:229  egress failed   {"nodeID": "NE_cuiBEQs2wyZS", "handlerID": "EGH_KkLSksr5NuNW", "clusterID": "", "egressID": "EG_Fn6z3ZbmtRYf", "egressID": "EG_Fn6z3ZbmtRYf", "requestType": "room_composite", "outputType": "file", "error": "rename /home/egress/tmp/EG_Fn6z3ZbmtRYf/2024-01-26T045922.mp4 /stucked/fa5cc19b-6c7e-4f18-95b6-695fc0bc5b1c/2024-01-26T045922.mp4: no such file or directory"}