peak / s5cmd

Parallel S3 and local filesystem execution tool.
MIT License

unexpected EOF when sync/cp files. #689

Open PDD777 opened 10 months ago

PDD777 commented 10 months ago

I'm copying two directories within the bucket, and on some files I get an "unexpected EOF" error. This was with the sync command, with only --endpoint-url set, since we use Linode and not the very, very pricey AWS.

Thinking it might be the files themselves, I tried a few files individually and got no error, which suggests it's neither the files nor the endpoint.

So I switched to the cp -n -u -s command, and got this today.

Same "unexpected EOF" error on some files, but it also spat out this:
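For context, the invocations looked roughly like this. The endpoint URL, bucket name, and paths are placeholders, not taken from the report; only the flags and the general shape of the commands come from the description above.

```shell
# Hypothetical reconstruction of the reported commands; bucket, endpoint,
# and destination paths are placeholders.

# First attempt: sync a prefix within the bucket against a non-AWS (Linode) endpoint.
s5cmd --endpoint-url https://us-east-1.linodeobjects.com \
  sync 's3://my-bucket/dir1/*' /mnt/nfs/backup/dir1/

# Second attempt with cp: -n (no clobber), -u (copy only if source is newer),
# -s (copy only if sizes differ).
s5cmd --endpoint-url https://us-east-1.linodeobjects.com \
  cp -n -u -s 's3://my-bucket/dir2/*' /mnt/nfs/backup/dir2/
```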

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0xa35623]

goroutine 81859 [running]:
github.com/peak/s5cmd/v2/command.Copy.shouldOverride({0xc0000a1400, 0xc0000a15e0, {0xb8d255, 0x2}, {0xc0000dcfc0, 0x68}, 0x0, 0x1, 0x1, 0x1, ...}, ...)
	/home/runner/work/s5cmd/s5cmd/command/cp.go:850 +0x1a3
github.com/peak/s5cmd/v2/command.Copy.doDownload({0xc0000a1400, 0xc0000a15e0, {0xb8d255, 0x2}, {0xc0000dcfc0, 0x68}, 0x0, 0x1, 0x1, 0x1, ...}, ...)
	/home/runner/work/s5cmd/s5cmd/command/cp.go:613 +0x118
github.com/peak/s5cmd/v2/command.Copy.prepareDownloadTask.func1()
	/home/runner/work/s5cmd/s5cmd/command/cp.go:567 +0xf3
github.com/peak/s5cmd/v2/parallel.(*Manager).Run.func1()
	/home/runner/work/s5cmd/s5cmd/parallel/parallel.go:57 +0x8a
created by github.com/peak/s5cmd/v2/parallel.(*Manager).Run
	/home/runner/work/s5cmd/s5cmd/parallel/parallel.go:53 +0xca

It looks like a memory leak: on large copies the application appears to fail to release allocated memory, and the cp process cannot proceed.

The system we are running this on: RAM: 8GB, CPU cores: 2x2, node: Proxmox VM.

s5cmd version v2.2.2-48f7e59

We are trying to cp 6+ TB / 25,000 objects from S3 to a local NFS store as an offsite backup location.

amoosebitmymom commented 6 months ago

Experiencing the same error. Trying different concurrency flags and different bucket sizes (ranging from 5Gi to 60Gi) didn't help. I couldn't find any evidence of a memory leak.

The command was run through the container image, using the same version as above, with 64 CPU and 128Gi of RAM, on OpenShift.

amoosebitmymom commented 6 months ago


Update: my problem was caused by my object storage provider limiting the number of simultaneous connections to a bucket. When the concurrency rate surpassed that limit, the errors started.
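If a provider-side connection limit is the culprit, capping s5cmd's parallelism is worth trying. A sketch, with illustrative values only (the endpoint, bucket, and paths are placeholders; tune the numbers to your provider's limit):

```shell
# Lower the number of parallel operations via the global --numworkers flag
# (default 256), and the per-object part concurrency via cp's --concurrency
# flag (default 5). Values here are illustrative, not a recommendation.
s5cmd --numworkers 16 --endpoint-url https://us-east-1.linodeobjects.com \
  cp --concurrency 2 's3://my-bucket/dir1/*' /mnt/nfs/backup/dir1/
```

The total simultaneous connections are roughly workers times per-object concurrency, so both knobs matter when staying under a provider's cap.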