Open ermolaev94 opened 2 months ago
@ermolaev94 is it S3-compatible storage (Yandex Cloud or something)? Just curious whether it is something specific to them...
According to the aws docs, it looks like `multipart_chunksize` takes either the size in bytes or else requires a size suffix, so could it be as simple as needing to set `multipart_chunksize = 512MB`?
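For reference, a minimal sketch of what that looks like in the AWS shared config (the standard `~/.aws/config` location; values here are illustrative):

```ini
[default]
# multipart_chunksize accepts a raw byte count (e.g. 536870912)
# or a size suffix: KB, MB, GB, TB
s3 =
    multipart_chunksize = 512MB
```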
> @ermolaev94 is it S3-compatible storage (Yandex Cloud or something)? Just curious whether it is something specific to them...
It's Yandex S3; the single-file limit is 5TB.
> According to the aws docs, it looks like `multipart_chunksize` takes either the size in bytes or else requires a size suffix, so could it be as simple as needing to set `multipart_chunksize = 512MB`?
Hm, thx, I'll try this. I will return with an update ASAP.
I've tried your suggestion and the error is still the same.
My AWS config file is the following:
```ini
[default]
region = ru-central1
s3 =
    multipart_chunksize = 512MB
```
I've generated a huge file with the following command:

```console
$ dd if=/dev/urandom of=large_file.bin bs=1M count=1228800
```

The file is ~1.2TiB; at a 512MiB chunk size that should come to about 2,400 parts (1228800 MiB / 512 MiB), well under the 10,000-part limit.
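A quick sanity check of that arithmetic in plain Python (numbers taken from the commands above):

```python
import math

file_size = 1228800 * 1024**2     # dd wrote 1,228,800 MiB ~= 1.2 TiB
configured_chunk = 512 * 1024**2  # 512 MiB, as set in the config
MAX_PARTS = 10_000                # S3's multipart upload part limit

# Parts needed if the configured chunk size were actually honored
print(math.ceil(file_size / configured_chunk))        # 2400

# Smallest chunk size that keeps this file under the part limit
print(math.ceil(file_size / MAX_PARTS) / 1024**2)     # ~122.9 MiB
```

So if the 512MB setting were reaching the uploader, the part count would be nowhere near the limit.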
Then I ran dvc add & push:

```console
$ dvc add large_file.bin
$ dvc push large_file.bin.dvc
...
Argument partNumber must be an integer between 1 and 10000.: An error occurred (InvalidArgument) when calling the UploadPart operation: Argument partNumber must be an integer between 1 and 10000.
```

and got the same error, which means the upload is still being split into more than 10,000 parts, i.e. the effective chunk size is far smaller than the configured 512MB.
Then I tried to push via the AWS CLI:

```console
$ aws --endpoint-url=https://storage.yandexcloud.net/ s3 cp large_file.bin s3://<bucket-name>/large_file.bin
```

and it works fine.
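For comparison, a minimal boto3 sketch of the same upload with an explicit part size (the bucket name is a placeholder; this bypasses dvc entirely):

```python
import boto3
from boto3.s3.transfer import TransferConfig

# Request 512 MiB parts for the multipart upload
config = TransferConfig(multipart_chunksize=512 * 1024**2)

s3 = boto3.client("s3", endpoint_url="https://storage.yandexcloud.net")
s3.upload_file("large_file.bin", "<bucket-name>", "large_file.bin", Config=config)
```

If this succeeds the same way `aws s3 cp` does, the endpoint handles large multipart uploads fine and the problem is on the dvc-s3 side.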
I suppose `aws s3 cp` does not work the same way as `dvc push` does, but I didn't find the exact upload call in the `dvc-s3` package to reproduce it; a best-guess reproduction is sketched below. Anyway, it looks like there is a bug.
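dvc-s3 is built on s3fs, so the closest standalone reproduction I can think of is an s3fs `put_file` (whether the `chunksize` keyword exists and is honored depends on the installed s3fs version, so treat it as an assumption to verify):

```python
import s3fs

# Same endpoint the aws-cli test above used
fs = s3fs.S3FileSystem(
    client_kwargs={"endpoint_url": "https://storage.yandexcloud.net"}
)

# NOTE: forwarding chunksize to the multipart upload is an assumption
# about the s3fs version; if the part size is stuck at a small default,
# a file this large blows past the 10,000-part cap
fs.put_file(
    "large_file.bin",
    "<bucket-name>/large_file.bin",
    chunksize=512 * 1024**2,
)
```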
Overview
Pushing large files to an S3 bucket leads to the `InvalidArgument` error shown above: partNumber must be an integer between 1 and 10000.
I've tried to fix the situation by setting the chunk size according to the AWS documentation (see the config above).
It does not help. I've tried to debug `dvc-s3` and checked that the argument is read, but it's not clear how it is used. I've noticed that the "s3" config stayed empty, while `self._transfer_config` was updated. The problem starts at around 800GB file size.