Open colemickens opened 4 years ago
Also, just to be sure: does this work with page blobs?
It seems like it only works for block blobs; the page blob type argument is ignored, and the upload is handed off to an SDK path that uploads without trying to skip empty sections: https://github.com/Azure/azure-storage-azcopy/blob/25635976913d156222cffec8ca3693fe6a0afb65/cmd/copy.go#L982
It would be insanely useful for this feature to work correctly with page blobs...
This is the scenario; so far, blobxfer is the only tool I have found that can do it:
+ zstdcat /tmp/nix-shell.LL3PRa/tmp.5NK8mbUwCW/disk.vhd.zstd
+ blobxfer upload --storage-url 'https://job13211.blob.core.windows.net/vhd/20.09.20200729.d3ff247.vhd?se=2020-07-30T10%3A48Z&sp=racwd&sv=2018-11-09&sr=b&sig=REDACTED%3D' --local-path -
2020-07-30 02:48:51.467 DEBUG - credential: account=job13211 endpoint=core.windows.net is_sas=True can_create_containers=False can_list_container_objects=False can_read_object=True can_write_object=True
2020-07-30 02:48:51.469 INFO -
============================================
Azure blobxfer parameters
============================================
blobxfer version: 1.9.4
platform: Linux-5.7.10-x86_64-with-glibc2.2.5
components: CPython=3.8.3-64bit azstor.blob=2.1.0 azstor.file=2.1.0 crypt=2.9.2 req=2.23.0
transfer direction: local -> Azure
workers: disk=16 xfer=32 md5=0 crypto=0
log file: None
dry run: False
resume file: None
timeout: connect=10 read=200 max_retries=1000
mode: StorageModes.Auto
skip on: fs_match=False lmt_ge=False md5=False
delete: extraneous=False only=False
overwrite: True
recursive: True
rename single: False
strip components: 0
access tier: None
chunk size bytes: 0
one shot bytes: 0
store properties: attr=False cc='' ct=<mime> md5=False
rsa public key: None
local source paths: -
============================================
2020-07-30 02:48:51.469 INFO - blobxfer start time: 2020-07-30 02:48:51.469250-07:00
2020-07-30 02:48:51.469 DEBUG - spawning 16 disk threads
2020-07-30 02:48:51.481 DEBUG - spawning 32 transfer threads
2020-07-30 02:48:51.489 DEBUG - 0 files 0.0000 MiB filesize, lmt_ge, or no overwrite skipped
2020-07-30 02:48:51.489 DEBUG - 1 local files processed, waiting for upload completion of approx. 0.0000 MiB
2020-07-30 02:49:45.815 INFO - elapsed upload + verify time and throughput of 48.8281 GiB: 54.329 sec, 7362.6139 Mbps (920.327 MiB/s)
2020-07-30 02:49:45.815 INFO - blobxfer end time: 2020-07-30 02:49:45.815782-07:00 (elapsed: 54.347 sec)
This lets me handle a 100GB image that never has to touch the disk at full size, upload quickly (about a minute, since the blank sections are skipped), and upload from stdin to a page blob successfully.
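For context, that roughly one-minute figure is possible because blobxfer skips all-zero regions instead of uploading them. A minimal sketch of that per-chunk emptiness check (this is not blobxfer's actual code, and the 4 MiB chunk size is arbitrary):

```shell
# Create a 4 MiB all-zero chunk (a stand-in for an empty VHD region), then
# compare it against /dev/zero. A sparse-aware uploader performs this kind
# of check per chunk and skips the upload call entirely for empty chunks.
head -c 4194304 /dev/zero > chunk.bin
if cmp -s chunk.bin <(head -c 4194304 /dev/zero); then
  echo "chunk is all zeros: skip upload"
else
  echo "chunk has data: upload it"
fi
```

Skipping the call loses nothing, because an unwritten range of a page blob already reads back as zeros.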
However, as I often do with Python, I hit a packaging issue with blobxfer, so I'd love to have this functionality available in azcopy. Thanks so much!!
Had same problem, though my compressor was the lowly gzip. The packaging issue for blobxfer seems to have been fixed, but I then ran across https://github.com/Azure/blobxfer/issues/144 when trying to use blobxfer. :/
Gotta love Azure.
Which version of AzCopy was used?
Note: The version is visible when running AzCopy without any argument
I changed how I'm uploading. Azure disks are a bit slow to resize, so I'm uploading pre-sized images. However, the build artifacts are huge, so I store them zstd-compressed to save massive amounts of space. So, I'd like to be able to upload like this:
However, when I do this, my upload duration goes through the roof. It seems like maybe azcopy is no longer intelligently skipping over the empty chunks.
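For reference, a sketch of the kind of pipeline I mean (the destination URL and SAS are placeholders; `--from-to PipeBlob` is azcopy's flag for reading from stdin, and `--blob-type PageBlob` requests a page blob):

```shell
# Decompress the stored artifact and stream it straight into azcopy as a
# page blob, without ever materializing the full-size VHD on disk.
# The destination URL and SAS token are placeholders.
zstdcat disk.vhd.zstd | azcopy copy \
  "https://<account>.blob.core.windows.net/vhd/disk.vhd?<SAS>" \
  --from-to PipeBlob \
  --blob-type PageBlob
```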
Which platform are you using? (ex: Windows, Mac, Linux)
Linux
What command did you run?
Note: Please remove the SAS to avoid exposing your credentials. If you cannot remember the exact command, please retrieve it from the beginning of the log file.
What problem was encountered?
How can we reproduce the problem in the simplest way?
Pipe a huge VHD with huge amounts of blank space into azcopy copy.
Have you found a mitigation/solution?
Extracting to disk and then uploading, but I'd prefer not to.
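To make the repro input cheap, a sparse file gives a huge, almost entirely blank image without consuming real disk space (the 1G size here is arbitrary):

```shell
# A sparse file is logically all zeros but allocates no real disk blocks,
# making it a cheap stand-in for a mostly blank VHD.
truncate -s 1G blank.vhd
du -h --apparent-size blank.vhd   # logical size: 1.0G
du -h blank.vhd                   # blocks actually allocated: ~0
```

Compressing it with zstd yields a tiny artifact, and piping the zstdcat output into azcopy copy with --blob-type PageBlob then reproduces the slow upload.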