huaweicloud / huaweicloud-sdk-python-obs

Apache License 2.0

Download Whole Bucket using sdk #17

Open RamazanBiyik77 opened 2 years ago

RamazanBiyik77 commented 2 years ago

I want to download every object in a bucket. The bucket holds almost 10,000 objects. I tried listing the objects and then downloading them, but a single listing call returns only 1,000 objects. How can I download the whole bucket without using obsutil?

Listing objects: https://support.huaweicloud.com/intl/en-us/api-obs/obs_04_0022.html

Downloading objects: https://support.huaweicloud.com/intl/en-us/sdk-python-devg-obs/obs_22_0913.html
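For context, the 1,000-object ceiling applies per listing request, so the usual workaround is to keep calling the listing API with the marker returned by the previous page. The following is a minimal sketch of that loop, not an official SDK helper: `iter_object_keys` is a hypothetical name, and `client` is assumed to behave like the SDK's `ObsClient`, whose `listObjects` responses expose `status` and a `body` holding `contents`, `is_truncated` and `next_marker`.

```python
def iter_object_keys(client, bucket_name, prefix=None):
    """Yield every object key under `prefix`, following markers across pages.

    Hypothetical helper: `client` is assumed to expose listObjects() the way
    the OBS Python SDK's ObsClient does.
    """
    marker = None
    while True:
        resp = client.listObjects(bucket_name, prefix=prefix, marker=marker)
        if resp.status >= 300:
            raise RuntimeError("listObjects failed with status %s" % resp.status)
        for content in resp.body["contents"]:
            yield content["key"]
        # is_truncated is False once the last page has been returned
        if not resp.body["is_truncated"]:
            break
        # Continue the next request from where this page stopped
        marker = resp.body["next_marker"]
```

With a real client this would be driven as `for key in iter_object_keys(obsClient, bucket_name): ...`, downloading each key inside the loop.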

liqiuqiu111 commented 6 months ago

Refer to the demo: you can download every object in the bucket the same way you would download a folder.

    import os

    from obs import ObsClient

    # AK, SK and ENDPOINT are placeholders for your own credentials and endpoint
    obsClient = ObsClient(access_key_id=AK, secret_access_key=SK, server=ENDPOINT)

    bucket_name = "your-bucket-name"
    # An empty prefix matches every object, i.e. the whole bucket
    remote_prefix = "remote_prefix"
    local_folder = r"your/local/path"
    page = 1
    failed_list = []

    prefix_length = len(remote_prefix)

    object_list = obsClient.listObjects(bucket_name, prefix=remote_prefix, encoding_type="url")

    while True:
        print("Start to download page %s" % page)
        page += 1
        for obs_object in object_list.body["contents"]:
            object_key = obs_object["key"]
            # Convert the OBS object name into a local path; lstrip drops the
            # separator after the prefix and also works when the prefix is empty
            relative_key = object_key[prefix_length:].lstrip("/")
            download_file_path = os.path.join(local_folder, relative_key.replace("/", os.sep))
            print("Start to download object [%s] to [%s]" % (object_key, download_file_path))
            try:
                obsClient.downloadFile(bucket_name, object_key, taskNum=10,
                                       downloadFile=download_file_path)
            except Exception as e:
                print("Failed to download %s: %s" % (object_key, e))
                failed_list.append(object_key)

        # If is_truncated is False, everything has been listed and nothing remains
        if not object_list.body["is_truncated"]:
            break
        # Use the next_marker from the previous response as the marker for the next listing
        object_list = obsClient.listObjects(bucket_name, prefix=remote_prefix,
                                            encoding_type="url", marker=object_list.body["next_marker"])

    for i in failed_list:
        print("Failed to download %s, please try again" % i)
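The snippet leaves retrying the failed keys to the reader. One possible way to automate that step is sketched below; `retry_failed` and its parameters are hypothetical names, not part of the SDK, and `download_one` is assumed to be any callable that downloads a single key and raises an exception on failure (for example a small wrapper around `obsClient.downloadFile`).

```python
import time


def retry_failed(download_one, failed_keys, attempts=3, delay_seconds=0.0):
    """Retry each key up to `attempts` times; return the keys that still fail.

    Hypothetical helper: `download_one(key)` must raise on failure.
    """
    still_failed = []
    for key in failed_keys:
        for attempt in range(1, attempts + 1):
            try:
                download_one(key)
                break  # success, move on to the next key
            except Exception as exc:
                print("Attempt %d for %s failed: %s" % (attempt, key, exc))
                if attempt < attempts:
                    time.sleep(delay_seconds)
        else:
            # The inner loop never hit `break`, so every attempt failed
            still_failed.append(key)
    return still_failed
```

Called as `failed_list = retry_failed(download_one, failed_list)`, it leaves only the keys that failed on every attempt, which can then be reported as before.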