fsspec / universal_pathlib

pathlib api extended to use fsspec backends
MIT License

Copy UPath contents #175

Closed chrisflesher closed 4 months ago

chrisflesher commented 4 months ago

Is there a convenient way to copy data from one UPath to another? I tried using shutil.copy but couldn't figure out how to get it working. I ended up writing custom code, but I was unsure what chunk_size to use per write. Does a good value depend on the storage type? For cloud storage, for example, is 1460 a good value because of the MTU?

import upath

def copy(source: upath.UPath, target: upath.UPath, chunk_size: int = 1024) -> None:
    # Open the source first so a missing source fails before the target is created.
    with source.open('rb') as source_file, target.open('wb') as target_file:
        while True:
            data = source_file.read(chunk_size)
            if not data:  # an empty read means end of file
                break
            target_file.write(data)

ap-- commented 4 months ago

Hi @chrisflesher

Is there any convenient way to copy data from one UPath to another?

Not yet. ~To be able to use shutil.copyfile we first need #145 to be fixed.~

You can use shutil.copyfileobj on the opened file objects:

>>> import upath
>>> f_in = upath.UPath("file:///tmp/somefile.txt")
>>> f_out = upath.UPath("memory:///output.txt")
>>> import shutil
>>> with f_in.open("rb") as f0, f_out.open("wb") as f1:
...     shutil.copyfileobj(f0, f1)
... 
>>> f_out.read_text()
'hello world\n'
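
The same pattern can be wrapped in a small helper. The following is only a sketch: `copy_path` and its `buffer_size` parameter are hypothetical names, not part of universal_pathlib. It relies only on both arguments having a pathlib-style `open()` method, so it is demonstrated here with local `pathlib.Path` objects; since UPath extends the pathlib API, the expectation is that UPath instances could be passed in the same way.

```python
import shutil
import tempfile
from pathlib import Path


def copy_path(source, target, buffer_size=1024 * 1024):
    """Copy bytes from source to target (hypothetical helper).

    Both arguments only need a pathlib-style open() method.
    """
    # Open the source first so a missing source fails before the target is created.
    with source.open("rb") as src, target.open("wb") as dst:
        # The third argument of copyfileobj is the size of each read, in bytes.
        shutil.copyfileobj(src, dst, buffer_size)


# Demo with local files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src_path = Path(tmp) / "somefile.txt"
    dst_path = Path(tmp) / "output.txt"
    src_path.write_text("hello world\n")
    copy_path(src_path, dst_path)
    copied_text = dst_path.read_text()
```

Passing the buffer size explicitly keeps the chunking decision in one place instead of hard-coding it in a read loop.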

I ended up writing custom code but was unsure what chunk_size to use per write? Does a good value for this depend on the storage type? Like for cloud storage maybe 1460 is a good value because of MTU?

You can get the default blocksize for each filesystem via:

>>> import upath
>>> pth = upath.UPath("s3://bucket/somefile.txt")
>>> pth.fs.blocksize
4194304
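
One way to use that value is to look it up at copy time and fall back to a fixed size when it is not available. This is a sketch under assumptions: `pick_chunk_size` and `DEFAULT_CHUNK_SIZE` are hypothetical names, and the stand-in object below only imitates the `.fs.blocksize` attribute shown above so that the snippet runs without any remote filesystem.

```python
from pathlib import Path
from types import SimpleNamespace

DEFAULT_CHUNK_SIZE = 4 * 1024 * 1024  # hypothetical fallback of 4 MiB


def pick_chunk_size(path, fallback=DEFAULT_CHUNK_SIZE):
    """Return path.fs.blocksize when it is a positive int, else the fallback."""
    fs = getattr(path, "fs", None)  # a plain pathlib.Path has no .fs attribute
    blocksize = getattr(fs, "blocksize", None)
    if isinstance(blocksize, int) and blocksize > 0:
        return blocksize
    return fallback


# Stand-in for a UPath whose filesystem reports a 1 MiB blocksize.
fake_remote = SimpleNamespace(fs=SimpleNamespace(blocksize=1024 * 1024))
remote_size = pick_chunk_size(fake_remote)

# A local pathlib.Path falls back to the default.
local_size = pick_chunk_size(Path("somefile.txt"))
```

Note that the 4194304 bytes reported above is far larger than an MTU-sized 1460 bytes, which suggests that sizing reads and writes to the network packet size is not what the filesystem layer expects.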

You can also check out the rsync and copy functionality in https://github.com/fsspec/filesystem_spec/blob/master/fsspec/generic.py. If you search the filesystem_spec issue tracker for "rsync", you can find a few usage examples.

Let me know if that helps! Andreas 😃

chrisflesher commented 4 months ago

Thank you very much!! Very helpful.