chrisflesher closed this issue 4 months ago
Hi @chrisflesher
Is there any convenient way to copy data from one UPath to another?
Not yet. ~To be able to use shutil.copyfile, we first need #145 to be fixed.~
You can use shutil.copyfileobj:
>>> import upath
>>> f_in = upath.UPath("file:///tmp/somefile.txt")
>>> f_out = upath.UPath("memory:///output.txt")
>>> import shutil
>>> with f_in.open("rb") as f0, f_out.open("wb") as f1:
...     shutil.copyfileobj(f0, f1)
...
>>> f_out.read_text()
'hello world\n'
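The same pattern can be wrapped in a small helper. This is only a minimal sketch: `copy_path` is a hypothetical name, not part of upath, and it assumes both arguments expose a pathlib-style `open()` method (which `UPath` and `pathlib.Path` both do). The demo uses plain local paths so it runs without any remote filesystem.

```python
import shutil
import tempfile
from pathlib import Path


def copy_path(src, dst, chunk_size=None):
    """Copy the bytes of src to dst (hypothetical helper, not a upath API).

    src and dst only need a pathlib-style open(); chunk_size is optional.
    """
    with src.open("rb") as f_in, dst.open("wb") as f_out:
        if chunk_size:
            # Read and write in pieces of at most chunk_size bytes.
            shutil.copyfileobj(f_in, f_out, chunk_size)
        else:
            # Let shutil use its own default buffer size.
            shutil.copyfileobj(f_in, f_out)
    return dst


# Demo with local paths; UPath objects should work the same way.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "somefile.txt"
    target = Path(tmp) / "output.txt"
    source.write_text("hello world\n")
    copy_path(source, target)
    copied_text = target.read_text()
```

Because the helper streams through file objects, it never holds more than one buffer of data in memory, whatever the size of the source file.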
I ended up writing custom code but was unsure what chunk_size to use per write. Does a good value for this depend on the storage type? For cloud storage, maybe 1460 is a good value because of the MTU?
You can get the default blocksize for each filesystem via:
>>> import upath
>>> pth = upath.UPath("s3://bucket/somefile.txt")
>>> pth.fs.blocksize
4194304
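If you prefer to keep a manual copy loop, that blocksize could serve as the chunk size. The sketch below is an assumption-laden illustration, not a upath feature: `pick_chunk_size` and `copy_in_chunks` are hypothetical names, the fallback of 4 MiB simply mirrors the value shown above, and taking the larger of the two blocksizes is an arbitrary choice.

```python
import tempfile
from pathlib import Path

DEFAULT_CHUNK_SIZE = 4 * 1024 * 1024  # 4194304 bytes, as in the output above


def pick_chunk_size(path, default=DEFAULT_CHUNK_SIZE):
    """Return path.fs.blocksize if it is a positive int, else the default."""
    fs = getattr(path, "fs", None)
    blocksize = getattr(fs, "blocksize", None)
    if isinstance(blocksize, int) and blocksize > 0:
        return blocksize
    return default


def copy_in_chunks(src, dst, chunk_size=None):
    """Copy src to dst in chunks and return the number of bytes written."""
    if chunk_size is None:
        # Arbitrary choice: use the larger of the two filesystems' blocksizes.
        chunk_size = max(pick_chunk_size(src), pick_chunk_size(dst))
    total = 0
    with src.open("rb") as f_in, dst.open("wb") as f_out:
        while True:
            chunk = f_in.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            f_out.write(chunk)
            total += len(chunk)
    return total


# Demo with local paths, which have no .fs attribute and use the default.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "in.bin"
    target = Path(tmp) / "out.bin"
    source.write_bytes(b"x" * 10)
    bytes_copied = copy_in_chunks(source, target, chunk_size=3)
```

A value like 1460 is probably too small here: splitting data into packet-sized pieces is the job of the network stack, so the application-level chunk size can be much larger, on the order of the filesystem blocksize.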
You can also check out the rsync and copy functionality in https://github.com/fsspec/filesystem_spec/blob/master/fsspec/generic.py
If you search the filesystem_spec issue tracker for "rsync" you can find a few usage examples.
Let me know if that helps! Andreas 😃
Thank you very much!! Very helpful.
Is there any convenient way to copy data from one UPath to another? I tried using shutil.copy but couldn't figure out how to get it working. I ended up writing custom code but was unsure what chunk_size to use per write. Does a good value for this depend on the storage type? For cloud storage, maybe 1460 is a good value because of the MTU?