laminlabs / lamindb-setup

Setup & configure LaminDB.
Apache License 2.0

Folder upload #694

Closed falexwolf closed 1 week ago

falexwolf commented 5 months ago

I have this folder:

image

If I save it as an artifact, I get a lot of logging and it seems to upload objects one by one:

image

This isn't the case if I pass only the .zarr folder: then I don't get any logging, and it seems to upload the folder as a whole.

falexwolf commented 5 months ago

However, timing-wise, it's similar. So, it might just be something in the logging.

Koncopd commented 5 months ago

Yes, we don't log zarr uploads. I will rework the progress bar for folders soon.

Koncopd commented 5 months ago

About the logging: for some reason it does show the uploads one by one, but they still seem to run concurrently in batches.

Koncopd commented 5 months ago

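Could you please compare the runtime of

UPath("s3://bucket/folder_destination").upload_from("folder_source", recursive=True)

with that of

from pathlib import Path
from lamindb_setup.core.upath import UPath  # assumed import path; this UPath provides upload_from

destination = UPath("s3://bucket/folder_destination")
source = Path("folder_source")
# upload each regular file under source individually, preserving relative paths
files = (file for file in source.rglob("*") if file.is_file())
for file in files:
    (destination / file.relative_to(source)).upload_from(file)

just to be sure?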
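
For the comparison itself, something like the following might help. It is only a rough sketch: `time_call`, `upload_recursive`, `upload_one_by_one` and `compare_runtimes` are hypothetical helper names, and the only thing assumed about the destination objects is that they support `/` and an `upload_from(path, recursive=...)` method, as in the snippets above. Two separate destinations are used so the second run does not overwrite what the first one uploaded.

```python
import time
from pathlib import Path
from typing import Callable


def time_call(fn: Callable[[], None]) -> float:
    """Run fn once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def upload_recursive(destination, source: Path) -> None:
    # Single call that hands the whole folder to upload_from.
    destination.upload_from(source, recursive=True)


def upload_one_by_one(destination, source: Path) -> None:
    # One upload_from call per regular file, preserving relative paths.
    for file in source.rglob("*"):
        if file.is_file():
            (destination / file.relative_to(source)).upload_from(file)


def compare_runtimes(dest_recursive, dest_one_by_one, source: Path) -> dict:
    """Time both strategies once each and return seconds per strategy."""
    return {
        "recursive": time_call(lambda: upload_recursive(dest_recursive, source)),
        "one_by_one": time_call(lambda: upload_one_by_one(dest_one_by_one, source)),
    }
```

With real buckets, the two destinations would be e.g. UPath("s3://bucket/folder_destination_a") and UPath("s3://bucket/folder_destination_b") (placeholder names). A single run is noisy, so repeating it a few times gives a fairer picture.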