eWaterCycle / era5cli

Command Line Interface to download ERA5 from Copernicus Climate Data Service
https://era5cli.readthedocs.io/
Apache License 2.0

Program terminates abruptly after specifying "No" for overwrite prompt #155

Closed (deleoloruntoba closed 11 months ago)

deleoloruntoba commented 1 year ago

I am trying to continue an earlier download in the same directory. When I was asked whether I wanted to overwrite, I answered no in several ways ("NO", "N", "n", and "no"), but each time the program stopped with an error: FileExistsError: One or more files already exist in this folder. Please remove them, change to a different folder, or use the --overwrite flag to always overwrite existing files.

Is there no way to ignore the existing files and continue the download?

BSchilperoort commented 1 year ago

Hi, this is the intended behavior. One of the files you wanted to download already exists, and if you choose not to overwrite existing files, the download request is not made.

Skipping only some downloads can get tricky, as files can be incomplete if you terminate a request. These incomplete files will then not be overwritten.

Due to the way the CDS API works, it's not trivial to validate the existing files: there is no checksum we can compare them to.

BSchilperoort commented 1 year ago

It turns out that I was wrong and we can validate files.

import cdsapi

c = cdsapi.Client()
r = c.retrieve(...)  # make a request and start the downloading process
r.content_length  # size of the result in bytes, e.g. 65337064

r.content_length will return the size of the file in bytes. This can be compared to Path(...).stat().st_size.

This way we can skip existing files if they match the right content_length, and otherwise overwrite them.
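A minimal sketch of that size check, assuming the expected size has already been read from r.content_length. The function name can_skip_download and the file name in the usage note are hypothetical and are not part of era5cli or cdsapi:

```python
from pathlib import Path


def can_skip_download(path: Path, expected_size: int) -> bool:
    """Return True if `path` exists and is exactly `expected_size` bytes.

    `expected_size` is assumed to come from the request result
    (r.content_length in the snippet above). A missing file or a
    size mismatch means the file should be downloaded (again).
    """
    return path.is_file() and path.stat().st_size == expected_size
```

For example, can_skip_download(Path("era5_t2m_2020.nc"), r.content_length) would only be True when the local file has the same number of bytes as the remote result. A matching size is weaker than a checksum: it catches truncated downloads, but not a file of the right length with different contents.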

deleoloruntoba commented 1 year ago

Thanks a lot.