asfadmin / Discovery-asf_search

BSD 3-Clause "New" or "Revised" License

[Feature] - nice progress bar during downloading of SLC data #56

Open cmarshak opened 3 years ago

cmarshak commented 3 years ago

Is your feature request related to a problem? Please describe.
Downloading SLC data takes a long time. It would be nice to track progress.

Describe the solution you'd like
Something like this: https://gist.github.com/wy193777/0e2a4932e81afc6aa4c8f7a2984f34e2. I would be happy to make a pull request if the feature would be welcome. I understand if this is out of scope.

Describe alternatives you've considered
I think there are other progress bars, but I am not familiar with them.

Additional context
There is a lot of output during downloading, e.g.

response: 307
Redirect to https://sentinel1.asf.alaska.edu/SLC/SB/S1B_IW_SLC__1SDV_20210723T014947_20210723T015014_027915_0354B4_B3A9.zip
response: 302
Redirect to https://dy4owt9f80bz7.cloudfront.net/...
response: 200

I'd rather be able to track progress more clearly.

edit: realizing this feature request might be a bit hasty, as there might be use cases in which this is not desired and would be annoying, but hopefully there is a smart way to integrate such a feature.

jhkennedy commented 3 years ago

@glshort we've added a tqdm progress bar to the HyP3 SDK, so you get pretty progress bars in Jupyter Notebooks and text ones everywhere else. E.g., https://github.com/ASFHyP3/hyp3-sdk/blob/develop/hyp3_sdk/util.py#L113-L121

One thing to be aware of is that, if you do want the Jupyter support, you'll need to be careful with the import, because you can be running in a Jupyter kernel without having all the expected Jupyter dependencies. So we do this: https://github.com/ASFHyP3/hyp3-sdk/blob/develop/hyp3_sdk/util.py#L54-L61
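For readers who can't follow the link, a guarded import of that kind might look roughly like the sketch below. This is an illustration under assumptions, not a copy of the linked HyP3 SDK lines: the helper name `get_tqdm_progress_bar` is hypothetical, and it assumes that `ipywidgets` is the Jupyter dependency most likely to be missing.

```python
def get_tqdm_progress_bar():
    """Return a tqdm class suited to the current environment.

    Hypothetical helper for illustration; the name is not taken from asf_search.
    """
    try:
        # tqdm.auto selects the notebook widget bar inside a Jupyter kernel,
        # and that bar needs ipywidgets. Importing ipywidgets first makes a
        # missing dependency surface here as ImportError rather than later,
        # when the first bar is created.
        import ipywidgets  # noqa: F401
        from tqdm.auto import tqdm
    except ImportError:
        # Plain text bar: works in terminals and in bare kernels.
        from tqdm.std import tqdm
    return tqdm
```

The returned class can then be used like the usual one, e.g. `tqdm = get_tqdm_progress_bar()` followed by `for item in tqdm(items): ...`.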

carmine1990 commented 2 years ago

@jhkennedy, using your link I customized the download.py script in the download folder so that, if the processes parameter is 1 (the customization doesn't support the pool() method), you can now see the progress of the download.

The function to modify is download_url in asf_search/download/download.py:

from tqdm import tqdm  # the only new import; the others are already in download.py


def download_url(url: str, path: str, filename: str = None, session: ASFSession = None) -> None:
    """
    Downloads a product from the specified URL to the specified location and (optional) filename.

    :param url: URL from which to download
    :param path: Local path in which to save the product
    :param filename: Optional filename to be used, extracted from the URL by default
    :param session: The session to use, in most cases should be authenticated beforehand
    :return:
    """
    if filename is None:
        filename = os.path.split(urllib.parse.urlparse(url).path)[1]

    if not os.path.isdir(path):
        raise ASFDownloadError(f'Error downloading {url}: directory not found: {path}')

    if os.path.isfile(os.path.join(path, filename)):
        warnings.warn(f'File already exists, skipping download: {os.path.join(path, filename)}')
        return

    if session is None:
        session = ASFSession()

    def strip_auth_if_aws(r, *args, **kwargs):
        # On a redirect to AWS, keep only the location header
        if 300 <= r.status_code <= 399 and 'amazonaws.com' in urllib.parse.urlparse(r.headers['location']).netloc:
            location = r.headers['location']
            r.headers.clear()
            r.headers['location'] = location

    response = session.get(url, stream=True, hooks={'response': strip_auth_if_aws})
    response.raise_for_status()

    # Wrap the file's write() so that each chunk written advances the bar
    with tqdm.wrapattr(open(os.path.join(path, filename), 'wb'), 'write',
                       miniters=1, desc=filename,
                       total=int(response.headers.get('content-length', 0))) as f:
        for chunk in response.iter_content(chunk_size=31457280):
            f.write(chunk)

scottstanie commented 9 months ago

Is there still interest in an implementation of this? If nothing else, a progress bar on the top-level loop, showing the total number of files to download, would be only a one-line change, and it's nice that you can now disable the progress bars with TQDM_DISABLE=1. Not sure if you'd want to add the dependency, though.
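A rough sketch of what that top-level bar could look like, under assumptions: `download_all` and its `download_one` callback are hypothetical names for illustration, not asf_search functions, and tqdm is imported lazily so that it would remain an optional dependency.

```python
def download_all(urls, download_one, path="."):
    """Call download_one(url, path) for each URL behind one overall bar.

    Hypothetical wrapper for illustration; download_one stands in for
    whatever per-file download function the library provides.
    """
    # Lazy import: tqdm is only required when this function is called.
    from tqdm import tqdm

    completed = []
    # The "one-line change": wrap the iterable so the bar counts files.
    for url in tqdm(urls, desc="Downloading", unit="file"):
        download_one(url, path)
        completed.append(url)
    return completed
```

With a tqdm release that supports the TQDM_DISABLE environment variable, setting `TQDM_DISABLE=1` in the shell before launching Python should silence the bar without any code change.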