Open cmarshak opened 3 years ago
@glshort we've added a tqdm progress bar to the HyP3 SDK so you get pretty progress bars in Jupyter Notebooks and text ones everywhere else. E.g.,
https://github.com/ASFHyP3/hyp3-sdk/blob/develop/hyp3_sdk/util.py#L113-L121
One thing to be aware of is that, if you do want to use the Jupyter support, you'll need to be careful with the import b/c you can be running in a Jupyter kernel but not have all the expected Jupyter dependencies, so we do this: https://github.com/ASFHyP3/hyp3-sdk/blob/develop/hyp3_sdk/util.py#L54-L61
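For anyone who doesn't want to click through, here is a minimal sketch of that kind of guarded import. It is written in the spirit of the linked lines rather than copied from them, and the function name `get_tqdm_progress_bar` is only illustrative. The idea is to use the notebook-aware `tqdm.auto` only when `ipywidgets` is importable, and to fall back to the plain text bar otherwise.

```python
def get_tqdm_progress_bar():
    """Return a tqdm class suited to the current environment.

    Illustrative helper (the name is an assumption, not a confirmed API).
    """
    try:
        # A Jupyter kernel can be running without the widget stack installed,
        # so check for ipywidgets before opting into the notebook-aware bar.
        import ipywidgets  # noqa: F401
        from tqdm.auto import tqdm
    except ImportError:
        # Plain text progress bar; works in any terminal or log.
        from tqdm.std import tqdm
    return tqdm
```

Usage would then look like `tqdm = get_tqdm_progress_bar()` followed by `for item in tqdm(items): ...`.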
@jhkennedy using your link, I customized the download.py script in the download folder so that you can now see the progress of the download, provided the processes param is 1 (the customization doesn't support the pool() method).
The function to modify is in asf_search/download/download.py:

    # Assumes tqdm is imported at the top of the module,
    # e.g. `from tqdm.auto import tqdm`.
    def download_url(url: str, path: str, filename: str = None, session: ASFSession = None) -> None:
        """
        Downloads a product from the specified URL to the specified location and (optional) filename.

        :param url: URL from which to download
        :param path: Local path in which to save the product
        :param filename: Optional filename to be used, extracted from the URL by default
        :param session: The session to use, in most cases should be authenticated beforehand
        :return:
        """
        if filename is None:
            filename = os.path.split(urllib.parse.urlparse(url).path)[1]

        if not os.path.isdir(path):
            raise ASFDownloadError(f'Error downloading {url}: directory not found: {path}')

        if os.path.isfile(os.path.join(path, filename)):
            warnings.warn(f'File already exists, skipping download: {os.path.join(path, filename)}')
            return

        if session is None:
            session = ASFSession()

        def strip_auth_if_aws(r, *args, **kwargs):
            # On a redirect to AWS, keep only the Location header
            if 300 <= r.status_code <= 399 and 'amazonaws.com' in urllib.parse.urlparse(r.headers['location']).netloc:
                location = r.headers['location']
                r.headers.clear()
                r.headers['location'] = location

        response = session.get(url, stream=True, hooks={'response': strip_auth_if_aws})
        response.raise_for_status()

        # Wrap the file's write() so every chunk written advances the progress bar
        with tqdm.wrapattr(open(os.path.join(path, filename), 'wb'), 'write',
                           miniters=1, desc=filename,
                           total=int(response.headers.get('content-length', 0))) as f:
            for chunk in response.iter_content(chunk_size=31457280):  # 30 MiB chunks
                f.write(chunk)
Is there still interest in an implementation of this? If nothing else, a top-level progress bar that shows the total number of files to download would be only a one-line change, and it's nice that you can now disable the progress bars with TQDM_DISABLE=1. Not sure if you'd want to add the dependency, though.
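To make the "one-line change" concrete, here is a rough sketch of a top-level loop wrapped in tqdm. It also treats tqdm as an optional dependency, falling back to a plain loop when it isn't installed, which might ease the dependency concern. The names `download_all` and `download_one` are hypothetical and not part of asf_search; `download_one` stands in for whatever per-file download function is actually used.

```python
from typing import Callable, Iterable, List

try:
    from tqdm.auto import tqdm  # optional dependency
except ImportError:
    def tqdm(iterable, **kwargs):
        """Fallback: iterate without a progress bar when tqdm is absent."""
        return iterable


def download_all(urls: Iterable[str], download_one: Callable[[str], None]) -> List[str]:
    """Download each URL in turn, showing one bar for the whole batch.

    Hypothetical helper: `download_one` is any callable taking a URL.
    """
    urls = list(urls)  # materialize so the bar knows the total file count
    finished = []
    # Wrapping the iterable in tqdm(...) is the only change to the loop
    for url in tqdm(urls, desc="Downloading", unit="file"):
        download_one(url)
        finished.append(url)
    return finished
```

As far as I know, the `TQDM_DISABLE=1` environment override is only honored by recent tqdm releases, so older installs would still show the bar.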
Is your feature request related to a problem? Please describe. Downloading SLC data takes a long time. Would be nice to track.
Describe the solution you'd like Something like this - https://gist.github.com/wy193777/0e2a4932e81afc6aa4c8f7a2984f34e2 - would be happy to make a pull request if the feature would be welcome. Understand if this is out of scope.
Describe alternatives you've considered I think there are other progress bars, but am not familiar with them.
Additional context There is a lot of output during downloading e.g.
I'd rather be able to track progress more clearly.
edit: realizing this feature request might be a bit hasty, as there might be use cases in which this is undesired and annoying, but hopefully there is a smart way to integrate such a feature.