aiddata / geo-datasets

Scripts for preparing datasets in GeoQuery
http://geoquery.org
MIT License

Viirs ntl develop #164

Closed cmhwang closed 9 months ago

cmhwang commented 1 year ago

Attempted the initial download portion of the script for the annual data.

Using https://www.scrapingbee.com/blog/python-wget/ as a guide to mimic wget with authorization.

cmhwang commented 1 year ago

What I have now isn't performing the download, but I've set up the basic structure. I'm looking for something that will let me mimic wget with the authorization token.

jacobwhall commented 1 year ago

Nice work @cmhwang! Here is an example you might find useful for adding an authorization header to a request using the requests package.

import requests

token = "XXXXX"
src_url = "http://example.com"
dst_path = "/path/to/dst"

# dictionary of HTTP headers
headers = {
    "Authorization": f"Bearer {token}",
}

with requests.get(src_url, headers=headers, stream=True) as src:
    # raise an exception (fail this task) if the HTTP response indicates that an error occurred
    src.raise_for_status()
    with open(dst_path, "wb") as dst:
        # src.content reads the entire response body into memory before writing
        dst.write(src.content)

If the file is large enough to use excessive memory while downloading, you can use requests.Response.iter_content() to handle the file in chunks (see this Stack Overflow answer), but that may not be necessary.
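For reference, the chunked approach mentioned above might look roughly like the sketch below. The function name `download_file`, the `chunk_size` default, and the optional `session` argument are assumptions made for illustration and are not taken from the script in this pull request; only `requests.get`, `raise_for_status()`, and `iter_content()` are part of the requests package.

```python
import requests


def download_file(src_url, dst_path, token, chunk_size=1024 * 1024, session=None):
    """Stream src_url to dst_path in chunks, sending a Bearer token.

    Hypothetical helper: the name, signature, and defaults are illustrative.
    """
    # use the supplied session if there is one, otherwise the top-level requests API
    http = session if session is not None else requests
    headers = {"Authorization": f"Bearer {token}"}
    bytes_written = 0
    # stream=True defers downloading the body until it is iterated over
    with http.get(src_url, headers=headers, stream=True) as src:
        src.raise_for_status()
        with open(dst_path, "wb") as dst:
            # only one chunk is held in memory at a time
            for chunk in src.iter_content(chunk_size=chunk_size):
                dst.write(chunk)
                bytes_written += len(chunk)
    return bytes_written
```

Calling `download_file(src_url, dst_path, token)` with the variables from the example above would write the same file without holding the whole response in memory.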

cmhwang commented 1 year ago

I still want to move the file download options into the config, and restructure things so that each file type is its own download entry rather than all being compiled into one list.

cmhwang commented 1 year ago

Major Updates

- Finished annual and monthly file downloads

To-Do

cmhwang commented 1 year ago

Update

Monthly download is now automated using Beautiful Soup
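As a rough illustration of the kind of link discovery Beautiful Soup makes possible, the sketch below collects download URLs from an HTML directory listing. It is not the code from this pull request: the function name `find_file_links`, the idea of filtering on a filename suffix, and the example suffix are all assumptions.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def find_file_links(html, base_url, suffix):
    """Return absolute URLs for every link in html whose href ends with suffix.

    Hypothetical helper: the name and the suffix-based filter are illustrative.
    """
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips <a> tags that have no href attribute
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        if href.endswith(suffix):
            # resolve relative hrefs against the listing page's URL
            links.append(urljoin(base_url, href))
    return links
```

Each URL returned this way could then be passed to an authenticated download like the requests example earlier in this thread.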

Remaining Bugs

- Summary wrapping is taking an extended period of time

jacobwhall commented 9 months ago

Thanks for your work on this! We needed to update the VIIRS NTL data again this week, and your download script succeeded on the first try!