Hey, just wanted to share my experience downloading and using this repo.
Feel free to take this with a grain of salt if some of these things are already planned to change, but this is what I went through installing the repo and then trying to get some data.
Chose to use the development environment, just in case.
Minor nit: add conda activate rs_tools just before poetry install. This wasn’t a big deal, but it may confuse a user who isn’t familiar with Anaconda.
When running poetry install, I received this warning:
Warning: The file chosen for install of jupyter-client 8.5.0 (jupyter_client-8.5.0-py3-none-any.whl) is yanked. Reason for being yanked: Bug in kernel env update
I don’t think this was an issue, but wanted to share it in case you haven’t seen it.
From here I was a bit unsure what to do next. I know we discussed the download scripts, so I started there.
Going in alphabetical order, I gave the GOES downloader a try.
Tried python scripts/pipeline/goes/preprocess_goes.py and hit a similar issue with imports. These had to be installed manually:
rioxarray
dask
netCDF4
Note that these import errors surfaced at different times: even after I installed dask, the script ran for about a minute before the netCDF4 error appeared.
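Since the import errors surfaced one at a time, a small preflight check could report every missing package up front. The following is a minimal sketch, not something that exists in the repo: the function name `missing_modules` is made up for illustration, and the list only contains the three import names mentioned above (the pip package name can differ from the import name).

```python
import importlib.util

# Import names reported missing in this post; extend as needed.
REQUIRED = ["rioxarray", "dask", "netCDF4"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found in the current environment."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Running something like this before the preprocessing script would list all missing packages in one go instead of failing a minute into the run.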
The .nc files saved, at about 1.5 GB each. Their naming convention appears to be based on the date, but I’m not really sure, since the start and end dates are October first to October second (assuming the date format is YYYY-MM-DD)?
Now all 4 GOES .nc files are saved in the root directory; ideally they would be saved in some folder, ready for me to analyze and process.
I’m not familiar with these file types, so I don’t know what I should do with them next, but you can ignore this comment if you think other users would be.
MODIS Download
python scripts/pipeline/modis/download_modis.py
This runs, but no data is downloaded. It looks like I need to log in to earthaccess, but I didn’t know that.
I’m not sure if I have an account; is there a way to check this beforehand? A suggestion I have is to put the credentials in the .env, and note somewhere in the README that this is needed.
2024-03-24 09:27:30.596 | INFO | __main__:download:112 - Initializing MODIS parameters...
2024-03-24 09:27:30.596 | INFO | __main__:download:121 - Downloading MODIS...
Downloading Terra - Date: 2020-10-01 00:00:00: 0%| | 0/5 [00:00<?, ?it/s]Granules found: 4
'NoneType' object has no attribute 'get'
You must call earthaccess.login() before you can download data
Downloading Terra Cloud Mask - Date: 2020-10-01 00:00:00: 0%| | 0/5 [00:00<?, ?it/s]Granules found: 4
'NoneType' object has no attribute 'get'
You must call earthaccess.login() before you can download data
Downloading Terra - Date: 2020-10-02 00:00:00: 20%|█████████████████████████▌ | 1/5 [00:01<00:07, 1.89s/it]Granules found: 7
'NoneType' object has no attribute 'get'
You must call earthaccess.login() before you can download data
Downloading Terra Cloud Mask - Date: 2020-10-02 00:00:00: 20%|███████████████████████▍ | 1/5 [00:02<00:07, 1.89s/it]Granules found: 7
'NoneType' object has no attribute 'get'
You must call earthaccess.login() before you can download data
Downloading Terra - Date: 2020-10-03 00:00:00: 40%|███████████████████████████████████████████████████▏ | 2/5 [00:03<00:05, 1.99s/it]Granules found: 5
'NoneType' object has no attribute 'get'
You must call earthaccess.login() before you can download data
Downloading Terra Cloud Mask - Date: 2020-10-03 00:00:00: 40%|██████████████████████████████████████████████▊ | 2/5 [00:05<00:05, 1.99s/it]Granules found: 5
'NoneType' object has no attribute 'get'
You must call earthaccess.login() before you can download data
Downloading Terra - Date: 2020-10-04 00:00:00: 60%|████████████████████████████████████████████████████████████████████████████▊ | 3/5 [00:06<00:04, 2.08s/it]Granules found: 7
'NoneType' object has no attribute 'get'
You must call earthaccess.login() before you can download data
Downloading Terra Cloud Mask - Date: 2020-10-04 00:00:00: 60%|██████████████████████████████████████████████████████████████████████▏ | 3/5 [00:07<00:04, 2.08s/it]Granules found: 7
'NoneType' object has no attribute 'get'
You must call earthaccess.login() before you can download data
Downloading Terra - Date: 2020-10-05 00:00:00: 80%|██████████████████████████████████████████████████████████████████████████████████████████████████████▍ | 4/5 [00:08<00:02, 2.10s/it]Granules found: 5
'NoneType' object has no attribute 'get'
You must call earthaccess.login() before you can download data
Downloading Terra Cloud Mask - Date: 2020-10-05 00:00:00: 80%|█████████████████████████████████████████████████████████████████████████████████████████████▌ | 4/5 [00:10<00:02, 2.10s/it]Granules found: 0
Downloading Terra Cloud Mask - Date: 2020-10-05 00:00:00: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:10<00:00, 2.20s/it]
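The log above fails on every granule with the same login message, so a credentials check before the download loop would surface the problem once, up front. Below is a minimal sketch of the .env suggestion, under a few assumptions: the helper names are made up for illustration, the .env file is assumed to hold plain KEY=VALUE lines, and the login uses earthaccess's "environment" strategy, which reads EARTHDATA_USERNAME and EARTHDATA_PASSWORD. An account itself is a NASA Earthdata Login account.

```python
import os

# Variable names read by earthaccess's "environment" login strategy.
REQUIRED_VARS = ("EARTHDATA_USERNAME", "EARTHDATA_PASSWORD")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines (as found in a .env file) into a dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_credentials(environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


def login_from_env(env_path=".env"):
    """Load credentials from a .env file if present, then log in to earthaccess."""
    if os.path.exists(env_path):
        with open(env_path, encoding="utf-8") as handle:
            for key, value in parse_env_text(handle.read()).items():
                # Values already set in the shell take precedence over the file.
                os.environ.setdefault(key, value)
    missing = missing_credentials(os.environ)
    if missing:
        raise RuntimeError("Missing Earthdata credentials: " + ", ".join(missing))
    import earthaccess  # imported lazily so the checks above work without it
    return earthaccess.login(strategy="environment")
```

Calling `login_from_env()` once at the start of the download script would either log in or stop immediately with a message naming the missing variables.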
From here I chose to stop for now, just to make sure I wasn't already veering too far off the rails. Should I be using the notebooks as well?
Initial Setup
Cloned https://github.com/spaceml-org/rs_tools.git
Chose to use the development environment, just in case.
Minor nit: add conda activate rs_tools just before poetry install. This wasn’t a big deal, but it may confuse a user who isn’t familiar with Anaconda.
When running poetry install, I received this warning:
Warning: The file chosen for install of jupyter-client 8.5.0 (jupyter_client-8.5.0-py3-none-any.whl) is yanked. Reason for being yanked: Bug in kernel env update
I don’t think this was an issue, but wanted to share it in case you haven’t seen it.
From here I was a bit unsure what to do next. I know we discussed the download scripts, so I started there.
Going in alphabetical order, I gave the GOES downloader a try.
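On the conda activate nit above: besides documenting the step, a script could check the active environment itself. This is a minimal sketch, assuming the environment is named rs_tools as in the post; the function name is made up for illustration. It relies on CONDA_DEFAULT_ENV, which conda sets when an environment is activated.

```python
import os

EXPECTED_ENV = "rs_tools"  # environment name taken from the post


def conda_env_problem(expected, environ):
    """Return a human-readable problem description, or None if the env looks right."""
    # conda sets CONDA_DEFAULT_ENV to the active environment's name on activation.
    active = environ.get("CONDA_DEFAULT_ENV")
    if active is None:
        return f"No conda environment is active; run 'conda activate {expected}' first."
    if active != expected:
        return f"Active conda environment is '{active}', expected '{expected}'."
    return None


if __name__ == "__main__":
    problem = conda_env_problem(EXPECTED_ENV, os.environ)
    print(problem or f"OK: '{EXPECTED_ENV}' is active.")
```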
GOES Download
From the root dir, I tried python scripts/pipeline/goes/download_goes.py (https://github.com/spaceml-org/rs_tools/blob/main/scripts/pipeline/goes/download_goes.py). Despite using poetry, and then trying pip, there were a few packages I needed to install manually.
Once those were in, I tried the script again and got an error about DownloadParameters being undefined (https://github.com/spaceml-org/rs_tools/blob/main/scripts/pipeline/goes/download_goes.py#L96).
To fix this, I tried copying DownloadParameters directly from download_modis.py into download_goes.py (https://github.com/spaceml-org/rs_tools/blob/main/scripts/pipeline/modis/download_modis.py#L25), but then hit an issue where region was not specified in download (https://github.com/spaceml-org/rs_tools/blob/main/scripts/pipeline/goes/download_goes.py#L82-L99).
I took the default region, added it to the model args, and after all this it worked and the goes16 data saved!
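To illustrate the kind of fix described above, here is a rough sketch of a parameters class with an explicit default for region, so that leaving it out does not break the download call. Everything in it is hypothetical: the field names, the placeholder bounding box, and the `describe_download` helper are made up for illustration and are not the repo's actual DownloadParameters.

```python
from dataclasses import dataclass


@dataclass
class DownloadParameters:
    """Hypothetical stand-in; the repo's real class may have different fields."""

    start_date: str
    end_date: str
    # Placeholder bounding box, NOT the repo's actual default region.
    region: str = "-130 -15 -90 5"


def describe_download(params):
    """Return the settings a downloader would use (no network access here)."""
    return f"{params.start_date} to {params.end_date}, region={params.region}"


if __name__ == "__main__":
    # region is omitted on purpose to show the default being picked up.
    print(describe_download(DownloadParameters("2020-10-01", "2020-10-02")))
```

Keeping a class like this in one shared module, rather than copying it between download_modis.py and download_goes.py, would also avoid the undefined-name error.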
While it was saving, which didn’t take long, I tried to get a better understanding of the function and its parameters.
It looks like the params I imported weren’t used anyway, and all the downloading was done through GOES16Download, which makes sense. https://github.com/spaceml-org/rs_tools/blob/main/scripts/pipeline/goes/download_goes.py#L105
GOES Preprocess
Tried python scripts/pipeline/goes/preprocess_goes.py and hit a similar issue with imports (rioxarray, dask, netCDF4).
Note that these import errors surfaced at different times: even after I installed dask, the script ran for about a minute before the netCDF4 error appeared.
The .nc files saved, at about 1.5 GB each. Their naming convention appears to be based on the date, but I’m not really sure, since the start and end dates are October first to October second (assuming the date format is YYYY-MM-DD)?
Now all 4 GOES .nc files are saved in the root directory; ideally they would be saved in some folder, ready for me to analyze and process.
I’m not familiar with these file types, so I don’t know what I should do with them next, but you can ignore this comment if you think other users would be.
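As a stopgap for the files landing in the root directory, something like the following could gather them into a folder afterwards. It is a minimal sketch using only the standard library; the function name and the data/goes folder in the usage note are made up for illustration, and the cleaner fix would be an output-directory option in the script itself.

```python
import shutil
from pathlib import Path


def collect_nc_files(source_dir, target_dir):
    """Move every .nc file directly under source_dir into target_dir; return new paths."""
    source = Path(source_dir)
    target = Path(target_dir)
    # Create the destination folder (and any parents) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    # glob("*.nc") is non-recursive, so files already in subfolders are left alone.
    for path in sorted(source.glob("*.nc")):
        destination = target / path.name
        shutil.move(str(path), str(destination))
        moved.append(destination)
    return moved
```

For example, `collect_nc_files(".", "data/goes")` run from the repo root would move the four files into data/goes. For a first look at the contents, NetCDF files are commonly opened with `xarray.open_dataset(path)`, which needs a backend such as netCDF4 installed.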
MODIS Download
python scripts/pipeline/modis/download_modis.py
This runs, but no data is downloaded. It looks like I need to log in to earthaccess, but I didn’t know that.
I’m not sure if I have an account; is there a way to check this beforehand? A suggestion I have is to put the credentials in the .env, and note somewhere in the README that this is needed.
From here I chose to stop for now, just to make sure I wasn’t already veering too far off the rails. Should I be using the notebooks as well?
Thanks again :-)