sigmafelix closed this 1 month ago
@sigmafelix I believe it is due to the filelist already containing the base URL before it is used in the subsequent sprintf() command.
filelist <-
  rvest::read_html(filedir_url) |>
  rvest::html_elements("tr") |>
  rvest::html_attr("data-path")
filelist_sub <-
  grep(
    paste0("(", paste(tiles_requested, collapse = "|"), ")"),
    filelist,
    value = TRUE
  )
download_url <- sprintf("%s%s", ladsurl, filelist_sub)
Running a debug function that returns ladsurl, filelist_sub, and download_url makes it clear that filelist_sub already contains the base URL. I will debug by removing the double paste and re-running the tests.
> download_modis_debug(
+ product = "MOD09GA",
+ version = "61",
+ horizontal_tiles = c(7, 8),
+ vertical_tiles = c(3, 4),
+ date = "2018-01-01",
+ nasa_earth_data_token = readLines("~/nasa_token.txt"),
+ directory_to_save = path,
+ acknowledgement = TRUE,
+ download = FALSE,
+ remove_command = FALSE
+ )
1 / 1 days of data available in the queried dates.
ladsurl:
[[1]]
[1] "https://ladsweb.modaps.eosdis.nasa.gov/"
filelist_sub:
[[2]]
[1] "https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/61/MOD09GA/2018/001/MOD09GA.A2018001.h07v03.061.2021295010220.hdf"
[2] "https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/61/MOD09GA/2018/001/MOD09GA.A2018001.h08v03.061.2021295010420.hdf"
[3] "https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/61/MOD09GA/2018/001/MOD09GA.A2018001.h08v04.061.2021295010503.hdf"
download_url:
[[3]]
[1] "https://ladsweb.modaps.eosdis.nasa.gov/https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/61/MOD09GA/2018/001/MOD09GA.A2018001.h07v03.061.2021295010220.hdf"
[2] "https://ladsweb.modaps.eosdis.nasa.gov/https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/61/MOD09GA/2018/001/MOD09GA.A2018001.h08v03.061.2021295010420.hdf"
[3] "https://ladsweb.modaps.eosdis.nasa.gov/https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/61/MOD09GA/2018/001/MOD09GA.A2018001.h08v04.061.2021295010503.hdf"
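One way to avoid the doubled prefix shown in download_url above is to prepend the base URL only to entries that are not already absolute. This is a minimal sketch, not the package's actual fix: build_download_url is a hypothetical helper, and the relative path in the example is an assumed input added only to show both cases.

```r
# Hypothetical helper (not part of the package): prepend the base URL
# only to entries that are not already absolute URLs.
build_download_url <- function(base_url, paths) {
  is_absolute <- grepl("^https?://", paths)
  if (any(!is_absolute)) {
    # Normalise slashes so the join has exactly one "/" between the parts
    base <- sub("/+$", "", base_url)
    relative <- sub("^/+", "", paths[!is_absolute])
    paths[!is_absolute] <- paste0(base, "/", relative)
  }
  paths
}

ladsurl <- "https://ladsweb.modaps.eosdis.nasa.gov/"
filelist_sub <- c(
  # Absolute entry, as in the debug output above
  "https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/61/MOD09GA/2018/001/MOD09GA.A2018001.h07v03.061.2021295010220.hdf",
  # Relative entry (assumed here for illustration only)
  "/archive/allData/61/MOD09GA/2018/001/MOD09GA.A2018001.h08v03.061.2021295010420.hdf"
)
download_url <- build_download_url(ladsurl, filelist_sub)
```

With this guard, the result is the same whether the scraped data-path values are absolute or relative, so the base URL can never appear twice.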
I also propose using the directory/Year/Julian/*.hdf file path as the saving directory. This will match the structure of the /ddn/gs1/group/set/Projects/NRT-AP-Model/input/modis/.../ folder, making it easier to add new data to the pipeline.
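The proposed layout could be derived from the requested date roughly as follows. This is only a sketch: modis_save_dir is a hypothetical helper, the base directory is a placeholder, and "Julian" is assumed to mean the three-digit day of year, as in the 2018/001 segment of the LADS URLs.

```r
# Hypothetical helper: build <directory>/<Year>/<day of year> for a date.
modis_save_dir <- function(directory, date) {
  d <- as.Date(date)
  # "%Y" is the four-digit year, "%j" the zero-padded day of year (001-366)
  file.path(directory, format(d, "%Y"), format(d, "%j"))
}

# Placeholder base directory, for illustration only
save_dir <- modis_save_dir("modis/MOD09GA", "2018-01-01")

# The directory would need to exist before downloading, e.g.:
# dir.create(save_dir, recursive = TRUE, showWarnings = FALSE)
```

For 2018-01-01 this yields modis/MOD09GA/2018/001, which mirrors the year and day-of-year segments already present in the download URLs.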
Debug version runs as expected for MOD09GA - still need to check the MOD06_L2 versioning.
> download_modis_debug(
+ product = "MOD09GA",
+ version = "61",
+ horizontal_tiles = c(7, 8),
+ vertical_tiles = c(3, 4),
+ date = "2018-01-01",
+ nasa_earth_data_token = readLines("~/nasa_token.txt"),
+ directory_to_save = path,
+ acknowledgement = TRUE,
+ download = TRUE,
+ remove_command = TRUE
+ )
1 / 1 days of data available in the queried dates.
Downloading requested files...
--2024-10-07 12:59:14-- https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/61/MOD09GA/2018/001/MOD09GA.A2018001.h07v03.061.2021295010220.hdf
Resolving ladsweb.modaps.eosdis.nasa.gov (ladsweb.modaps.eosdis.nasa.gov)... 198.118.194.40, 2001:4d0:241a:40c0::40
Connecting to ladsweb.modaps.eosdis.nasa.gov (ladsweb.modaps.eosdis.nasa.gov)|198.118.194.40|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 48202716 (46M) [application/octet-stream]
Saving to: ‘/ddn/gs1/home/manwareme/data/modis/MOD09GA/MOD09GA.A2018001.h07v03.061.2021295010220.hdf’
09GA/MOD09GA.A2018001.h07v03 88%[======================================> ] 40.55M 3.21MB/s eta 2s
@mitchellmanware Thank you for the fix!
When I tried download_modis with the following code just now, no files were downloaded and the same messages were repeated:
Code
Error message
Apparently the base URL is included twice in the sink command. I will investigate this issue.