chrisvwn / Rnightlights

R package to extract data from satellite nightlights.
GNU General Public License v3.0

'exdir' does not exist #47

Closed pk1196 closed 4 years ago

pk1196 commented 4 years ago

hi, i get the following error when i try to run this for IDN:

ctry <- "IDN" #replace to run for any other country

#download and process monthly VIIRS stats at the highest admin level
highestAdmLevelStats <- getCtryNlData(ctryCode = ctry,
                                      admLevel = "highest",
                                      nlType = "VIIRS.M",
                                      nlPeriods = nlRange("201401", "201412","VIIRS.M"),
                                      nlStats = list("sum",na.rm=TRUE),
                                      ignoreMissing=FALSE)
2019-12-03 13:27:14: Downloading ctry poly: IDN
2019-12-03 13:27:14: Downloading ctry shpZip: IDN
2019-12-03 13:27:14: Downloading https://biogeo.ucdavis.edu/data/gadm3.6/shp/gadm36_IDN_shp.zip
trying URL 'https://biogeo.ucdavis.edu/data/gadm3.6/shp/gadm36_IDN_shp.zip'
Content type 'application/zip' length 161235162 bytes (153.8 MB)
downloaded 153.8 MB

Error in utils::unzip(zipfile = polyFnameZip, junkpaths = TRUE, exdir = polyFnamePath) : 'exdir' does not exist

chrisvwn commented 4 years ago

Hi,

Could you check that the polygon folder exists. You can find the path with:

library(Rnightlights)
getNlDir('dirPolygon')

If it does not exist please run: Rnightlights:::createNlDataDirs()
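
For anyone following along, the check above can be sketched in base R. `checkDir` is a hypothetical helper, not part of Rnightlights; it reports whether a folder exists and whether R believes it can write to it. Note that `file.access()` may not be reliable on Windows network shares, so treat the result as a hint rather than proof.

```r
# Hypothetical helper (not part of Rnightlights): report whether a directory
# exists and whether R thinks it is writable.
checkDir <- function(path) {
  dirExists <- dir.exists(path)
  # file.access() returns 0 on success; mode = 2 tests write permission
  writable <- dirExists && unname(file.access(path, mode = 2) == 0)
  c(exists = dirExists, writable = writable)
}

# With the package loaded, the folder to check would be getNlDir("dirPolygon").
# tempdir() is used here only so the sketch runs without the package.
checkDir(tempdir())
```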

pk1196 commented 4 years ago

hi Chris, thanks so much for your speedy response and for the heads up on the issue with the map. I checked on the polygon folder, and it does exist.

chrisvwn commented 4 years ago

Hmm. Okay. What version of Rnightlights are you running?

pk1196 commented 4 years ago

0.2.4. this is the command I used to install it:

packageurl <- "https://cran.r-project.org/src/contrib/Archive/Rnightlights/Rnightlights_0.2.4.tar.gz"
install.packages(packageurl, repos=NULL, type="source")

chrisvwn commented 4 years ago

Could you try deleting any folder or file with IDN in the name from the polygon folder we checked earlier, then try again?

pk1196 commented 4 years ago

I tried this earlier and it unfortunately did not work.

chrisvwn commented 4 years ago

Okay. What is your polygon path as printed by the previous command?

chrisvwn commented 4 years ago

No problem. I just needed to see the structure.

Seems like this might be a bit of a problem with unzip referencing network drives. Is xxx an IP address or hostname? I want to simulate a similar environment here.

pk1196 commented 4 years ago

No I think that's the name of the network drive

chrisvwn commented 4 years ago

Okay. What is your current working directory as given by getwd()?

pk1196 commented 4 years ago

C:/WINDOWS/system32

chrisvwn commented 4 years ago

Okay. Thanks. This is the first time I am encountering this issue, so please bear with me.

Did you set the dataPath when you installed the package? Or did you run setupDataPath()?

pk1196 commented 4 years ago

no worries, I am rather new to R so I'm very grateful for your help! no, I don't think I did either of those things. definitely not the latter.

chrisvwn commented 4 years ago

Okay. So then the package was probably installed by someone else, maybe the IT dept? I am asking to get a feel for how the package is accessing the data folder.

Also, could you try to run listNlTiles() and listCtryNlData()

chrisvwn commented 4 years ago

One more thing you can try is to map the network drive to a local drive or folder and then set that as the data path by running setupDataPath().
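
As a rough aid, the sketch below tells apart a path that points directly at a network share (a UNC path such as //server/share) from one that uses a mapped drive letter. `isUncPath` is a hypothetical helper, not part of Rnightlights, and the example paths are made up.

```r
# Hypothetical helper: TRUE when a path starts with two slashes or two
# backslashes, i.e. it looks like a UNC network path rather than a drive letter.
isUncPath <- function(path) {
  startsWith(path, "\\\\") | startsWith(path, "//")
}

isUncPath(c("//server/share/Rnightlights", "C:/Users/me/Rnightlights"))
# [1]  TRUE FALSE
```

If the data path turns out to be a UNC path, mapping it to a drive letter and re-running setupDataPath() is the workaround suggested above.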

pk1196 commented 4 years ago

I installed the package but R and Rstudio were installed by the IT dept - not sure if the path could have been set then. I don't recall having to set any path while installing the package. I tried running listNlTiles() and listCtryNlData() and both return NULL. sorry if this is a silly question but how do I run setupDataPath() ? I am prompted to select 1 or 2 and I select 2 to choose a different directory but nothing happens after I type 2 and hit enter.

pk1196 commented 4 years ago

Ah my bad, the dialog box was frozen behind the Rstudio window. changed the location and that seems to have done the trick! thank you so much I really appreciate your time and help.

chrisvwn commented 4 years ago

Great! I am glad that worked. Yes, sometimes the dialog box does that and this file/path selection bit needs some work. At some point I hope to look at whether network drives can be used directly.

pk1196 commented 4 years ago

Hi Chris, sorry to bother you again but I was in the process of downloading the tiles and got this error:

2019-12-05 17:18:55: PROCESSING nlType:VIIRS.MconfigName: VCMCFG nlPeriod:201908
2019-12-05 17:18:55: Checking tiles required for VIIRS.M 201908
2019-12-05 17:18:55: IND: Stats missing. Adding tiles
2019-12-05 17:18:55: numTiles: 1, Required tiles: 75N060E
2019-12-05 17:18:55: Downloading tile: 2019083
2019-12-05 17:18:55: Tile not available on the NOAA page. Please manually check for the 201908 tile at 'https://ngdc.noaa.gov/eog/viirs/download_dnb_composites.html'. If it exists please report this as a bug
2019-12-05 17:18:55: Something went wrong with the tile downloads. Aborting ...
Error in getCtryNlData(ctryCode = ctry, admLevel = "highest", nlType = "VIIRS.M", : 2019-12-05 17:18:57: An error occurred

I went to the website provided to check for the tile and noticed that the most recent tile available was 201904, but I had managed to download up to 201907 before this error occurred. This website has been updated with the most recent tiles, up to 201909: https://eogdata.mines.edu/download_dnb_composites.html. Is it possible to extract them from here?

pk1196 commented 4 years ago

Another issue I am encountering is the formatting of the period as date. I get this error:

Warning message: All formats failed to parse. No formats found.

I just keyed in highestAdmLevelStats$nlperiod to check if this was what was causing the issue, and I got a bunch of NA values in return. The ymd function works fine otherwise when I replace highestAdmLevelStats$nlperiod with 201904 for example. Would really appreciate your help with this, thank you!

chrisvwn commented 4 years ago

Is it possible to extract them from here?

The url changed after v0.2.4 went out, so it is only updated in the github version. Yes, you can update the link by running:

pkgOptions(ntLtsIndexUrlVIIRS.M="https://eogdata.mines.edu/download_dnb_composites_iframe.html")

chrisvwn commented 4 years ago

Another issue I am encountering is the formatting of the period as date

Where are you receiving this error from? The nlPeriods do not qualify as dates so will not parse directly. Some manipulation is required. Essentially, for the monthly nlPeriods you can append a day, e.g. 01, to the end of the period and then use ymd as you are doing.

Looking at the example given on github this line does the formatting and conversion to date:

highestAdmLevelStats$nlPeriod <- 
  ymd(paste0(substr(highestAdmLevelStats$nlPeriod, 1,4), 
                      "-",substr(highestAdmLevelStats$nlPeriod, 5,6), "-01"))

Running lubridate::ymd directly on an nlPeriod may produce inaccurate dates e.g. lubridate::ymd("201204") gives "2020-12-04" instead of "2012-04-01".
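
The same conversion can be sketched in base R, without lubridate. `nlPeriodToDate` is a hypothetical helper name, not a package function, and it assumes monthly periods in yyyymm form.

```r
# Hypothetical helper: turn monthly nlPeriods ("yyyymm") into Dates on the
# first of the month. Anything that does not fit the pattern becomes NA.
nlPeriodToDate <- function(nlPeriod) {
  nlPeriod <- as.character(nlPeriod)
  iso <- paste0(substr(nlPeriod, 1, 4), "-", substr(nlPeriod, 5, 6), "-01")
  # An explicit format makes as.Date() return NA instead of raising an error
  as.Date(iso, format = "%Y-%m-%d")
}

nlPeriodToDate(c("201401", "201412"))
# [1] "2014-01-01" "2014-12-01"
```

Passing an explicit format is a deliberate choice: values that are not valid periods come back as NA, which makes them easy to spot.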

pk1196 commented 4 years ago

Yes, however this is the line I am receiving that error from.

#format period as date
highestAdmLevelStats$nlPeriod <- ymd(paste0(substr(highestAdmLevelStats$nlPeriod, 1,4), "-",substr(highestAdmLevelStats$nlPeriod, 5,6), "-01"))

chrisvwn commented 4 years ago

Oh I know why! So sorry. Since the names of the columns in the result have changed, the example is outdated. I need to update the documentation.

You're getting NAs because this line is no longer pulling the nlPeriod out of the column name:

#extract date from the NL col names
highestAdmLevelStats$nlPeriod <- substr(highestAdmLevelStats$nlPeriod, 12, 17)

Try this instead:

#extract date from the NL col names
highestAdmLevelStats$nlPeriod <- stringr::str_extract(highestAdmLevelStats$nlPeriod, "\\d+")
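
To illustrate why a fixed-position substr breaks when the column-name layout changes while a digit pattern keeps working, here is a base-R sketch. Both column names are invented for illustration and the real names may differ; `extractNlPeriod` is a hypothetical helper, and matching exactly six digits assumes monthly yyyymm periods.

```r
# Invented column names, for illustration only
oldStyle <- "NL_VIIRS.M_201401_SUM"
newStyle <- "NL_VIIRS.M_VCMCFG_201401_SUM"

substr(oldStyle, 12, 17)  # "201401"
substr(newStyle, 12, 17)  # "VCMCFG", which later fails to parse as a date

# Hypothetical helper: return the first run of six digits, or NA if none
extractNlPeriod <- function(x) {
  m <- regexpr("[0-9]{6}", x)
  out <- rep(NA_character_, length(x))
  hit <- !is.na(m) & m > 0
  out[hit] <- regmatches(x, m)  # regmatches() returns matches only
  out
}

extractNlPeriod(c(oldStyle, newStyle))
# [1] "201401" "201401"
```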

pk1196 commented 4 years ago

Thank you very much, that solved the date issue. However, pkgOptions(ntLtsIndexUrlVIIRS.M="https://eogdata.mines.edu/download_dnb_composites_iframe.html") doesn't seem to work for the remaining downloads. I still get the same error

library(Rnightlights)
library(lubridate)
library(reshape2)

pkgOptions(ntLtsIndexUrlVIIRS.M="https://eogdata.mines.edu/download_dnb_composites_iframe.html")

#(Optional performance enhancement if you have aria2c and gdal installed)
pkgOptions(downloadMethod = "aria", cropMaskMethod = "gdal", extractMethod = "gdal", deleteTiles = TRUE)

#Optional performance enhancement. If extractMethod="rast" you can specify the number of
#CPU cores to use in parallel
pkgOptions(extractMethod = "rast", numCores=4)

ctry <- "IND" #replace to run for any other country

#download and process monthly VIIRS stats at the highest admin level
highestAdmLevelStats <- getCtryNlData(ctryCode = ctry,
                                      admLevel = "highest",
                                      nlType = "VIIRS.M",
                                      nlPeriods = nlRange("201204", "201909","VIIRS.M"),
                                      nlStats = list("sum",na.rm=TRUE),
                                      ignoreMissing=FALSE)

2019-12-06 11:49:58: PROCESSING nlType:VIIRS.MconfigName: VCMCFG nlPeriod:201908
2019-12-06 11:49:58: Checking tiles required for VIIRS.M 201908
2019-12-06 11:49:58: IND: Stats missing. Adding tiles
2019-12-06 11:49:58: numTiles: 1, Required tiles: 75N060E
2019-12-06 11:49:58: Downloading tile: 2019083
2019-12-06 11:49:58: Tile not available on the NOAA page. Please manually check for the 201908 tile at 'https://ngdc.noaa.gov/eog/viirs/download_dnb_composites.html'. If it exists please report this as a bug
2019-12-06 11:49:58: Something went wrong with the tile downloads. Aborting ...
Error in getCtryNlData(ctryCode = ctry, admLevel = "highest", nlType = "VIIRS.M", : 2019-12-06 11:50:00: An error occurred

Also, just wanted to confirm - the csv file named NL_DATA_IND_ADM1_GADM-3.6-SHPZIP in the data folder contains the radiance sums for each state? Would it be right to say that adding all of them would give the radiance sum for the entire country in that month?

chrisvwn commented 4 years ago

pkgOptions(ntLtsIndexUrlVIIRS.M="https://eogdata.mines.edu/download_dnb_composites_iframe.html") doesn't seem to work for the remaining downloads. I still get the same error

This is a different issue. The tile for 201908 has not yet been made available on the site. The error message needs to be updated with the new url to check.

chrisvwn commented 4 years ago

Also, just wanted to confirm - the csv file named NL_DATA_IND_ADM1_GADM-3.6-SHPZIP in the data folder contains the radiance sums for each state? Would it be right to say that adding all of them would give the radiance sum for the entire country in that month?

Yes, the file contains the radiance sums for each member of the given admLevel.

Theoretically, it is right to assume that the sum of all the radiance sums at a lower level should equal the radiance sum at the next higher level. However, practically there may be a (very) small difference in sums. This may be attributed to slight variations in the polygon boundaries of the different admin levels.
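
One way to check how close the two totals are is to look at the relative difference rather than testing for exact equality. The numbers below are made up purely for illustration and are not real radiance values.

```r
# Made-up radiance sums for three states and for the whole country
stateSums  <- c(stateA = 1200.50, stateB = 830.25, stateC = 415.00)
countrySum <- 2446.00

# Relative difference between the summed states and the country total
relDiff <- abs(sum(stateSums) - countrySum) / countrySum

# Treat the totals as matching if they differ by less than 0.1%
relDiff < 0.001
# [1] TRUE
```

The 0.1% threshold is an arbitrary choice for this sketch; an acceptable tolerance depends on the analysis.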

pk1196 commented 4 years ago

Oh I think that's a different website, 201908 and 201909 are available here https://eogdata.mines.edu/download_dnb_composites.html

chrisvwn commented 4 years ago

I see. You are right. It seems the page structure has changed.

Try setting it to this url:

pkgOptions(ntLtsIndexUrlVIIRS.M="https://eogdata.mines.edu/pages/download_dnb_composites_iframe.html")

pk1196 commented 4 years ago

That does not seem to work unfortunately, I still get the same error message.

chrisvwn commented 4 years ago

Could you post your output? The download seems to work for me here:

> pkgOptions(ntLtsIndexUrlVIIRS.M="https://eogdata.mines.edu/pages/download_dnb_composites_iframe.html")
> ctry <- "IND" #replace to run for any other country
> 
> #download and process monthly VIIRS stats at the highest admin level
> highestAdmLevelStats <- getCtryNlData(ctryCode = ctry,
+                                       admLevel = "highest",
+                                       nlType = "VIIRS.M",
+                                       nlPeriods = "201908",
+                                       nlStats = list("sum",na.rm=TRUE),
+                                       ignoreMissing=FALSE)
Processing missing data: IND:VIIRS.M:VCMCFG:201908:sum. This may take a while. 
Note: Set 'ignoreMissing=TRUE' to return only data found or 
'ignoreMissing=NULL' to return NULL if not all the data is found
2019-12-06 10:56:44: **** START PROCESSING: ctryCodes=IND, admLevels=list(IND = "gadm36_IND_1"), nlTypes=VIIRS.M, configNames=VCMCFG, multiTileStrategy=all, multiTileMergeFun=mean, removeGasFlares=TRUE, nlPeriods=201908, nlStats=list(list("sum", na.rm = TRUE)), custPolyPath=NULL, gadmVersion=3.6, gadmPolyType=shpZip, downloadMethod=auto, cropMaskMethod=rast, extractMethod=rast****
2019-12-06 10:56:44: Downloading country polygons ...
2019-12-06 10:56:44: Downloading polygon: IND
2019-12-06 10:56:44: Downloading ctry poly: IND
2019-12-06 10:56:44: Downloading ctry shpZip: IND
2019-12-06 10:56:44: Polygon dir for IND:3.6 already exists
2019-12-06 10:56:44: Downloading country polygons ... DONE
2019-12-06 10:56:44: **** PROCESSING nlType:VIIRS.MconfigName: VCMCFG nlPeriod:201908****
2019-12-06 10:56:44: Checking tiles required for VIIRS.M 201908
2019-12-06 10:56:44: IND: Stats missing. Adding tiles
2019-12-06 10:56:44: numTiles: 1, Required tiles: 75N060E
2019-12-06 10:56:44: Downloading tile: 2019083
trying URL 'https://eogdata.mines.edu/pages/download_dnb_composites_iframe.html'
Content type 'text/html; charset=UTF-8' length 239739 bytes (234 KB)
==================================================
downloaded 234 KB

trying URL 'https://eogdata.mines.edu/wwwdata/viirs_products/dnb_composites/v10//201908/vcmcfg/SVDNB_npp_20190801-20190831_75N060E_vcmcfg_v10_c201909051300.tgz'
Content type 'application/x-gzip' length 371652391 bytes (354.4 MB)

pk1196 commented 4 years ago

Sure, this is what I get:

pkgOptions(ntLtsIndexUrlVIIRS.M="https://eogdata.mines.edu/pages/download_dnb_composites_iframe.html")

#(Optional performance enhancement if you have aria2c and gdal installed)
pkgOptions(downloadMethod = "aria", cropMaskMethod = "gdal", extractMethod = "gdal", deleteTiles = TRUE)

#Optional performance enhancement. If extractMethod="rast" you can specify the number of
#CPU cores to use in parallel
pkgOptions(extractMethod = "rast", numCores=4)

ctry <- "IND" #replace to run for any other country

#download and process monthly VIIRS stats at the highest admin level
highestAdmLevelStats <- getCtryNlData(ctryCode = ctry,
                                      admLevel = "highest",
                                      nlType = "VIIRS.M",
                                      nlPeriods = nlRange("201908", "201909","VIIRS.M"),
                                      nlStats = list("sum",na.rm=TRUE),
                                      ignoreMissing=FALSE)
Processing missing data: IND:VIIRS.M:VCMCFG:201908:sum, VIIRS.M:VCMCFG:201909:sum. This may take a while.
Note: Set 'ignoreMissing=TRUE' to return only data found or
'ignoreMissing=NULL' to return NULL if not all the data is found
2019-12-06 15:52:38: START PROCESSING: ctryCodes=IND, admLevels=list(IND = "gadm36_IND_1"), nlTypes=VIIRS.M, configNames=VCMCFG, multiTileStrategy=all, multiTileMergeFun=mean, removeGasFlares=TRUE, nlPeriods=c("201908", "201909"), nlStats=list(list("sum", na.rm = TRUE)), custPolyPath=NULL, gadmVersion=3.6, gadmPolyType=shpZip, downloadMethod=auto, cropMaskMethod=rast, extractMethod=rast
2019-12-06 15:52:38: Downloading country polygons ...
2019-12-06 15:52:38: Downloading polygon: IND
2019-12-06 15:52:38: Downloading ctry poly: IND
2019-12-06 15:52:38: Downloading ctry shpZip: IND
2019-12-06 15:52:38: Polygon dir for IND:3.6 already exists
2019-12-06 15:52:38: Downloading country polygons ... DONE
2019-12-06 15:52:38: PROCESSING nlType:VIIRS.MconfigName: VCMCFG nlPeriod:201908
2019-12-06 15:52:38: Checking tiles required for VIIRS.M 201908
2019-12-06 15:52:38: IND: Stats missing. Adding tiles
2019-12-06 15:52:38: numTiles: 1, Required tiles: 75N060E
2019-12-06 15:52:38: Downloading tile: 2019083
2019-12-06 15:52:38: Tile not available on the NOAA page. Please manually check for the 201908 tile at 'https://ngdc.noaa.gov/eog/viirs/download_dnb_composites.html'. If it exists please report this as a bug
2019-12-06 15:52:38: Something went wrong with the tile downloads. Aborting ...
Error in getCtryNlData(ctryCode = ctry, admLevel = "highest", nlType = "VIIRS.M", : 2019-12-06 15:52:38: An error occurred

chrisvwn commented 4 years ago

Ah okay. The problem is that the index page is cached. Please run Rnightlights:::nlCleanup() to clear the temp cache.
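
To show why changing the URL option alone had no effect, here is a toy illustration of a file cache. This is not the package's actual implementation: `fetchIndex` is a hypothetical stand-in that writes the URL to a file instead of downloading anything, and the URLs are placeholders.

```r
# Toy stand-in for an index download: it only records which URL was "fetched"
# and reuses the cached file whenever one already exists.
fetchIndex <- function(url, cacheFile) {
  if (!file.exists(cacheFile)) {
    writeLines(url, cacheFile)  # pretend download
  }
  readLines(cacheFile)
}

cacheFile <- tempfile(fileext = ".html")
oldUrl <- "https://example.org/old_index.html"  # placeholder URLs
newUrl <- "https://example.org/new_index.html"

first <- fetchIndex(oldUrl, cacheFile)  # creates the cache
stale <- fetchIndex(newUrl, cacheFile)  # still returns the old content
unlink(cacheFile)                       # clear the cache
fresh <- fetchIndex(newUrl, cacheFile)  # now reflects the new URL
```

In this sketch, `unlink(cacheFile)` plays the role that Rnightlights:::nlCleanup() plays for the real temp cache.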

pk1196 commented 4 years ago

Ah that solves it, thanks very much for your help!

chrisvwn commented 4 years ago

You're welcome!