femiguez / apsimx

R package for APSIM-X
https://femiguez.github.io/apsimx-docs/

get_isric_soil_profile() can't open connection to rest.isric.org #142

Closed LenLon closed 6 months ago

LenLon commented 7 months ago

Grid point 51 , lon 12.75 lat 7.75 : cannot open the connection to 'https://rest.isric.org/soilgrids/v2.0/properties/query?lon=12.75&lat=7.75&property=bdod&property=soc&property=phh2o&property=clay&property=sand&property=nitrogen&property=cec&depth=0-5cm&depth=0-30cm&depth=5-15cm&depth=15-30cm&depth=30-60cm&depth=60-100cm&depth=100-200cm&value=mean'

This is my (custom) error message when looping over a set of grid points with get_isric_soil_profile(); I have been getting it consistently since this week. It might just be a server issue on ISRIC's side, but can anyone reproduce it, or is the problem on my end?
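To check a single grid point outside the loop, the query URL from the error message can be rebuilt by hand and opened in a browser. The following is only a sketch in base R: `build_isric_url()` is a hypothetical helper, not part of apsimx, and the property and depth lists are simply copied from the URL in the message above.

```r
## Sketch: rebuild the SoilGrids query URL for one point (base R only).
## build_isric_url() is a hypothetical helper, not an apsimx function.
build_isric_url <- function(lon, lat,
                            properties = c("bdod", "soc", "phh2o", "clay",
                                           "sand", "nitrogen", "cec"),
                            depths = c("0-5cm", "0-30cm", "5-15cm", "15-30cm",
                                       "30-60cm", "60-100cm", "100-200cm"),
                            value = "mean") {
  base <- "https://rest.isric.org/soilgrids/v2.0/properties/query"
  paste0(base,
         "?lon=", lon, "&lat=", lat,
         paste0("&property=", properties, collapse = ""),
         paste0("&depth=", depths, collapse = ""),
         "&value=", value)
}

## Grid point 51 from the error message
u <- build_isric_url(lon = 12.75, lat = 7.75)
print(u)
```

If the printed URL opens in a browser but the R call still fails, the problem is more likely on the connection or server side than in the query itself.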

femiguez commented 7 months ago

@LenLon This has happened before, and it was a SoilGrids / ISRIC server issue. It is NOT working for me right now; I get the same error message.

femiguez commented 6 months ago

@LenLon I just tested this and it is working again. I'll close it

LenLon commented 6 months ago

It has been working for some grid points, but for most of them I still get the same error. As this is not the fault of the R package though, we can close this issue ;<

LenLon commented 6 months ago

Interestingly, I can open many of my non-working grid point links in the browser just fine. So it might also be an issue with the query, because I am STILL getting the "cannot open connection" error for many grid points, as do others (test the link in my first post; it works for me in the browser).

Some of my "broken" links though lead to this:

[Screenshot: pastedImage]

Seems all a bit muddy... but I really can't continue my work without this issue resolved :-(

I already am in contact with the ISRIC server webmaster though, so I'll keep this updated.

LenLon commented 6 months ago

This is what I got as a reply:

`Dear Lennart,

Please have a look to this link (https://www.isric.org/explore/soilgrids/faq-soilgrids#How_can_I_access_SoilGrids) with information on how to access SoilGrids with different alternative options.

For the moment, in order to maintain optimal performance of our API, our Fair Use Policy is defined as 5 API calls per 1 minute period.

Hopefully this information will be useful and help with your issues.

Best regards,

Islambek Urazov`

Does the get_isric_soil_profile() function count as an API call? Using it with a loop will easily overload that limit, but I get more than 5 soil profiles per minute for sure anyways.

femiguez commented 6 months ago

@LenLon I understand and we need to respect the limit imposed by the API. A workaround is to get all the soils in one step (respecting the 5 queries per minute) and perform the simulations in a second step. You can store the soils in a list and then add them using 'edit_*'. You should not need to be downloading soil profiles constantly.
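One way to stay within 5 calls per minute is to space the requests at least 12 seconds apart. The sketch below assumes that each call to the download function counts as one API call, which the thread does not confirm. `fetch_throttled()`, `fetch_fun` and `min_interval` are hypothetical names, and the demonstration uses a stand-in function so it runs offline.

```r
## Sketch: call a download function once per site, pausing between calls.
## fetch_throttled() and its arguments are hypothetical names.
fetch_throttled <- function(coords, fetch_fun, min_interval = 12) {
  out <- vector("list", length = nrow(coords))
  for (i in seq_len(nrow(coords))) {
    started <- Sys.time()
    res <- try(fetch_fun(unlist(coords[i, ], use.names = FALSE)), silent = TRUE)
    ## failed sites stay NULL so they can be identified later
    if (!inherits(res, "try-error")) out[i] <- list(res)
    ## wait out the rest of the interval (12 s is 5 calls per minute)
    elapsed <- as.numeric(difftime(Sys.time(), started, units = "secs"))
    if (i < nrow(coords) && elapsed < min_interval) {
      Sys.sleep(min_interval - elapsed)
    }
  }
  out
}

## Offline demonstration with a stand-in for the real download
coords <- data.frame(x = c(12.75, 13.25), y = c(7.75, 8.25))
fake_fetch <- function(lonlat) list(lon = lonlat[1], lat = lonlat[2])
soils <- fetch_throttled(coords, fake_fetch, min_interval = 0.1)
```

For real use, `fetch_fun` could be something like `function(ll) apsimx::get_isric_soil_profile(lonlat = ll)`; at 12 seconds per site, 158 sites would take a little over half an hour.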

LenLon commented 6 months ago

I do try to get all the soils in one step, but I am simulating 158 sites so I need 158 soils - I just wrote a while loop that tries and tries again until almost all soils are filled. Maybe I should download the SoilGrid data myself, but I have no idea how to turn those into a working APSIM soil profile.

femiguez commented 6 months ago

Here is an idea for a pseudo-code that will get all the soil profiles

soil.vec <- vector("list", length = 158)

soil.profile.coords <- data.frame(x = ..., y = ...) ### x = longitude, y = latitude

for(i in 1:158){

  tmp <- try(get_isric_soil_profile(lonlat = unlist(soil.profile.coords[i,]), ...), silent = TRUE)

  ### If the previous fails, leave soil.vec[[i]] empty (NULL) and get it in a second try
  ### Otherwise, store it in a list
  if(!inherits(tmp, "try-error")) soil.vec[[i]] <- tmp
}

After this, in a separate script, you can set up your simulations and replace the soil profiles where appropriate

LenLon commented 6 months ago

That's precisely what I am doing 📦

`

soil_list <- vector(mode = "list", length = length(country_grid$centroids)) # initialise list for collecting soil profiles

for (pnt in 1:length(country_grid$centroids)) { # loop over all centroids

x <- round(country_grid$centroids[pnt][[1]][1], 2) # save x / lon coordinate of centroid, round to two decimals

y <- round(country_grid$centroids[pnt][[1]][2], 2) # save y / lat coordinate of centroid, round to two decimals

tryCatch( # error handling for no data available at centroid coordinates

{

  # !!! retrieve ISRIC soil profile at centroid coordinates !!!

  soil <- apsimx::get_isric_soil_profile(lonlat = c(x, y), find.location.name = TRUE)

  if(length(crop) != 0){  

    # Add in crop soilwat variables (same values as maize by default)

    soil$crops <- c("Maize", "Soybean", "Wheat", crop)

    for(i in crop){

      assign(paste0(i, ".KL"), soil$soil$Maize.KL)
      assign(paste0(i, ".LL"), soil$soil$Maize.LL)
      assign(paste0(i, ".XF"), soil$soil$Maize.XF)

      soil$soil <- c(soil$soil, 
                     setNames(list(get(paste0(i, ".KL"))), paste0(i, ".KL")),
                     setNames(list(get(paste0(i, ".LL"))), paste0(i, ".LL")),
                     setNames(list(get(paste0(i, ".XF"))), paste0(i, ".XF")))

    }

  }else{

    soil$crops <- c("Maize", "Soybean", "Wheat") # set soilwat crops to default

    }

  soil_list[[pnt]] <- soil  

},

error = function(e){

  cat("\nGrid point", country_grid$GridID[pnt], ", lon", x, "lat", y, ":", conditionMessage(e), "\n")

  }   

)

if(!is.null(soil_list[[pnt]])) cat("Grid point ", country_grid$GridID[pnt], "ok - ") # something to look at while you wait

}

`

This includes some bonus code for adding crops other than the three base ones, though with the same values as maize.
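The crop-copying step can probably be written without `assign()` and `get()`. The sketch below assumes the soil table is a data frame with `Maize.KL`, `Maize.LL` and `Maize.XF` columns, as the code above suggests. `add_crops_like_maize()` is a hypothetical helper and `soil_tbl` is mock data, not a real ISRIC profile.

```r
## Sketch: copy the Maize parameters to extra crops in a soil table.
## 'soil_tbl' is a mock stand-in for the soil data frame (soil$soil).
add_crops_like_maize <- function(soil_tbl, crops) {
  for (cr in crops) {
    for (par in c("KL", "LL", "XF")) {
      soil_tbl[[paste0(cr, ".", par)]] <- soil_tbl[[paste0("Maize.", par)]]
    }
  }
  soil_tbl
}

soil_tbl <- data.frame(Depth = c("0-5", "5-15"),
                       Maize.KL = c(0.06, 0.06),
                       Maize.LL = c(0.15, 0.17),
                       Maize.XF = c(1, 1))
soil_tbl <- add_crops_like_maize(soil_tbl, c("Sorghum", "Millet"))
names(soil_tbl)
```

Assigning columns by name keeps the table a data frame, whereas `c(soil$soil, ...)` turns it into a plain list.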

femiguez commented 6 months ago

The current version of 'get_isric_soil_profile' on GitHub has an option to 'fix' the soil profile (set it to TRUE; this addresses numerical issues) and also lets you supply the crop names through the 'xargs' argument.

LenLon commented 6 months ago

Unfortunately the one we use on the cluster is a bit older, and updating packages there from github is quite a hassle, so we'll hold off on trying to update that until we are forced to! :D

femiguez commented 6 months ago

I'm assuming you just want to run the APSIM simulations on the cluster. You could get the soils on a different computer and just move the data. Also, you would not need to update the apsimx package from github, just copy the function to the cluster and source it.

LenLon commented 6 months ago

That tip for the functions is really nice!

But yeah, that's how I am working at the moment: creating the soil profiles locally and just uploading the list as an .Rdata file.
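A minimal sketch of that hand-off, using `saveRDS()` and `readRDS()` from base R as an alternative to an .Rdata file. The file name and the contents of `soil_list` are made up for illustration.

```r
## Sketch: save the soil list locally, then read it back on the cluster.
## The list contents and the file name are placeholders.
soil_list <- list(site1 = list(lon = 12.75, lat = 7.75),
                  site2 = NULL)  # NULL marks a site that failed to download

rds_file <- file.path(tempdir(), "soil_list.rds")
saveRDS(soil_list, rds_file)

## ... copy the file to the cluster, then in the simulation script:
soil_list_cluster <- readRDS(rds_file)

## sites that still need a soil profile
missing_sites <- names(soil_list_cluster)[vapply(soil_list_cluster, is.null, logical(1))]
```

Unlike `load()`, `readRDS()` returns the object instead of restoring it under its original name, so the simulation script chooses the variable name.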

BrianCollinss commented 5 months ago

> I understand and we need to respect the limit imposed by the API. A workaround is to get all the soils in one step (respecting the 5 queries per minute) and perform the simulations in a second step. You can store the soils in a list and then add them using 'edit_*'. You should not need to be downloading soil profiles constantly.

Hey @femiguez. Do you recommend we run a script to download the soil, but keep trying until all soils are downloaded? For example, I can run a script that runs indefinitely (like what you proposed a couple of comments ago) until all soils are downloaded. Is this what you suggest? Ta
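Rather than a script that runs indefinitely, one option is a bounded number of passes that only revisit the sites still missing. This is a sketch under assumptions, not a recommendation from the package author: `fill_missing()`, `max_passes` and `pause` are hypothetical names, and the flaky stand-in function only simulates intermittent failures.

```r
## Sketch: retry only the missing sites, for a limited number of passes.
## fill_missing() and its arguments are hypothetical names.
fill_missing <- function(soils, coords, fetch_fun, max_passes = 5, pause = 12) {
  for (pass in seq_len(max_passes)) {
    todo <- which(vapply(soils, is.null, logical(1)))
    if (length(todo) == 0) break  # everything downloaded
    for (i in todo) {
      res <- try(fetch_fun(unlist(coords[i, ], use.names = FALSE)), silent = TRUE)
      if (!inherits(res, "try-error")) soils[i] <- list(res)
      Sys.sleep(pause)  # spacing between calls, in seconds
    }
  }
  soils
}

## Offline demonstration: a stand-in that fails on every odd-numbered call
attempts <- 0
flaky_fetch <- function(lonlat) {
  attempts <<- attempts + 1
  if (attempts %% 2 == 1) stop("HTTP status was '504 Gateway Timeout'")
  list(lon = lonlat[1], lat = lonlat[2])
}

coords <- data.frame(x = c(12.75, 13.25, 13.75), y = c(7.75, 8.25, 8.75))
soils <- vector("list", nrow(coords))
soils <- fill_missing(soils, coords, flaky_fetch, max_passes = 5, pause = 0)
still_missing <- which(vapply(soils, is.null, logical(1)))
```

Capping the passes means a point with no data, or a long server outage, ends with a list of missing sites instead of an endless loop.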

BrianCollinss commented 5 months ago

BTW, I keep getting the error "HTTP status was '504 Gateway Timeout'", and I am certain it is not related to that 5-calls-per-minute limit.