pierreroudier / clhs

An R implementation of the conditioned Latin Hypercube Sampling method

NA / 0-values in cost surface cause clhs() to fail #3

Open dylanbeaudette opened 6 years ago

dylanbeaudette commented 6 years ago

Sometimes 0s or NAs (?) in the cost surface cause clhs() to fail with the following error:

Error in if (delta_obj > 0 & runif(1) >= metropolis | runif(1) >= metropolis_cost) { : 
  missing value where TRUE/FALSE needed

It would appear that metropolis_cost is periodically set to NA.
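A minimal sketch of why the failure would be intermittent, assuming metropolis_cost really is NA at that point. All values below are hypothetical stand-ins, not taken from the clhs source; only the shape of the condition is copied from the error message. Because `TRUE | NA` is `TRUE` but `FALSE | NA` is `NA`, the `if()` only fails on iterations where the left-hand side happens to be FALSE:

```r
# Hypothetical stand-ins for values inside the annealing loop (not from clhs)
delta_obj <- 0.5
metropolis <- 0.3
metropolis_cost <- NA_real_   # assumed consequence of an NA operational cost

# u1 and u2 replace the two runif(1) draws so the outcome is reproducible
accept_or_error <- function(u1, u2) {
  cond <- delta_obj > 0 & u1 >= metropolis | u2 >= metropolis_cost
  tryCatch(
    if (cond) "branch taken" else "branch skipped",
    error = function(e) conditionMessage(e)
  )
}

res_ok  <- accept_or_error(u1 = 0.9, u2 = 0.4)  # TRUE | NA is TRUE: no error
res_err <- accept_or_error(u1 = 0.2, u2 = 0.4)  # FALSE | NA is NA: if() fails

print(res_ok)
print(res_err)
```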

Digging deeper into clhs.data.frame, I see the following.

Line 118:

    # (initial) operational cost
    op_cost <- sum(cost[i_sampled, ])

Line 196:

    # op costs
    op_cost <- sum(cost[i_sampled, ])

cost[i_sampled, ] will contain NA if there are NAs in the cost surface. This is fairly common when working with masked raster data or irregular areas that are surrounded by NA.
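A small reproduction of the NA propagation, using a made-up one-column cost table and made-up sampled indices (both are assumptions for illustration, not the real data):

```r
# Toy cost table; rows 2 and 6 stand in for masked (NA) pixels
cost <- data.frame(cost = c(4, NA, 3, 0, 2, NA))

# Hypothetical sampled row indices, one of which hits an NA pixel
i_sampled <- c(1, 2, 5)

# Same expression as in clhs.data.frame: a single NA makes the whole sum NA
op_cost <- sum(cost[i_sampled, ])
print(op_cost)
```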

Setting na.rm=TRUE in the call to sum() seems like a reasonable solution, but it results in clearly sub-optimal sampling locations, or locations outside of the non-NA pixels, like this:

image

The contours were generated from the cost surface.
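One possible explanation, sketched with made-up numbers: with na.rm=TRUE an NA pixel contributes nothing to the sum, so it looks like a zero-cost location. If the real costs are positive, a sample set that includes NA pixels can appear cheaper than one that avoids them. The cost values and index sets below are hypothetical:

```r
# Toy cost table; rows 2 and 6 stand in for masked (NA) pixels
cost <- data.frame(cost = c(4, NA, 3, 0, 2, NA))

with_na    <- c(1, 2, 5)  # hypothetical sample that includes an NA pixel
without_na <- c(1, 3, 5)  # hypothetical sample restricted to valid pixels

# NA is silently dropped, so the NA pixel is effectively "free"
cost_with_na    <- sum(cost[with_na, ], na.rm = TRUE)
cost_without_na <- sum(cost[without_na, ], na.rm = TRUE)

print(c(with_na = cost_with_na, without_na = cost_without_na))
```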

I suspect that any sample with an NA cost should trigger a restart of the sampling process. There must be an efficient way to constrain samples to the non-NA portions of the sampling domain.
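A rough sketch of one way to do that, in base R only: build the candidate pool from rows with a non-NA cost before drawing, so no restart is needed. The data frame `d`, its columns, and the sample size are all hypothetical; this is not how clhs currently works, just an illustration of the idea:

```r
set.seed(42)

# Hypothetical data: two covariates plus a cost column with masked (NA) rows
d <- data.frame(
  x    = runif(10),
  y    = runif(10),
  cost = c(4, NA, 3, 0, 2, NA, 5, 1, NA, 6)
)

# Candidate pool: original row indices whose cost is not NA
valid <- which(!is.na(d$cost))

# Draw only from the valid pool; indices still refer to rows of d.
# (sample() treats a length-1 numeric as 1:n, so guard against that
# if the pool could ever shrink to a single row.)
i_sampled <- sample(valid, size = 3)

# The operational cost can no longer be NA
op_cost <- sum(d$cost[i_sampled])
print(i_sampled)
print(op_cost)
```

The same filter could presumably be applied to the input before calling clhs(), at the price of mapping the returned indices back to the original rows through `valid`.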

Incidentally, .lhs_obj() seems to deal with NA just fine.