OARS-SAFS / resources

Open and Reproducible Science Resources
17 stars 0 forks source link

Move R Package: Earth Mover's Distance #7

Closed mfisher5 closed 4 years ago

mfisher5 commented 4 years ago

Does anyone have experience working with the earth mover's distance to quantify spatial overlap, or is really into spatial data and would be willing to troubleshoot some weird results?

I'm using the emd function in the move R package and have the R code running, but the distance measure isn't matching the map patterns... (example below, with earth mover's distance as EMD). emd_example_ordered

I'm wondering if it has something to do with how I'm using utilization distributions as the input for emd?

## calculate 90% utilization distributions from spatial points
dat1_ud <- kernelUD(dat1_sp, grid=250, h=5000)
dat1_ud90 <- getverticeshr(dat1_ud, percent=90)

dat2_ud <- kernelUD(dat2_sp, grid=250, h=5000)
dat2_ud90 <- getverticeshr(dat2_ud, percent=90)

## convert utilization distribution into spatial points object
extractCoords <- function(sp.df){
    results <- list()
    for(i in 1:length(sp.df@polygons[[1]]@Polygons)){
    results[[i]] <- sp.df@polygons[[1]]@Polygons[[i]]@coords
    }
    results <- Reduce(rbind, results); return(results)
}

dat1 <- extractCoords(ud90_base); colnames(dat1) <- c("lon","lat")
dat1_sp <- SpatialPoints(dat1,proj4string=CRS("+init=epsg:32610 +proj=utm +zone=10 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0"))

dat2 <- extractCoords(ud90_hab); colnames(dat2) <- c("lon","lat")
dat2_sp <- SpatialPoints(dat2,proj4string=CRS("+init=epsg:32610 +proj=utm +zone=10 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0"))

## calculate earth mover's distance
tmp_overlap <- move::emd(dat2_sp,dat1_sp, gc=T)
github-actions[bot] commented 4 years ago

Thanks so much for posting your first issue in this repo!

mfisher5 commented 4 years ago

For anyone interested, the solution (from @briana-abrahms):

## create spatial points data frame
dat1_spdf <- SpatialPointsDataFrame(coords=group1_coords, data=group1_ids, proj4string = CRS("+init=epsg:4326"))
dat2_spdf <- SpatialPointsDataFrame(coords=group2_coords, data=group2_ids, proj4string = CRS("+init=epsg:4326"))

## generate utilization distributions
dat1_ud <- kernelUD(dat1_sp, grid=250, h=5000)
dat2_ud <- kernelUD(dat2_sp, grid=250, h=5000)

## convert to raster
ras1 <- raster(as(ud1,"SpatialPixelsDataFrame"))
ras2 <- raster(as(ud2,"SpatialPixelsDataFrame"))

## get 90% utilization distribution in raster object
ras1[ras1 >.90] <- 0; newras1 <- ras1/cellStats(ras1,sum)
ras2[ras2 >.90] <- 0; newras2 <- ras2/cellStats(ras2,sum)

## calculate Earth Mover's Distance
emd(newras1, newras2, gc=T)