andrew-plowright / ForestTools

Detect and segment individual trees from remotely sensed data

Retain Treetop ID #13

Closed. azh2 closed this issue 3 years ago

azh2 commented 3 years ago

I am running this package on multiple cores with a very large dataset, and the trouble is that it is difficult to merge the chunks back together. I have wrapped the mcws function in a catalog_apply call from the lidR package, but now I am having a hard time merging the delineated crowns back together, since the same ID occurs multiple times. It would be great if the crown ID corresponded to the treetop ID; that would make this task much easier to apply in a parallelized environment. I am trying to work around it by using large overlapping areas that I can merge back together later, but this is not ideal.
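For example, if the IDs lined up, merging the chunk outputs could be as simple as something like this (a rough, untested sketch; I'm assuming the per-chunk crowns are saved as shapefiles in a "crown_chunks" folder and carry their ID in a column called treeID, both of which are just placeholders for however the outputs end up being stored):

library(sf)

# Read every per-chunk crown file and bind them into one layer
chunk_files <- list.files("crown_chunks", pattern = "\\.shp$", full.names = TRUE)
crowns <- do.call(rbind, lapply(chunk_files, sf::st_read, quiet = TRUE))

# With globally unique treetop IDs, duplicates from overlapping chunks
# could simply be dropped by ID
crowns <- crowns[!duplicated(crowns$treeID), ]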

andrew-plowright commented 3 years ago

Hi @azh2, I typically use these tools in a similar parallel way. The polygons actually should match the treetop IDs. Can you show me a sample of your code?
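If it helps, here is a quick sanity check you can run on a single chunk (just a sketch: it assumes your treetop IDs live in a treeID column, which is what lidR::find_trees produces, that ttops and chm are an existing treetop layer and CHM, and that mcws carries those IDs through to the crown labels, which is the behaviour I'm describing above):

# Segment crowns, then check that the label under each treetop matches its ID
crowns_ras <- ForestTools::mcws(treetops = ttops, CHM = chm, minHeight = 1, format = "raster")
labels_at_tops <- raster::extract(crowns_ras, ttops)
all(labels_at_tops == ttops$treeID, na.rm = TRUE)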

azh2 commented 3 years ago

Hi Andrew, thank you, you're right - I was mistaken.

Given that, the easiest way I've found to run segmentation across a large area is to first generate the treetops for the entire area and then, within the script, subset them to each raster chunk.

First I mosaic the CHMs:

# List the individual CHM tiles and load them as RasterLayers
raslist <- list.files(paste0(outdir, "/CHMGAUSS"), full.names = TRUE, pattern = "_CHM_GAUSS\\.tif$")
raslist <- lapply(raslist, raster::raster)

# raster::mosaic() expects its first two rasters as 'x' and 'y', plus a summary function
names(raslist)[1:2] <- c('x', 'y')
raslist$fun <- mean
raslist$na.rm <- TRUE
Z <- do.call(raster::mosaic, raslist)

Then I run the tree-top detection algorithm (in this case, the lmf algorithm from the lidR package):

# 'alg' is the detection algorithm defined earlier in the script, e.g. lidR::lmf(ws = 5) (window size shown only as an example)
tt <- lidR::find_trees(Z, alg)

Now I apply the mcws algorithm to each raster chunk individually in a parallelized environment:

raslist <- list.files(paste0(outdir, "/CHMGAUSS"), full.names = TRUE, pattern = "_CHM_GAUSS\\.tif$")

# Set up a cluster and register it for the foreach %dopar% operator
library(foreach)
cl <- parallel::makeCluster(parallelly::availableCores())
doParallel::registerDoParallel(cl)

dir.create(path = paste0(outdir, "/WSCROWNS"))

foreach::foreach(i = seq_along(raslist)) %dopar% {
  ras <- raster::raster(raslist[i])

  # Subset the full treetop layer to this chunk's extent
  ttops <- raster::intersect(x = tt, y = ras)

  # Delineate crowns for this chunk and write them out as a GeoTIFF
  wtrshd <- ForestTools::mcws(treetops = ttops, CHM = ras, minHeight = 1, format = 'raster')
  raster::writeRaster(wtrshd, filename = paste0(outdir, "/WSCROWNS/", gsub("[^0-9]", "", raslist[i]), "_WS_CROWNS"), format = "GTiff", overwrite = TRUE)
}

parallel::stopCluster(cl)

The only downside to this method is that it loads the entire tree-top dataset onto each core, which uses a lot of RAM, so I've been trying to think of an alternative. So far, though, it hasn't caused any issues.
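One idea I've been toying with (just a sketch I haven't run in production; the object names follow the code above, and it assumes the same cluster is registered) is to subset the treetops to each tile's extent on the master process, so that foreach only needs to send each worker its own subset rather than the full dataset:

library(foreach)

tile_files <- list.files(paste0(outdir, "/CHMGAUSS"), full.names = TRUE, pattern = "_CHM_GAUSS\\.tif$")
tiles <- lapply(tile_files, raster::raster)

# Pre-split the treetops by tile extent before entering the parallel loop
tt_by_tile <- lapply(tiles, function(r) raster::crop(tt, raster::extent(r)))

# Iterating over both lists element-wise means each task only ships
# one tile and its matching treetop subset to the worker
foreach::foreach(ras = tiles, ttops = tt_by_tile) %dopar% {
  ForestTools::mcws(treetops = ttops, CHM = ras, minHeight = 1, format = "raster")
}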

andrew-plowright commented 3 years ago

Great, glad you were able to figure that out. You're correct that mcws does indeed load the entire tree-top dataset. Since these are just points, you should be able to load quite a few of them without overloading your RAM.

If you're interested, mcws is mostly a fancy wrapper around the imager::watershed function, which comes from a separate library. You could experiment with that directly and see if you can come up with a more efficient solution.
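If you want a starting point, something along these lines is roughly what that looks like (a rough sketch with several assumptions on my part: chm is a RasterLayer, ttops is a SpatialPointsDataFrame with its IDs in a treeID column, every treetop falls inside the CHM, and I've left out the minimum-height masking that mcws also applies):

library(imager)

# Seed image: zero everywhere except at treetop cells, which get their treetop ID
chm_mat <- raster::as.matrix(chm)
seed_mat <- matrix(0, nrow = nrow(chm_mat), ncol = ncol(chm_mat))
cells <- raster::cellFromXY(chm, sp::coordinates(ttops))
seed_mat[raster::rowColFromCell(chm, cells)] <- ttops$treeID

# Grow each label outward from its treetop, using the CHM as the priority map
crowns_img <- imager::watershed(imager::as.cimg(seed_mat), imager::as.cimg(chm_mat))

You would still need to map the resulting labels back onto a raster with the CHM's extent and resolution to get something comparable to mcws's output.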