Nowosad / supercells

The goal of supercells is to utilize the concept of superpixels to a variety of spatial data.
https://jakubnowosad.com/supercells/
GNU General Public License v3.0

Cleaning connectivity step fails according to minarea size #34

Closed by spono 8 months ago

spono commented 8 months ago

Hi, first of all: thanks a lot for this great package (and the others) that I'm discovering only now! I think they'll fill a big gap in R.

Running the following lines:

vol_slic = supercells::supercells(r, step = 20, compactness = 2, dist_fun = "euclidean",
                                  avg_fun = "median", clean = TRUE, minarea = 900, iter = 10,
                                  verbose = 1)

I'm having a bit of trouble with the Cleaning connectivity step, which seems to fail when the minarea parameter is used.

I ran a few tests and ended up with this:

Error: I cannot return supercells. This may be due to a large number of missing values in the 'x' object. Try to either trim your data to the non-NA area (e.g., with 'terra::trim()') or increase the number of expected supercells.

It seems strange because all no-data values were converted to 0 before running the function.

Am I missing something?

Here is the file I'm using for testing: test_supercells.zip

Nowosad commented 8 months ago

Hi @spono!

Thanks for opening the issue. In short, you use minarea = 900, which means that superpixels smaller than 900 cells will be removed (merged with larger ones, to be more precise). However, in your case, you also specify step = 20, which means that the initial centers of superpixels will be 20 cells apart, so a typical superpixel covers roughly 20 x 20 = 400 cells. As a result, the majority of the created superpixels are smaller than 900 cells and thus the code does not work (see the attached code below). Let me know if you have any suggestions for changes here.

library(terra)
#> terra 1.7.69
library(supercells)

r = rast("CHM_1m.tif")
r
#> class       : SpatRaster 
#> dimensions  : 416, 509, 1  (nrow, ncol, nlyr)
#> resolution  : 1, 1  (x, y)
#> extent      : 691296, 691805, 5077224, 5077640  (xmin, xmax, ymin, ymax)
#> coord. ref. : WGS 84 / UTM zone 32N (EPSG:32632) 
#> source      : CHM_1m.tif 
#> name        :      Z 
#> min value   :  0.001 
#> max value   : 41.305
plot(r)

vol_slic = supercells::supercells(r, step = 20, compactness = 2, dist_fun = "euclidean",
                                  avg_fun = "median", clean = TRUE, minarea = 900, iter = 10,
                                  verbose = 1)
#> Step: 20
#> Initialization: Completed
#> Iteration: 1/10Iteration: 2/10Iteration: 3/10Iteration: 4/10Iteration: 5/10Iteration: 6/10Iteration: 7/10Iteration: 8/10Iteration: 9/10Iteration: 10/10
#> Cleaning connectivity: Completed
#> Error: I cannot return supercells. This may be due to a large number of missing values in the 'x' object. Try to either trim your data to the non-NA area (e.g., with 'terra::trim()') or increase the number of expected supercells.
vol_slic2 = supercells::supercells(r, step = 20, compactness = 2, dist_fun = "euclidean",
                                  avg_fun = "median", clean = FALSE, iter = 10,
                                  verbose = 1)

    #> Step: 20
    #> Initialization: Completed
    #> Iteration: 1/10Iteration: 2/10Iteration: 3/10Iteration: 4/10Iteration: 5/10Iteration: 6/10Iteration: 7/10Iteration: 8/10Iteration: 9/10Iteration: 10/10

    r = extract(r, vol_slic2)
    superpixels_sizes = sort(table(r$ID))
    superpixels_sizes
    #> 
    #>   1   2 296   4 402  12  10 470 353 162 356 227 321   5  59   3 346  11   9  67 
    #>  96  97 104 108 113 136 137 140 153 165 166 173 177 181 183 186 195 198 211 211 
    #> 367 101 304 460  41 233 154 370 223  33 211 441 244 489 300 383   6 161 503 328 
    #> 213 219 222 222 223 224 230 231 235 237 237 239 240 244 245 245 247 250 252 253 
    #>  36 111 112 301 237 347 354  14 478 173 204 455 116   7 445 400 198 369 379 128 
    #> 255 255 256 256 258 260 261 262 262 263 263 264 269 272 273 275 277 278 278 279 
    #>  44 219 405 420 365  69 466 446 450 226  92 392 258 429 397 342 453 350  91 451 
    #> 280 284 284 286 288 289 289 291 291 294 296 296 297 297 298 299 301 303 305 305 
    #> 262  62 448 129 216 261  18 305 373 147 136 343 440 278  54  95 362 426 313 272 
    #> 307 308 308 309 309 310 311 311 311 312 313 313 313 314 315 316 316 316 318 319 
    #>  17  45 270 181 327 138 323  22 443  93 485 175 210 351 439 349 436  96 201 477 
    #> 321 321 321 322 322 323 323 324 324 326 326 327 327 328 328 330 330 333 335 335 
    #> 306 387 182 318 143 303 374 386 239 331 377  72 215 447  77 487 248 366  61 235 
    #> 336 336 337 337 338 338 338 338 341 341 341 342 343 343 344 344 345 345 347 347 
    #> 238 260 414 133 196   8 384 412 398 454  23 177  29 106 292 380 207 150 188 407 
    #> 347 347 347 348 348 349 349 349 350 350 351 352 353 353 353 353 354 355 355 355 
    #>  87 316  37 287 411 425 482 423  46 197 271 322 452  53 110 148 246 348 480 236 
    #> 356 357 359 360 360 360 360 361 363 363 364 365 365 366 366 366 366 366 367 368 
    #> 419 220 360 169 195 461 396  71 256 189 488  43 141 168 212 208 444  51 395 249 
    #> 368 369 369 371 371 372 373 374 374 375 375 378 378 378 378 379 379 380 380 381 
    #> 274 372 476 358 190 291 378  49  94 245 462 224 385 491  30 424  15 125 338 413 
    #> 381 381 381 382 383 383 383 384 384 384 384 385 385 385 386 386 387 388 388 388 
    #> 449 403 492  28 134 268 364 471  83 113 457 483 166 332 309 421  42 130 100 252 
    #> 388 390 390 391 391 391 392 392 393 393 393 393 394 394 395 395 396 396 397 397 
    #> 334 151 228 326 382  21 267 458  63  76 142 203  98 280 363 371 145 266 297 155 
    #> 397 398 398 398 398 400 400 400 401 401 401 401 402 402 402 402 403 403 403 404 
    #> 344 107 194  13 137 225 247 312 118 355 417 495 438 102 250 185 265 299 340 473 
    #> 404 405 405 406 406 406 406 406 407 407 407 407 408 409 410 411 411 411 411 411 
    #> 119 167 368 279 284 105 391  24 123 264  50 293 320 415  75 178 180 202 221  57 
    #> 412 412 412 413 413 414 414 415 415 415 416 416 416 417 419 419 419 419 420 421 
    #> 242 263  68 179 393 469  38  90 222 481 259 275 335 406  16  19 254 285 314  81 
    #> 421 421 423 423 423 424 425 425 426 426 427 427 427 427 428 428 428 428 429 431 
    #> 135 253 257 310 375 251 269 103 475 410 114 337 388 205 288 330 325  35 126 214 
    #> 431 431 431 431 431 432 432 433 433 434 436 436 436 437 437 437 438 441 441 441 
    #> 117 192 442 186 394 172 191 206 352  64 170 295 357  48  65 122 232 329 336 474 
    #> 442 442 442 443 443 444 444 444 445 446 447 447 449 450 450 450 451 451 451 451 
    #>  82  84 499 104 115  55  88 231 317 401 501  73  27 158 183 484 505 146 311 361 
    #> 452 452 452 454 455 456 456 456 456 456 456 457 459 459 459 460 460 461 461 461 
    #>  39 381 408 479 193 409  40 290 144 157 213 243 431 496 281 289 124 187 467  60 
    #> 462 462 462 463 464 464 465 465 466 466 466 466 466 466 468 468 470 470 470 471 
    #> 390  89 163 422  70 153 286 399 437  97 217 241 121 493 127 184  25 490 240 298 
    #> 471 472 472 473 475 475 475 475 476 477 478 479 480 481 482 484 486 486 487 487 
    #> 504  66 108  26  78 230 432 502 463 174  34 132  79  85 282 120 200  32  31 389 
    #> 488 489 489 490 490 490 491 492 494 495 496 496 497 497 499 502 502 503 504 504 
    #> 434  52 468 139 152 359 427 465 277  74 140 131 464 494 171 176 273 319 255  56 
    #> 504 505 506 507 508 509 510 511 512 514 518 519 519 520 521 522 522 522 524 526 
    #> 433  58 229 435 472  86 276 341 283 333 209 456 149 109 376 156 416 302 459 315 
    #> 526 527 529 529 529 532 532 532 533 533 536 536 537 539 539 546 550 553 554 557 
    #> 486 308 165 418 345 294 500 234  80  47  99 164 339 404 307 430 159 506 428 160 
    #> 557 559 567 570 571 573 576 577 588 592 592 594 594 596 600 604 609 616 626 627 
    #> 497 199  20 324 498 218 
    #> 656 667 682 744 757 813
    hist(superpixels_sizes)

Created on 2025-03-08 with reprex v2.0.2
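
For a rough sanity check without running the segmentation, the same conclusion follows from simple arithmetic. The sketch below assumes that superpixel centers start on a regular grid, so that the average superpixel covers about step^2 cells; the variable names are only illustrative, and the dimensions are copied from the raster summary above.

```r
# Raster dimensions and parameters copied from the reprex above
n_rows = 416
n_cols = 509
step = 20
minarea = 900

# Approximate size of one superpixel, assuming centers on a regular grid
expected_size = step^2

# Approximate number of superpixels under the same assumption
expected_n = (n_rows * n_cols) / expected_size

expected_size
expected_n

# TRUE means that minarea exceeds the typical superpixel size
minarea > expected_size
```

With these values the typical superpixel (about 400 cells) is well below minarea = 900, which agrees with the size table above, where the largest superpixel has 813 cells.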

spono commented 8 months ago

OK, that definitely makes sense. I don't know if it helps, but a warning like:

if (!is.null(minarea) && minarea > step^2) { warning("minarea is bigger than the average supercell (step^2): algo will fail") }

may avoid further useless questions like mine :)
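
As a minimal sketch, the suggested check could be wrapped in a small standalone function. Note that check_minarea() is a hypothetical helper written for illustration, not part of the supercells package, and it assumes that step is given in cells, so that step^2 approximates the average supercell size.

```r
# Hypothetical helper (not part of supercells): warn when minarea is larger
# than the approximate average supercell size, step^2
check_minarea = function(step, minarea = NULL) {
  if (!is.null(minarea) && minarea > step^2) {
    warning("minarea (", minarea, ") is bigger than the average supercell ",
            "(step^2 = ", step^2, "); most supercells may be removed",
            call. = FALSE)
    return(invisible(FALSE))
  }
  invisible(TRUE)
}

# Values from this issue: emits the warning
check_minarea(step = 20, minarea = 900)

# A smaller minarea, or no minarea at all, passes silently
check_minarea(step = 20, minarea = 300)
check_minarea(step = 20)
```

Using && instead of & keeps the condition from being evaluated on a NULL minarea, which would otherwise produce a zero-length condition error inside if().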

Nowosad commented 8 months ago

Thanks!