On more careful evaluation with some YACCLAB datasets (medical, fingerprint, tobacco800), the block-based approach is not substantially better and in some cases worse. I'm not sure what I'm doing wrong. This isn't as sophisticated as the optimized approach, but I would have expected to see some improvement.
Things got more interesting again. If you try the worst case for SAUF vs. BBDT (a field of ones), you see a substantial improvement.

5000x5000 field of ones:
- bbdt_4_4: 282.75 MVx/sec
- bbdt_2_2: 224.36 MVx/sec
- sauf: 222.78 MVx/sec
Slightly more formal testing. Fingerprint and Medical are datasets from YACCLAB. Unity is a 5k x 5k field of ones.
Algorithm | Dataset | N | MVx/sec | Factor |
---|---|---|---|---|
SAUF | Fingerprint | 100 | 228 | 1 |
SAUF | Medical | 15 | 233 | 1 |
SAUF | Unity (5k x 5k) | 200 | 217 | 1 |
SAUF | Random | 5 | 97 | 1 |
BBDT 2x1 | Fingerprint | 100 | 242 | 1.06 |
BBDT 2x1 | Medical | 15 | 236 | 1.01 |
BBDT 2x1 | Unity (5k x 5k) | 200 | 215 | 0.99 |
BBDT 2x1 | Random | 5 | 100 | 1.03 |
BBDT 2x2 | Fingerprint | 100 | 256 | 1.12 |
BBDT 2x2 | Medical | 15 | 250 | 1.07 |
BBDT 2x2 | Unity (5k x 5k) | 200 | 278 | 1.28 |
BBDT 2x2 | Random | 5 | 105 | 1.08 |
Well, I guess that seals it. The BBDT 2x2 is pretty good!
This compares to the previous (awful) strategy of using the 6-connected logic.
Algorithm | Dataset | Sec. | MVx/sec | Factor |
---|---|---|---|---|
6-SAUF | Fingerprints | 21.5 | 232.6 | 1 |
6-SAUF | Medical | 30.0 | 198.9 | 1 |
6-SAUF | Unity | 26.3 | 181.2 | 1 |
6-SAUF | Random Array | 12.3 | 97.2 | 1 |
4-BBDT 2x2 | Fingerprints | 20.6 | 243.0 | 1.04 |
4-BBDT 2x2 | Medical | 24.0 | 248.1 | 1.25 |
4-BBDT 2x2 | Unity | 17.3 | 276.0 | 1.52 |
4-BBDT 2x2 | Random Array | 12.2 | 98.1 | 1.01 |
Raw data:
<function fingerprints at 0x7ff82b4312f0> connectivity=4 N=100
20.610318183899928 243.0 (1.04x)
<function medical at 0x7ff8289aef28> connectivity=4 N=15
24.01355004310708 248.1 (1.25x)
<function unity at 0x7ff82b431268> connectivity=4 N=200
17.2783682346354 276.0 (1.52x)
<function random_array at 0x7ff82b4311e0> connectivity=4 N=5
12.153060674668358 98.1 (1.01x)
<function fingerprints at 0x7ff82b4312f0> connectivity=6 N=100
21.536387205124903 232.6
<function medical at 0x7ff8289aef28> connectivity=6 N=15
29.95720171928506 198.9
<function unity at 0x7ff82b431268> connectivity=6 N=200
26.312411785126734 181.2
<function random_array at 0x7ff82b4311e0> connectivity=6 N=5
12.258985757828759 97.2
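For context, the MVx/sec figures presumably come from dividing the total voxels processed by the elapsed wall-clock seconds. Below is a minimal sketch of such a timing harness; the function names and trial counts are hypothetical, not the actual benchmark script.

```python
# Hypothetical sketch of a throughput harness; not the actual benchmark
# script. MVx/sec = voxels labeled / elapsed seconds / 1e6.
import time
import numpy as np

def benchmark(labeler, img, n_trials):
    start = time.time()
    for _ in range(n_trials):
        labeler(img)
    elapsed = time.time() - start
    mvx_per_sec = img.size * n_trials / elapsed / 1e6
    return elapsed, mvx_per_sec

# e.g. the 5k x 5k "unity" field of ones used above
unity = np.ones((5000, 5000), dtype=np.uint8)
# elapsed, mvx = benchmark(some_ccl_labeler, unity, n_trials=200)
```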
I'm not sure if there's something in the literature for this, since everyone I've read seems to be focused on the 8-connected case, but here we try applying techniques from the 8-connected problem, such as the Grana et al. 2009 paper, to the 4-connected problem.
The 4-connected problem using SAUF looks like this:
B checks A and steals its label if they match, then does unify(B,C) if C also matches. If A is missing, B steals C's label if they match; otherwise B gets a new label. You can flip this to get a slightly different decision tree that uses unify(A,B) instead.
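As a concrete reference, here's a minimal Python sketch of that per-pixel decision tree (not the actual optimized implementation). The neighbor layout is my assumption since the original figure isn't shown here: B is the current pixel, A is its neighbor in the previous row, and C is its neighbor to the left. "Match" means equal nonzero values, which is what makes this multi-label friendly.

```python
# Minimal sketch of the 4-connected SAUF decision tree described above.
# Assumed layout (the original figure isn't reproduced): for current pixel B,
# A is the pixel above and C is the pixel to the left.
import numpy as np

class UnionFind:
    def __init__(self):
        self.parent = {}
    def make(self, x):
        self.parent[x] = x
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def unify(self, x, y):
        self.parent[self.find(x)] = self.find(y)

def sauf_4(img):
    sy, sx = img.shape
    labels = np.zeros((sy, sx), dtype=np.int64)
    uf = UnionFind()
    next_label = 1
    for y in range(sy):
        for x in range(sx):
            B = img[y, x]
            if B == 0:  # background
                continue
            A = img[y - 1, x] if y > 0 else 0  # pixel above
            C = img[y, x - 1] if x > 0 else 0  # pixel to the left
            if A == B:
                labels[y, x] = labels[y - 1, x]  # steal A's label
                if C == B:
                    uf.unify(labels[y, x], labels[y, x - 1])  # the costly case
            elif C == B:
                labels[y, x] = labels[y, x - 1]  # steal C's label
            else:
                labels[y, x] = next_label  # new provisional label
                uf.make(next_label)
                next_label += 1
    # second pass: collapse provisional labels to their union-find roots
    for y in range(sy):
        for x in range(sx):
            if labels[y, x]:
                labels[y, x] = uf.find(labels[y, x])
    return labels
```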
This means a tremendous number of unify(B,C) calls occur, since a solid block of foreground requires a unify on nearly every pixel. Can we use the BBDT technique to reduce the number of unifies? Yes! Even though we can't exploit the intrinsic connectedness of the 2x2 block the way the 8-connected problem can, the 4-connected block still offers something.
Let's start with the simplest version of the expanded problem.
The key here is that, using the original tree starting with B, you have to do unify(B,D) if B matches D. However, because D is intrinsically connected to E, if C matches B we can skip unify(C,E). This lets us skip about half the unifies across the image. We can also apply this logic in the vertical direction.
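Here is a rough sketch of that skip, reusing the UnionFind from the earlier sketch. The letter-to-position assignment is my assumption (the figure isn't reproduced here): B and C are the left and right pixels of the current 2x1 block, with D above B and E above C.

```python
# Sketch of the 2x1 block's merges with the row above (assumed layout):
#
#     D E    <- previous row (already labeled; adjacent equal foreground
#     B C       pixels there were merged when that row was scanned)
#
# Reuses the UnionFind class from the earlier sketch.
def merge_block_with_row_above(img, labels, uf, y, x):
    """Handle the 'above' unifies for the 2x1 block at (y, x) and (y, x+1).

    Assumes labels[y, x], labels[y, x+1], and the previous row are already
    assigned, with B and C sharing a label when they match.
    """
    B, C = img[y, x], img[y, x + 1]
    D = img[y - 1, x] if y > 0 else 0
    E = img[y - 1, x + 1] if y > 0 else 0

    merged_above = False
    if B != 0 and D == B:
        uf.unify(labels[y, x], labels[y - 1, x])
        merged_above = True
    if C != 0 and E == C:
        # Skip condition: if B == C the block is one component and we just
        # connected it to D; E is 4-adjacent to D in an already-scanned row,
        # so E already shares D's component and unify(C, E) is redundant.
        if not (merged_above and C == B):
            uf.unify(labels[y, x + 1], labels[y - 1, x + 1])
```

In a solid region the skip removes one of the block's two row-above unifies, consistent with the "about half" estimate above.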
Here we start at A instead of B. If A matches F, we can skip unify(C,E) when A matches C. Similarly, we can skip unify(B,H) when A matches G, and unify(D,B) when both A and C match D.
I haven't perfected the 2x2 version yet, but I have seen an improvement in the 2x1 version from about 210-215 MVx/sec to about 213-224 MVx/sec (~1-4% better) on a medical image dataset from YACCLAB. Hopefully the perfected 2x2 version will be even better.
Notably, this implementation of BBDT is multi-label competent!