educelab / volume-cartographer

Volumetric processing toolkit and C++ libraries for the recovery and restoration of damaged cultural materials
GNU General Public License v3.0

Row duplicates in ordered point sets #24

Closed by csparker247 1 year ago

csparker247 commented 1 year ago

Users are reporting that VC ordered point sets have duplicate rows for a given slice number.

My first thought is that this is caused by point set merging after editing, but it could also be caused by the segmentation algorithms themselves.

csparker247 commented 1 year ago

From kglspl's original Discord messages [1, 2]:

# Scroll1.volpkg/paths/20230509182749/pointset.vcps
Z value 400.0 has 3578 entries    # z=399 missing?
Z value 401.0 has 1789 entries
Z value 402.0 has 1789 entries
Z value 403.0 has 1789 entries
Z value 404.0 has 1789 entries
Z value 405.0 has 1789 entries
Z value 406.0 has 1789 entries
Z value 407.0 has 1789 entries
Z value 408.0 has 1789 entries    # z=409 missing
Z value 410.0 has 3578 entries
Z value 411.0 has 1789 entries
Z value 412.0 has 1789 entries
Z value 413.0 has 1789 entries
Z value 414.0 has 1789 entries
Z value 415.0 has 1789 entries
Z value 416.0 has 1789 entries
Z value 417.0 has 1789 entries
Z value 418.0 has 1789 entries
Z value 419.0 has 1789 entries
Z value 420.0 has 1789 entries
Z value 421.0 has 1789 entries
Z value 422.0 has 1789 entries
Z value 423.0 has 1789 entries
Z value 424.0 has 1789 entries
Z value 425.0 has 1789 entries
Z value 426.0 has 1789 entries
Z value 427.0 has 1789 entries
Z value 428.0 has 1789 entries    # z=429 missing
Z value 430.0 has 3578 entries
Z value 431.0 has 1789 entries
Z value 432.0 has 1789 entries
Z value 433.0 has 1789 entries
Z value 434.0 has 1789 entries
Z value 435.0 has 1789 entries
Z value 436.0 has 1789 entries
Z value 437.0 has 1789 entries
Z value 438.0 has 1789 entries
Z value 439.0 has 1789 entries
Z value 440.0 has 1789 entries
Z value 441.0 has 1789 entries
Z value 442.0 has 1789 entries
Z value 443.0 has 1789 entries
Z value 444.0 has 1789 entries
Z value 445.0 has 1789 entries
Z value 446.0 has 1789 entries
Z value 447.0 has 1789 entries
Z value 448.0 has 1789 entries
Z value 449.0 has 1789 entries
Z value 450.0 has 1789 entries
Z value 451.0 has 1789 entries
Z value 452.0 has 1789 entries
Z value 453.0 has 1789 entries
Z value 454.0 has 1789 entries
Z value 455.0 has 1789 entries
Z value 456.0 has 1789 entries
Z value 457.0 has 1789 entries
Z value 458.0 has 1789 entries    # z=459 missing
Z value 460.0 has 3578 entries
Z value 461.0 has 1789 entries
Z value 462.0 has 1789 entries
Z value 463.0 has 1789 entries
Z value 464.0 has 1789 entries
Z value 465.0 has 1789 entries
Z value 466.0 has 1789 entries
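A tally like the one above can be reproduced with a few lines of Python. This is only a sketch: it assumes the z-value of every point has already been loaded into a flat list (reading the `.vcps` file itself is not shown), and `tally_rows`, `find_anomalies`, and `row_width` are hypothetical names, not part of VC.

```python
from collections import Counter


def tally_rows(z_values):
    """Count how many points carry each z-value and list z-values absent from the range."""
    counts = Counter(z_values)
    lo, hi = int(min(counts)), int(max(counts))
    missing = [z for z in range(lo, hi + 1) if z not in counts]
    return counts, missing


def find_anomalies(z_values, row_width):
    """Return (z-values with more than one row's worth of points, missing z-values).

    row_width is the expected number of points per row (1789 in the listing above).
    """
    counts, missing = tally_rows(z_values)
    doubled = sorted(z for z, n in counts.items() if n > row_width)
    return doubled, missing
```

Called on the point set above with `row_width=1789`, this should flag the z-values that have 3578 entries and the slices that have none.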

From RICHI:

this is a bug from the segmentation algorithms. i already found that one a little while ago. it will be fixed in a V2 algorithm of mine.

csparker247 commented 1 year ago

It currently looks like the issue is that the starting z-index gets used as the z-value for the first newly generated row (which should actually be z+1):

[0] 400     # starting row
[1] 400     # what?
[2] 401
[3] 402
[4] 403
[5] 404
[6] 405
[7] 406
[8] 407
[9] 408     # should be 409
[10] 410    # new starting index
[11] 410    # what?
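To check that this hypothesis is at least consistent with the rows above, here is a toy model of it. It is not the actual segmentation code: `expected_run` and `suspected_run` are hypothetical helpers, and the run length of 9 generated rows is simply read off the listing.

```python
def expected_run(start_z, n_new):
    """Starting row followed by n_new generated rows, one slice apart."""
    return [start_z] + [start_z + 1 + i for i in range(n_new)]


def suspected_run(start_z, n_new):
    """Same run, but the first generated row reuses start_z instead of start_z + 1."""
    return [start_z] + [start_z + i for i in range(n_new)]


# Two consecutive runs: one started at z=400, the next at z=410.
rows = suspected_run(400, 9) + suspected_run(410, 1)
```

With these inputs `rows` contains 400 twice, ends the first run at 408, and then contains 410 twice, which is the pattern in rows [0] through [11]. `expected_run(400, 9)` gives 400 through 409 with no repeats.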

However, this isn't always the case. For example, what is happening here?

[207] 606
[208] 607
[209] 608    # should be 609
[210] 610    # new starting index
[211] 610    # 611
[212] 611    # 612
...
[249] 648    # 649
[250] 649    # 650, and the next row is a duplicate, but the previous run didn't skip a value before it...
[251] 649    # 651
[252] 650    # 652
[253] 651    # 653
[254] 652    # 654
[255] 653    # 655
[256] 654    # 656
[257] 655    # 657
[258] 656    # 658
[259] 657    # 659, now we're skipping two values to "catch up"
[260] 660    # 660 
[261] 660    # 661
[262] 661    # 662

I suspect the UI is using "logical" z-values when splitting the point sets, while the algorithms are using the actual z-values, so the discrepancies compound. But something doesn't quite fit about that explanation, and I can't put my finger on it.
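One way to look at the compounding is to compute, for each row, how far its stored z-value lags behind the value implied by its row index. The sketch below assumes the logical z-value is `z0 + index` with `z0 = 400`, which is what the annotations in the listing imply; `drift` is a hypothetical helper.

```python
def drift(rows, z0):
    """For (index, z) pairs, return (index, z, logical_z - z) with logical_z = z0 + index."""
    return [(i, z, z0 + i - z) for i, z in rows]


# A few (row index, stored z) pairs copied from the listing above.
sample = [(209, 608), (210, 610), (211, 610), (250, 649),
          (251, 649), (259, 657), (260, 660)]
lags = [d for _, _, d in drift(sample, z0=400)]
```

For these rows the lag is 1, 0, 1, 1, 2, 2, 0: it resets at each new starting index, grows by one at every duplicate, and is only cleared by the jump from 657 to 660.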

Anyway, it's also true that the first two rows aren't always wrong. For example, here's an old segmentation from M910:

# 20180712221634
[0] 456    # fine
[1] 457
[2] 458
[3] 459
[4] 460
...
[459] 915
[460] 916
[461] 917
[462] 917    # first error in the point set
[463] 919
[464] 920
[465] 921
[466] 922
...
[485] 941
[486] 942
[487] 943
[488] 943    # whoops
[489] 945
[490] 946

This doesn't look like the start of the segment is wrong. This looks like the end of the segment is wrong: each duplicate (917, 943) is immediately followed by the skipped value (918, 944).
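The two patterns can be told apart by measuring how many rows separate each duplicate from the next skipped value. A minimal sketch, assuming the per-row z-values are available as a list; `rows_until_gap` is a hypothetical helper, and the indices it returns are positions within the list passed in, not the original row indices.

```python
def rows_until_gap(z_values):
    """For each duplicated row, return (position, rows until the next skipped z-value or None)."""
    pairs = list(zip(z_values, z_values[1:]))
    # Position i means z_values[i] repeats (or jumps past) z_values[i - 1].
    dups = [i + 1 for i, (a, b) in enumerate(pairs) if b == a]
    gaps = [i + 1 for i, (a, b) in enumerate(pairs) if b - a > 1]
    result = []
    for d in dups:
        nxt = next((g for g in gaps if g > d), None)
        result.append((d, None if nxt is None else nxt - d))
    return result


m910 = [915, 916, 917, 917, 919, 920]
scroll1 = [400, 400, 401, 402, 403, 404, 405, 406, 407, 408, 410, 410]
```

On the M910 excerpt the gap follows the duplicate after 1 row, while on the Scroll1 excerpt it only appears 9 rows later, at the end of the run.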