What should the detector output?

There are three options for the detector output.

1) Mask regions only, which is effectively the segmentation.

2) Partially processed output consisting of masked regions and bounding boxes, similar to a region-proposal output. The piece data could even be extracted, but not yet stored into puzzle pieces.

3) Fully processed output, which is an actual puzzle board instance.
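To make the gap between options 1 and 2 concrete, here is a minimal sketch of turning a binary mask into separate regions with bounding boxes. It assumes the mask is a plain list of rows and that regions are 4-connected; the function name `regions_from_mask` and the output layout are hypothetical and not taken from the code stubs.

```python
from collections import deque

def regions_from_mask(mask):
    """Split a binary mask (list of rows) into 4-connected regions.

    Returns a list of dicts with the region pixels and an inclusive
    bounding box (row_min, col_min, row_max, col_max).
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited mask pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            regions.append({"pixels": pixels,
                            "bbox": (min(ys), min(xs), max(ys), max(xs))})
    return regions
```

Under option 1 the detector would stop at `mask`; under option 2 it would also run something like the above and hand over the regions. Note that touching pieces come out as a single region, which is the cluster issue raised further down.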

Currently, the design leans towards the detector outputting the mask regions only. The perceiver would pass the image and the mask along to the tracker, which does the rest. There are good arguments for the other two options, especially the last one.
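A rough sketch of that mask-only data flow, for discussion. Every class and method name here is a placeholder (the real stubs may look different), and the threshold detector and pixel-counting tracker are toy stand-ins for the actual processing.

```python
class ThresholdDetector:
    """Toy detector: option 1, outputs a binary mask and nothing else."""

    def __init__(self, thresh):
        self.thresh = thresh

    def process(self, image):
        # image is a list of rows of grayscale values.
        return [[1 if px > self.thresh else 0 for px in row] for row in image]


class AreaTracker:
    """Toy tracker: receives both image and mask, does 'the rest'."""

    def __init__(self):
        self.area = 0

    def process(self, image, mask):
        # A real tracker would extract and associate pieces here.
        self.area = sum(sum(row) for row in mask)


class Perceiver:
    """Chains detector and tracker, passing image and mask along."""

    def __init__(self, detector, tracker):
        self.detector = detector
        self.tracker = tracker

    def process(self, image):
        mask = self.detector.process(image)
        self.tracker.process(image, mask)
        return mask
```

The point of the sketch is only that the tracker signature takes `(image, mask)`, so all piece extraction lives downstream of the detector.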

The last one would permit all possible puzzle board instances to exist: the detected puzzle board without association (possibly including connected puzzle piece regions and partially occluded pieces); the measured or associated puzzle board, based on the currently individually separated and recognized puzzle pieces (held in the trackpointer, which also has the feature vector information); and the estimated puzzle state, which factors in connected pieces and occluded pieces. A fourth puzzle board would be the solution/calibration board.
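As a data-structure sketch of those four boards, assuming a board is just a dictionary of pieces keyed by id. The names `Piece`, `Board`, and `BoardSet` and their fields are hypothetical and only meant to show how the instances could sit side by side.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Piece:
    location: Tuple[int, int]                # assumed: top-left corner in the image
    features: Optional[List[float]] = None   # assumed: feature vector, if computed


@dataclass
class Board:
    pieces: Dict[int, Piece] = field(default_factory=dict)

    def add(self, piece: Piece) -> int:
        """Store a piece under the next free id and return that id."""
        piece_id = len(self.pieces)
        self.pieces[piece_id] = piece
        return piece_id


@dataclass
class BoardSet:
    detected: Board = field(default_factory=Board)   # raw detection, no association
    measured: Board = field(default_factory=Board)   # separated, recognized pieces
    estimated: Board = field(default_factory=Board)  # accounts for connected/occluded
    solution: Board = field(default_factory=Board)   # solution/calibration board
```

Each field gets its own `Board` through `default_factory`, so updating the detected board never touches the estimated one.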

Doing the third option would require some non-trivial revisions to the code stubs.

What's the most appealing option? Thoughts?

In the first case, the trackpointer would consist of the detected/measured puzzle pieces, plus the association. That means it would have two puzzle board instances.
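For illustration, one way the association between the two boards could look. This assumes greedy nearest-centroid matching with a distance gate, which is a stand-in technique chosen for brevity; the issue does not specify how association is done, and `associate`, its arguments, and `max_dist` are all hypothetical.

```python
import math


def associate(tracked, detected, max_dist=20.0):
    """Greedy nearest-centroid association.

    tracked, detected: dicts mapping piece id -> (x, y) centroid.
    Returns a dict mapping each detected id to a tracked id, or to None
    when no unused tracked piece lies within max_dist.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (math.dist(t_xy, d_xy), t_id, d_id)
        for t_id, t_xy in tracked.items()
        for d_id, d_xy in detected.items()
    )
    result = {d_id: None for d_id in detected}
    used = set()
    for dist, t_id, d_id in pairs:
        if dist > max_dist:
            break  # remaining pairs are even farther apart
        if t_id in used or result[d_id] is not None:
            continue
        result[d_id] = t_id
        used.add(t_id)
    return result
```

The trackpointer would then hold the detected/measured board, the associated board, and a mapping like `result` linking them.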

One part that has not been thought out is the occurrence of puzzle clusters. Should they be kept or not? If association is perfect, then the perceiver would already know about the connected puzzle pieces and there would be no need to confirm anything. However, if an entire cluster moved, the connected region would have to be matched over time as a whole, and that is not handled.

Proposed constraint: can we force the user to build/connect pieces only when doing so puts them in the correct part of the puzzle region? That is, no building out somewhere else and then moving the result into place. It would solve the problem of knowing where things moved. Otherwise, we'd also need a way to match connected regions over time if they moved, which would require different processing. It might be good to impose this for now in the spirit of simplifying the problem, then later see if we can extend to the moved-cluster case. The upgrade should be easy if the underlying class structure is correct.
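If the constraint were also checked in software, it could be as simple as the sketch below. It assumes each piece has a known location on the solution board and that "correct part of the puzzle region" means within some pixel tolerance of it; the function names and the `tol` default are made up for illustration.

```python
import math


def near_solution(piece_xy, solution_xy, tol=10.0):
    """True if a piece sits within tol of its solution location."""
    return math.dist(piece_xy, solution_xy) <= tol


def connection_allowed(cluster, solution, tol=10.0):
    """Accept a cluster only if every member is near its solution spot.

    cluster:  dict piece id -> current (x, y)
    solution: dict piece id -> (x, y) on the solution board
    """
    return all(near_solution(xy, solution[pid], tol)
               for pid, xy in cluster.items())
```

A cluster assembled off to the side fails the check, so the moved-cluster case never has to be tracked under this constraint.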
