Open jeffbl opened 2 years ago
As @jeffbl mentioned, what the haptics team wants to present is "how it appears." Bounding boxes alone do not convey a detailed shape or tactile graphic; they only indicate "the item is here." If the contour of an object, or its OBB (oriented bounding box), can be detected by pp, these would be useful for haptic rendering. Of course, given the resolution of haptic rendering, a "rough" contour is fine as well (smoothing may help). Saliency detection inside the segment might also be helpful. -- @gp1702 whenever you are available, we could chat briefly to clarify this.
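To make the "rough contour" idea concrete, here is a minimal sketch of one way to coarsen a contour before haptic rendering. It uses Ramer-Douglas-Peucker polyline simplification, which is my own choice of technique and not something the preprocessor does today. The function names and the `tolerance` parameter are hypothetical, and the contour is assumed to be an open list of (x, y) points.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_contour(points, tolerance):
    """Ramer-Douglas-Peucker simplification of an open polyline.

    Points closer than `tolerance` to the simplified line are dropped.
    `tolerance` is a hypothetical knob, e.g. tied to the device resolution.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    split_index, max_distance = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_distance:
            split_index, max_distance = i, d
    if max_distance <= tolerance:
        return [first, last]
    left = simplify_contour(points[:split_index + 1], tolerance)
    right = simplify_contour(points[split_index:], tolerance)
    # The split point ends `left` and starts `right`; keep it once.
    return left[:-1] + right


contour = [(0, 0), (1, 0.05), (2, 0), (2, 1), (2, 2)]
print(simplify_contour(contour, 0.1))
```

A larger `tolerance` gives a coarser outline, so it could be set from the pin or actuator spacing of the target device.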
@dreamseed87 Is this still valid as-is, or has our thinking evolved since this was logged last year? Should we pull this from backlog as part of IGNITE, and if so, can we get some more specific examples to implement against?
Oh, this totally slipped my mind... I believe the tacton approach we made partially covers this topic (with a nice tacton design from @johnnyvenom). Anyhow, it would be good to pull some elements of this into IGNITE. As specific examples, we may categorize this problem threefold:
Possible solutions would be:
The expected level of difficulty would be 3 (easiest) <= 1 << 2 (hardest), in my estimation.
Some questions: What do you mean by "oriented bounding box" in solution 1? For solution 3, are you proposing some documentation of some kind? A link to "how to interpret this" like we have currently with audio?
For 1: see the example picture in https://stackoverflow.com/questions/40404031/drawing-bounding-box-for-a-rotated-object , or search for images of "oriented bounding box." It means a bounding box with rotation (not axis-aligned). For 3, it's really an open question, but I believe there are several points we can improve. For example: 1) we may add more tactons to cover more items (they definitely need to be identifiable). 2) Probably the most important: in @johnnyvenom's implementation we used an alias between a tacton and the actual detected object (e.g., "earth" mapped to a circle tacton, "traffic lights" mapped to an hourglass tacton). An audio description of this alias may help users.
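To illustrate what an oriented bounding box looks like in practice, here is a minimal sketch that finds the minimum-area enclosing rectangle of a set of contour points. It relies on the fact that such a rectangle has one side along a convex hull edge, so it tries every hull edge. All names are hypothetical, the input is assumed to be a list of (x, y) tuples, and this is not how the preprocessor currently works.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def oriented_bounding_box(points):
    """Return (area, angle_radians, corners) of the minimum-area rectangle."""
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("need at least three non-collinear points")
    best = None
    for i in range(len(hull)):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y1 - y0, x1 - x0)
        c, s = math.cos(theta), math.sin(theta)
        # Rotate the hull by -theta so this edge lies along the x axis.
        xs = [x * c + y * s for x, y in hull]
        ys = [-x * s + y * c for x, y in hull]
        min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
        area = (max_x - min_x) * (max_y - min_y)
        if best is None or area < best[0]:
            box = ((min_x, min_y), (max_x, min_y), (max_x, max_y), (min_x, max_y))
            # Rotate the rectangle corners back into image coordinates.
            corners = [(u * c - v * s, u * s + v * c) for u, v in box]
            best = (area, theta, corners)
    return best


# A diamond: its axis-aligned box has area 4, its oriented box area 2.
area, angle, corners = oriented_bounding_box([(0, 1), (1, 0), (2, 1), (1, 2), (1, 1)])
print(round(area, 6), round(math.degrees(angle), 1))
```

For a rotated object, the oriented box hugs the shape more tightly than the axis-aligned one, and the angle is an extra piece of information the haptic side could render.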
Right, but 2 is likely to be the hardest to provide more detail for, as we're bound by the limitations of ML. I say we work on either 1 or 3 as it's looking like we have a much leaner ML team this time around.
Yes, that's what I am saying as well :) 2 is overkill for IGNITE. 1 or 3 should be fine.
Based on discussion with @dreamseed87 @johnnyvenom @florian-grond @cyan.
For a device like the Dot pad, we can provide raised dots in rectangles representing different objects, and indicate what they are via audio, or perhaps the braille line at the bottom of the dot pad. This work item is to figure out what other information could be provided within each bounding box, to provide more detail. For example:
Examples:
We can imagine many practical difficulties here with either approach, but wanted to understand the likely difficulty of this task. The current thinking is that the basic bounding box approach is one level that haptics can implement against, but having more detail/shape within the bounding boxes would take things further.
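As a rough picture of the basic bounding box level, here is a minimal sketch that turns normalized bounding boxes into a grid of raised pins. The grid size, the (x0, y0, x1, y1) box format, and the function names are all assumptions for illustration; nothing here reflects an actual device API.

```python
def render_boxes(boxes, cols, rows):
    """Raise the pins along each box outline.

    boxes: iterable of (x0, y0, x1, y1), each normalized to [0, 1].
    cols, rows: pin grid size (hypothetical; set per device).
    """
    grid = [[False] * cols for _ in range(rows)]

    def to_index(value, size):
        # Map a normalized coordinate to a pin index, clamped to the grid.
        return min(size - 1, max(0, round(value * (size - 1))))

    for x0, y0, x1, y1 in boxes:
        c0, c1 = sorted((to_index(x0, cols), to_index(x1, cols)))
        r0, r1 = sorted((to_index(y0, rows), to_index(y1, rows)))
        for c in range(c0, c1 + 1):
            grid[r0][c] = grid[r1][c] = True  # top and bottom edges
        for r in range(r0, r1 + 1):
            grid[r][c0] = grid[r][c1] = True  # left and right edges
    return grid


def to_text(grid):
    """Debug view: '#' for a raised pin, '.' for a lowered one."""
    return "\n".join("".join("#" if pin else "." for pin in row) for row in grid)


print(to_text(render_boxes([(0.0, 0.0, 0.5, 1.0), (0.6, 0.2, 1.0, 0.8)], 12, 6)))
```

Anything more detailed, such as a contour or shape, would be drawn inside these outlines, which is where the extra preprocessor output would come in.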
Note that semantic segmentation should already be returning a structured outline/contour once #130 is completed, but as of now, it only operates on photographs classified as "outdoor".
Assigning to @dreamseed87 for clarification/more details based on conversation, and @gp1702 for evaluation as a new preprocessor enhancement work item.