Closed trickeydan closed 5 years ago
Please use "Marker" instead of "Token" - canonically a token is a cardboard cube that robots may move around, which may be covered with one or more (usually six) markers.
Orientation could be represented internally as a quaternion for "correctness" and maximum portability between vision implementations, and then exposed in a simpler, consistent way to competitors.
On Tue, 5 Feb 2019, 00:45 Dan Trickey <notifications@github.com> wrote:
We need to find a nice way to model fiducial markers without restricting ourselves to a particular library.
My initial suggestions:
- Camera - Component representing a camera
- Token - data type of a token
- Vector - The distance and direction of a token from a camera.
  - Spherical coordinates
  - Cartesian coordinates, we need to support this for less knowledgeable competitors
- Orientation - The orientation of a token in 3D space
Awesome Quaternion library: http://kieranwynn.github.io/pyquaternion/
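As a sketch of the "store a quaternion internally, expose something simpler" idea: a quaternion can be converted to yaw/pitch/roll on demand. All names here are illustrative, not j5's actual API, and the Euler convention chosen is one of several possibilities.

```python
import math
from typing import NamedTuple, Tuple


class Orientation(NamedTuple):
    """A unit quaternion (w, x, y, z) -- illustrative only, not j5 API."""

    w: float
    x: float
    y: float
    z: float

    @property
    def yaw_pitch_roll(self) -> Tuple[float, float, float]:
        """Expose the orientation as yaw/pitch/roll angles (radians)."""
        w, x, y, z = self
        yaw = math.atan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))
        # Clamp guards against floating-point drift outside [-1, 1].
        pitch = math.asin(max(-1.0, min(1.0, 2 * (w * y - x * z))))
        roll = math.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))
        return yaw, pitch, roll


# A 90 degree rotation about the vertical (z) axis:
half = math.sqrt(0.5)
q = Orientation(w=half, x=0.0, y=0.0, z=half)
yaw, pitch, roll = q.yaw_pitch_roll
```

Competitors would only ever see the angles; the quaternion stays an internal detail that different vision backends can share.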
@HU90m determined that Quaternions aren't suitable for our use case, which is a shame from a mathematical / perfectionist perspective, but otherwise alright.
Yeah, they aren't ideal 😞. What library is @RealOrangeOne using and where can I find his backend?
Coordinates and the vision library are here.

There's no interoperability with j5 yet.

I'm not sure it's worth using the same library; we can do a lot of the same stuff with NamedTuple in Python 3.6, and I'd rather try and keep the number of hard dependencies we have low.
I like that plan. If we are using a NamedTuple based coordinate system to describe object position, we could use quaternions for object orientation.
:+1:
Reminder of syntax for NamedTuples in Py3.6: https://mypy.readthedocs.io/en/latest/kinds_of_types.html#named-tuples
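For reference, a minimal example of that Py3.6 class-based NamedTuple syntax applied to a coordinate (the names here are illustrative, not a proposed j5 API):

```python
from typing import NamedTuple


class CartesianCoord(NamedTuple):
    """A point relative to the camera, in metres (illustrative names)."""

    x: float
    y: float
    z: float


# Fields are typed and named, but the value still behaves as a plain tuple.
marker_position = CartesianCoord(x=0.1, y=-0.2, z=1.5)
```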
I found some time to make a start on the vision system last weekend and I thought I would share a quick update. Do share any thoughts.
I started with a coordinate system with a NamedTuple per marker. I replaced this with a NamedTuple for a collection of markers, due to the speed improvements afforded by numpy arrays.
I don't think this system is dynamic enough. I plan to swap out the NamedTuples for a dictionary of numpy arrays (or something similar), which will allow appending and item assignment.
Small snippets of the code so far...

```python
from typing import NamedTuple

import numpy as np


class CylCoords(NamedTuple):
    """Cylindrical Coordinate System.

    p := axial distance
    phi := azimuth angle (radians)
    z := height
    """

    p: np.ndarray
    phi: np.ndarray
    z: np.ndarray


# Note: this method lives on a class in the branch; SphCoords (the spherical
# equivalent of CylCoords) is defined elsewhere and elided from the snippet.
@staticmethod
def cyl_to_sph_coords(cyl: "CylCoords") -> "SphCoords":
    """Converts Cylindrical Coordinates to Spherical Coordinates."""
    sph = SphCoords(
        np.sqrt(cyl.p ** 2 + cyl.z ** 2),
        np.arctan2(cyl.p, cyl.z),
        cyl.phi,
    )
    return sph
```
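The snippet leaves SphCoords undefined; assuming it mirrors CylCoords (this definition is a guess at the elided code, not the branch's actual one), the conversion can be sanity-checked numerically:

```python
from typing import NamedTuple

import numpy as np


class SphCoords(NamedTuple):
    """Spherical coordinates (assumed shape, mirroring CylCoords).

    r := radial distance
    theta := inclination from the z axis (radians)
    phi := azimuth angle (radians)
    """

    r: np.ndarray
    theta: np.ndarray
    phi: np.ndarray


def cyl_to_sph(p: np.ndarray, phi: np.ndarray, z: np.ndarray) -> SphCoords:
    """The same maths as cyl_to_sph_coords above, as a free function."""
    return SphCoords(np.sqrt(p ** 2 + z ** 2), np.arctan2(p, z), phi)


# A 3-4-5 triangle: axial distance 3, height 4 -> radial distance 5.
sph = cyl_to_sph(np.array([3.0]), np.array([0.0]), np.array([4.0]))
```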
Thanks for getting going with this, vision is always a big one so it's great that you've started on it 😄
My immediate thoughts on reading the update were: is there a reason for using numpy?
I'm all for performance, but I strongly believe we should measure before making premature optimisations. From my (admittedly limited) understanding of vision, most of the computationally intensive work happens while processing the image; after that it's just passing around values. In our use cases I wouldn't expect more than 10-15 markers in an image, and if that's the case, does the use of numpy arrays offer a noticeable performance benefit?
Also, would these numpy arrays ever be exposed directly to competitors, and if so, do they expose the same API as Python lists? For people just getting into Python, multiple ways of accessing different collections could be confusing. (I did try googling this one but my google-fu wasn't strong enough today.)
For something as core as vision it may be worth opening a draft PR early on, as it'd allow more input along the way.
Once again, thanks for kicking this off and please don't take any of my comments as direct criticisms, they're for my own understanding more than anything.
> Is there a reason for using numpy?
I agree that there's no reason to expose any numpy fundamentals in the public API. Unless we're exposing the raw pixel data, everything else can (and should) be done using a simpler construct, such as named tuples and Python lists. numpy arrays are generally only beneficial if you have large amounts of data, or need to do complex calculations on them, neither of which we're doing here.
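A sketch of what that simpler, list-based public API could look like (every name here is hypothetical, not j5's actual API):

```python
from typing import List, NamedTuple


class Marker(NamedTuple):
    """One detected fiducial marker (hypothetical shape)."""

    id: int
    distance: float  # metres from the camera


def visible_markers() -> List[Marker]:
    """Stand-in for a camera component method returning plain lists."""
    return [Marker(id=3, distance=1.2), Marker(id=7, distance=0.8)]


# Competitors use ordinary list operations -- no numpy knowledge needed.
markers = visible_markers()
nearest = min(markers, key=lambda m: m.distance)
```

Any numpy arrays the backend produces would be converted to structures like these at the API boundary.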
Unfortunately, the underlying APIs for OpenCV (and therefore zoloto) use numpy heavily, and in ways which are required for a fast vision system. That code benefits heavily from numpy arrays.
https://github.com/RealOrangeOne/zoloto/blob/master/zoloto/marker.py provides a good reference for what may be needed from a public API, with good implementations. One part relies on numpy, but only for calculations, rather than for the arrays themselves.
https://github.com/sourcebots/sb-vision/blob/master/sb_vision/coordinates.py#L43 shows a very simple implementation of converting cartesian to spherical coordinates. @HU90m I also notice your implementation uses key naming I've not come across before. Are those commonly used names that teams will understand? Previous kits have simply used rot_* (see definitions in https://github.com/sourcebots/sb-vision/blob/master/sb_vision/coordinates.py#L27).
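For comparison, the linked cartesian-to-spherical conversion boils down to standard formulae. A minimal standalone version (the field names here are generic placeholders, not necessarily sb-vision's rot_* names):

```python
import math
from typing import NamedTuple


class Spherical(NamedTuple):
    """Generic spherical coordinates (placeholder field names)."""

    dist: float   # radial distance
    theta: float  # inclination from the z axis, radians
    phi: float    # azimuth in the x-y plane, radians


def cartesian_to_spherical(x: float, y: float, z: float) -> Spherical:
    """Convert a cartesian point to spherical coordinates."""
    dist = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / dist) if dist else 0.0
    phi = math.atan2(y, x)
    return Spherical(dist, theta, phi)


# A point one metre straight ahead along the z axis:
ahead = cartesian_to_spherical(0.0, 0.0, 1.0)
```

Whatever names are chosen, pinning down the convention (which axis is "ahead", where the angles are measured from) matters more than the maths itself.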
> I plan to swap out the NamedTuples for a dictionary
Named tuples are definitely the right construct for this. What benefits do you see coming from dictionaries?
> NamedTuple per marker
@HU90m What do you mean by this? We definitely still require access to individual markers, and very few attributes are shared between markers which would give benefit to a shared object. (shared != shared definition, they should definitely be the same type!)
Thank you for all the input; I agree with a lot of what has been said above.
The draft pull request is a great idea. I will try and find some time later to open one so we can more easily discuss this component.
I'm going to have a go at some stuff also, just because I'm in the mood at the moment.
The branch is up: https://github.com/j5api/j5/tree/vision
Still very rough around the edges (some superfluous code and few tests). I expect you would like to change many aspects (or even most aspects); I won't be offended. Please be ruthless.
Having had a quick look, there's definitely some nice ideas here and also room for improvement.
Please could you open a pull request from the branch into master? This will let us do a proper code review :)