Closed trickeydan closed 5 years ago
Please use "Marker" instead of "Token" - canonically a token is a cardboard cube that robots may move around, which may be covered with one or more (usually six) markers.
Orientation could be represented internally as a quaternion for "correctness" and maximum portability between vision implementations, and then exposed in a simpler, consistent way to competitors.
On Tue, 5 Feb 2019, 00:45 Dan Trickey <notifications@github.com> wrote:
We need to find a nice way to model fiducial markers without restricting ourselves to a particular library.
My initial suggestions:
- Camera - Component representing a camera
- Token - data type of a token
- Vector - The distance and direction of a token from a camera.
  - Spherical coordinates
  - Cartesian coordinates, we need to support this for less knowledgeable competitors
- Orientation - The orientation of a token in 3D space
Awesome Quaternion library: http://kieranwynn.github.io/pyquaternion/
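As a sketch of the "store a quaternion internally, expose something simpler" idea: a quaternion can be converted to yaw/pitch/roll on demand. All names here are illustrative, not j5's actual API, and the Euler convention chosen is one of several possibilities.

```python
import math
from typing import NamedTuple, Tuple


class Orientation(NamedTuple):
    """A unit quaternion (w, x, y, z) -- illustrative only, not j5 API."""

    w: float
    x: float
    y: float
    z: float

    @property
    def yaw_pitch_roll(self) -> Tuple[float, float, float]:
        """Expose the orientation as yaw/pitch/roll angles (radians)."""
        w, x, y, z = self
        yaw = math.atan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))
        # Clamp guards against floating-point drift outside [-1, 1].
        pitch = math.asin(max(-1.0, min(1.0, 2 * (w * y - x * z))))
        roll = math.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))
        return yaw, pitch, roll


# A 90 degree rotation about the vertical (z) axis:
half = math.sqrt(0.5)
q = Orientation(w=half, x=0.0, y=0.0, z=half)
yaw, pitch, roll = q.yaw_pitch_roll
```

Competitors would only ever see the angles; the quaternion stays an internal detail that different vision backends can share.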
@HU90m determined that Quaternions aren't suitable for our use case, which is a shame from a mathematical / perfectionist perspective, but otherwise alright.
Yeah, they aren't ideal 😞. What library is @RealOrangeOne using and where can I find his backend?
Coordinates and the vision library are here.

There's no interoperability with j5 yet.

I'm not sure it's worth using the same library; we can do a lot of the same stuff with NamedTuple in Python 3.6, and I'd rather try and keep the number of hard dependencies we have low.
I like that plan. If we are using a NamedTuple based coordinate system to describe object position, we could use quaternions for object orientation.
:+1:
Reminder of syntax for NamedTuples in Py3.6: https://mypy.readthedocs.io/en/latest/kinds_of_types.html#named-tuples
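For reference, a minimal example of that Py3.6 class-based NamedTuple syntax applied to a coordinate (the names here are illustrative, not a proposed j5 API):

```python
from typing import NamedTuple


class CartesianCoord(NamedTuple):
    """A point relative to the camera, in metres (illustrative names)."""

    x: float
    y: float
    z: float


# Fields are typed and named, but the value still behaves as a plain tuple.
marker_position = CartesianCoord(x=0.1, y=-0.2, z=1.5)
```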
I found some time to make a start on the vision system last weekend and I thought I would share a quick update. Do share any thoughts.
I started with a coordinate system with a NamedTuple per marker. I replaced this with a NamedTuple for a collection of markers, due to the speed improvements afforded by numpy arrays.
I don't think this system is dynamic enough. I plan to swap out the NamedTuples for a dictionary of numpy arrays (or something similar), which will allow appending and item assignment.
Small snippets of the code so far...

```python
from typing import NamedTuple

import numpy as np


class CylCoords(NamedTuple):
    """Cylindrical Coordinate System.

    p := axial distance
    phi := azimuth angle (radians)
    z := height
    """

    p: np.ndarray
    phi: np.ndarray
    z: np.ndarray


# Note: this method lives on a class in the branch; SphCoords (the spherical
# equivalent of CylCoords) is defined elsewhere and elided from the snippet.
@staticmethod
def cyl_to_sph_coords(cyl: "CylCoords") -> "SphCoords":
    """Converts Cylindrical Coordinates to Spherical Coordinates."""
    sph = SphCoords(
        np.sqrt(cyl.p ** 2 + cyl.z ** 2),
        np.arctan2(cyl.p, cyl.z),
        cyl.phi,
    )
    return sph
```
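The snippet leaves SphCoords undefined; assuming it mirrors CylCoords (this definition is a guess at the elided code, not the branch's actual one), the conversion can be sanity-checked numerically:

```python
from typing import NamedTuple

import numpy as np


class SphCoords(NamedTuple):
    """Spherical coordinates (assumed shape, mirroring CylCoords).

    r := radial distance
    theta := inclination from the z axis (radians)
    phi := azimuth angle (radians)
    """

    r: np.ndarray
    theta: np.ndarray
    phi: np.ndarray


def cyl_to_sph(p: np.ndarray, phi: np.ndarray, z: np.ndarray) -> SphCoords:
    """The same maths as cyl_to_sph_coords above, as a free function."""
    return SphCoords(np.sqrt(p ** 2 + z ** 2), np.arctan2(p, z), phi)


# A 3-4-5 triangle: axial distance 3, height 4 -> radial distance 5.
sph = cyl_to_sph(np.array([3.0]), np.array([0.0]), np.array([4.0]))
```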
Thanks for getting going with this, vision is always a big one so it's great that you've started on it 😄
My immediate thoughts on reading the update were: is there a reason for using numpy?
I'm all for performance, but I strongly believe we should measure before making premature optimisations. From my (admittedly limited) understanding of vision, most of the computationally intensive work happens while processing the image; after that it's just passing around values. In our use cases I wouldn't expect more than 10-15 markers in an image, and if that's the case, does the use of numpy arrays offer a noticeable performance benefit?
Also, would these numpy arrays ever be exposed directly to competitors, and if so, do they expose the same API as Python lists? For people just getting into Python, multiple ways of accessing different collections could be confusing. (I did try googling this one but my google-fu wasn't strong enough today.)
For something as core as vision it may be worth opening a draft PR early on, as it'd allow more input along the way.
Once again, thanks for kicking this off and please don't take any of my comments as direct criticisms, they're for my own understanding more than anything.
> Is there a reason for using numpy?
I agree that there's no reason to expose any numpy fundamentals in the public API. Unless we're exposing the raw pixel data, everything else can (and should) be done using a simpler construct, such as named tuples and Python lists. numpy arrays are generally only beneficial if you have large amounts of data, or need to do complex calculations on them, neither of which we're doing here.
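A sketch of what that simpler, list-based public API could look like (every name here is hypothetical, not j5's actual API):

```python
from typing import List, NamedTuple


class Marker(NamedTuple):
    """One detected fiducial marker (hypothetical shape)."""

    id: int
    distance: float  # metres from the camera


def visible_markers() -> List[Marker]:
    """Stand-in for a camera component method returning plain lists."""
    return [Marker(id=3, distance=1.2), Marker(id=7, distance=0.8)]


# Competitors use ordinary list operations -- no numpy knowledge needed.
markers = visible_markers()
nearest = min(markers, key=lambda m: m.distance)
```

Any numpy arrays the backend produces would be converted to structures like these at the API boundary.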
Unfortunately, the underlying APIs for OpenCV (and therefore zoloto) use numpy heavily, and in ways which are required for a fast vision system. That code benefits heavily from numpy arrays.
https://github.com/RealOrangeOne/zoloto/blob/master/zoloto/marker.py provides a good reference for what may be needed from a public API, with good implementations. One part relies on numpy, but only for calculations, rather than for the arrays themselves.
https://github.com/sourcebots/sb-vision/blob/master/sb_vision/coordinates.py#L43 shows a very simple implementation of converting cartesian to spherical coordinates. @HU90m I also notice your implementation uses key naming I've not come across before. Are those commonly used names that teams will understand? Previous kits have simply used rot_* (see definitions in https://github.com/sourcebots/sb-vision/blob/master/sb_vision/coordinates.py#L27).
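For comparison, the linked cartesian-to-spherical conversion boils down to standard formulae. A minimal standalone version (the field names here are generic placeholders, not necessarily sb-vision's rot_* names):

```python
import math
from typing import NamedTuple


class Spherical(NamedTuple):
    """Generic spherical coordinates (placeholder field names)."""

    dist: float   # radial distance
    theta: float  # inclination from the z axis, radians
    phi: float    # azimuth in the x-y plane, radians


def cartesian_to_spherical(x: float, y: float, z: float) -> Spherical:
    """Convert a cartesian point to spherical coordinates."""
    dist = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / dist) if dist else 0.0
    phi = math.atan2(y, x)
    return Spherical(dist, theta, phi)


# A point one metre straight ahead along the z axis:
ahead = cartesian_to_spherical(0.0, 0.0, 1.0)
```

Whatever names are chosen, pinning down the convention (which axis is "ahead", where the angles are measured from) matters more than the maths itself.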
> I plan to swap out the NamedTuples for a dictionary
Named tuples are definitely the right construct for this. What benefits do you see coming from dictionaries?
> NamedTuple per marker
@HU90m What do you mean by this? We definitely still require access to individual markers, and very few attributes are shared between markers which would give benefit to a shared object. (shared != shared definition, they should definitely be the same type!)
Thank you for all the input; I agree with a lot of what has been said above.
The draft pull request is a great idea. I will try and find some time later to open one so we can more easily discuss this component.
I'm going to have a go at some stuff also, just because I'm in the mood at the moment.
The branch is up: https://github.com/j5api/j5/tree/vision
Still very rough around the edges (some superfluous code and few tests). I expect you would like to change many aspects (or even most aspects); I won't be offended. Please be ruthless.
Having had a quick look, there's definitely some nice ideas here and also room for improvement.
Please could you open a pull request from the branch into master? This will let us do a proper code review :)