MRPT / GSoC2017-discussions


Robust SLAM and localization method using artificial fiducial markers and stereo vision (Shivang Agrawal) #5

Closed jlblancoc closed 7 years ago

jlblancoc commented 7 years ago

Initial description:

The current visual SLAM methods in MRPT use computationally expensive feature detectors and descriptors such as SIFT and SURF. Although flexible, this approach is not well suited to real-time scenarios. To provide MRPT with real-time SLAM and localization capabilities, we propose using artificial fiducial markers, which can be detected and matched easily across different frames. This allows us to compute the 3D pose of the marker points in real time and to robustly localize the camera frames.

See attached PDF: 5261167632056320_1491155987_Projectproposal.pdf

jlblancoc commented 7 years ago

Student: @shivangag

Main mentor: @feroze Backup mentors: @bergercookie

shivangag commented 7 years ago

Thank you @jlblancoc, @feroze and @bergercookie for selecting my proposal. It will be a great learning experience working with you all. I'll start this week by reading the EKF-SLAM based classes and the CLandmarksMap class and learning how they can be used in the project. These were the parts that were not discussed thoroughly in the proposal, so I'd like to discuss them and resolve any remaining uncertainty.

jlblancoc commented 7 years ago

Hi @shivangag !

This is a general note to all accepted students who will be working on a fork of MRPT/mrpt: please use the branch https://github.com/MRPT/mrpt/tree/mrpt-2.0-devel as the base for all your work. In other words: don't start working from MRPT/master, but from MRPT/mrpt-2.0-devel. This is because, during your project, master is expected to become a maintenance branch (mrpt-1.5), while mrpt-2.0-devel will eventually be merged into master and become the main development branch.
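To make the branching advice concrete, here is a minimal sketch of the workflow, with the repository layout simulated locally so it runs anywhere. The branch names (master, mrpt-2.0-devel, markerdetection) follow the thread; the `upstream`/`fork` directories and user details are placeholders, not the real repos:

```shell
# Simulate the upstream repo with master and mrpt-2.0-devel branches
# (no network access; 'upstream' and 'fork' stand in for the real repos).
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b master upstream
git -C upstream -c user.name=gsoc -c user.email=gsoc@example.com \
    commit -q --allow-empty -m "base commit on master"
git -C upstream branch mrpt-2.0-devel

# Clone (this plays the role of your fork) and base all work on mrpt-2.0-devel:
git clone -q upstream fork && cd fork
git checkout -q -b markerdetection origin/mrpt-2.0-devel

# Periodically re-sync your feature branch with the latest mrpt-2.0-devel:
git fetch -q origin
git rebase -q origin/mrpt-2.0-devel
git branch --show-current
```

Rebasing (rather than merging) the feature branch keeps the eventual merge into mrpt-2.0-devel as clean as possible, which is exactly the concern raised later in this thread.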

bergercookie commented 7 years ago

Hi @shivangag,

Again, congratulations on being accepted into this year's GSoC. I hope it proves to be an exciting experience!

I'll start this week by reading the EKF-SLAM based classes and the CLandmarksMap class and learning how they can be used in the project. These were the parts that were not discussed thoroughly in the proposal, so I'd like to discuss them and resolve any remaining uncertainty.

Excellent! Just keep in mind to make your code/classes as generic as possible. Since the end product of your work will provide a highly accurate estimate of (at least) the robot trajectory, it can be used as ground truth, or as a correction step for another SLAM framework that is not necessarily EKF-SLAM (see applications like graphslam-engine, rbpf-slam or icp-slam).

It would also be useful to update the project goals, expected results, and corresponding timeline as you go along (e.g. a Google doc would be ideal for this).

Let me know if you have any questions regarding MRPT (build system, APIs, etc.) or if there is anything in particular about your project that you think we should discuss.

With regards, Nikos

shivangag commented 7 years ago

Hi @bergercookie, I had a thorough read of CLandmarksMap and related classes like CLandmark, CFeature, and the various types defined in types.h. Currently the CFeature class stores the descriptors of a feature in a TDescriptors structure, which has dedicated members for SIFT, SURF, ORB, etc. Clearly, we cannot use it with other descriptors, or with fiducial markers. I was thinking of creating a new class, CDescriptor, with templated members to store descriptors and functions to compare two descriptors of the same type. In the CFeature class, we could then store a feature point's descriptors as a vector of CDescriptor.
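The design intent can be pictured with a short language-agnostic sketch (Python here for brevity; the eventual MRPT class would of course be C++ with templates, and all names besides CFeature/CDescriptor are hypothetical):

```python
# Sketch of the proposed descriptor abstraction: each concrete descriptor type
# knows how to compare itself against another descriptor of the SAME type, and
# a feature simply holds a heterogeneous list of descriptors.

class Descriptor:
    """Base interface; 'distance' is only defined between same-typed descriptors."""
    def distance(self, other):
        raise NotImplementedError

class ORBDescriptor(Descriptor):
    def __init__(self, bits):
        self.bits = bits  # e.g. a list of 0/1 values

    def distance(self, other):
        if not isinstance(other, ORBDescriptor):
            raise TypeError("can only compare ORB with ORB")
        # Hamming distance, the natural metric for binary descriptors
        return sum(a != b for a, b in zip(self.bits, other.bits))

class MarkerIDDescriptor(Descriptor):
    """A fiducial marker 'descriptor' can just be its unique ID."""
    def __init__(self, marker_id):
        self.marker_id = marker_id

    def distance(self, other):
        if not isinstance(other, MarkerIDDescriptor):
            raise TypeError("can only compare marker IDs with marker IDs")
        # Markers either match exactly or not at all
        return 0 if self.marker_id == other.marker_id else float("inf")

class Feature:
    """Analogue of CFeature holding a vector of descriptors."""
    def __init__(self, descriptors):
        self.descriptors = descriptors

a, b = MarkerIDDescriptor(23), MarkerIDDescriptor(23)
print(a.distance(b))  # 0: same marker ID, a perfect match
```

The key point is that matching dispatches on descriptor type, so adding a new descriptor (fiducial marker IDs included) never requires touching the feature container itself.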

Also, @jlblancoc mentioned in my Google doc proposal that the CLandmarksMap class is outdated and needs cleanup and refactoring. Any pointers on which parts are no longer used or could be updated?

Regards,

bergercookie commented 7 years ago

Hi @shivangag

I had a thorough read of CLandmarksMap and related classes like CLandmark, CFeature, and the various types defined in types.h. Currently the CFeature class stores the descriptors of a feature in a TDescriptors structure, which has dedicated members for SIFT, SURF, ORB, etc. Clearly, we cannot use it with other descriptors, or with fiducial markers. I was thinking of creating a new class, CDescriptor, with templated members to store descriptors and functions to compare two descriptors of the same type.

This comes down to the specific feature points of a marker instance. I am not entirely up to date on how one might implement this; perhaps @feroze could shed some light on it. However, there are open-source implementations of marker detection, such as the Aruco lib or its OpenCV implementation. Since OpenCV is already a dependency of MRPT, we could consider wrapping their implementation instead of coding a new one, at least for Aruco markers.

For now I would concentrate on implementing a minimal working example of the problem at hand:

The latter should be fed the initial image(s), after converting them to an MRPT-compatible format like CObservationImage or CObservation3DRangeScan, and should output the results of the detection procedure (a vector of detected markers, corresponding 3D poses, covariances, etc.).

Also, @jlblancoc mentioned in my Google doc proposal that the CLandmarksMap class is outdated and needs cleanup and refactoring. Any pointers on which parts are no longer used or could be updated?

No, I wasn't aware of this. In any case, if you feel like something is missing from that particular class or something needs reformatting go ahead and do it.

Cheers, Nikos

jolting commented 7 years ago

Aruco was added to OpenCV 3.1.0.

It's unclear when OpenCV 3.1.0 will be packaged for Debian stable, but it looks like there is some progress in Debian experimental [1]. This means it might be a while before Ubuntu ships it.

It's probably best to just use the Aruco lib for feature parity across all the platforms. OpenCV 2.4 is still more common than the 3.x branch.

  1. https://tracker.debian.org/pkg/opencv
shivangag commented 7 years ago

@bergercookie Am I supposed to pull the latest mrpt-2.0-devel commits? Because it will require a complete rebuild.

bergercookie commented 7 years ago

@bergercookie Am I supposed to pull the latest mrpt-2.0-devel commits? Because it will require a complete rebuild.

@shivangag, yes. You should be operating entirely on mrpt-2.0-devel. I'd also suggest that you periodically fetch and sync the latest changes from the mrpt-2.0-devel branch so that the final merge of your code goes as smoothly as possible.

Nikos

bergercookie commented 7 years ago

Hi @shivangag,

We are now well into the first phase of GSoC. How is the coding going? I haven't seen any recent activity in your MRPT fork in either the gsoc or mrpt-2.0-devel branches. Am I looking at the wrong repo/branch?

In case you have made progress but keep your commits local, it would be better to push your changes to GitHub so that @feroze and I can review them and give you feedback.

Cheers, Nikos


shivangag commented 7 years ago

Hey @bergercookie, I was traveling for the last few days and therefore wasn't able to work. I'll be working full time on the project from now on and will try to catch up with the timeline. By the way, I'm working on the markerdetection branch.

feroze commented 7 years ago

Hey @shivangag

Don't feel shy about pushing small code changes or 'work in progress' commits to your branch daily. They can be cleaned up and squashed into polished commits later. This will let us follow your progress and quickly help with any doubts or design choices.

shivangag commented 7 years ago

@feroze and @bergercookie I have added the Aruco detection classes here. I was writing a test application to check the added code, but I am stuck on how to convert the images to CObservationImage. The documentation at http://reference.mrpt.org/devel/ does not seem to be generated properly.

bergercookie commented 7 years ago

Hi @shivangag,

I took a look in your recent commits and made some comments inline.

Finally, keep in mind that you should integrate aruco as an optional MRPT CMake dependency: if users have the library installed, the corresponding code is compiled; otherwise it is skipped. The aruco library also provides a Findaruco.cmake script, which you can use to detect whether aruco is available on the system.
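The optional-dependency pattern might look roughly like this (a hypothetical CMake sketch: the variable names `aruco_FOUND`, `aruco_LIBS`, `aruco_INCLUDE_DIRS` and the `mrpt-detectors` target are assumptions, not MRPT's actual build scripts):

```cmake
# Optional aruco dependency: compile the wrapper only when the lib is present.
find_package(aruco QUIET)   # relies on the Findaruco.cmake shipped with aruco
if(aruco_FOUND)
    set(CMAKE_MRPT_HAS_ARUCO 1)
    target_include_directories(mrpt-detectors PRIVATE ${aruco_INCLUDE_DIRS})
    target_link_libraries(mrpt-detectors PRIVATE ${aruco_LIBS})
else()
    set(CMAKE_MRPT_HAS_ARUCO 0)   # aruco-dependent code is compiled out
endif()
```

The `CMAKE_MRPT_HAS_ARUCO` flag would then be exported to a generated config header so the C++ sources can guard the aruco-specific code paths with a preprocessor check.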

The documentation at http://reference.mrpt.org/devel/ does not seem to be generated properly.

In case the mrpt.org docs don't work, try building them locally using the scripts/build_docs.sh script. It requires various packages (most notably doxygen) to be installed on the system.

# Generate html documentation
scripts/build_docs.sh -h

Just be careful not to commit the generated files, since they are placed inside the mrpt source directory.

Cheers, Nikos

jlblancoc commented 7 years ago

Also, @jlblancoc mentioned in my google doc proposal that the CLandmarksMap class is outdated and needs cleanup and refactoring, any pointers on which part is no longer used or can be updated?

The "problem" with CLandmarksMap is that it was designed with one particular paradigm for SLAM and localization in mind: particle filters (or RBPF for mapping). That approach was "trending" in the 2000s and, in fact, is a great solution, and the only rigorous one, for certain kinds of sensors (e.g. raw 2D scans) and maps (e.g. occupancy grid maps).

However, RBPFs are not (generally speaking) an optimal solution for maps of discrete elements (landmarks, graphs of pose constraints). That's why I said I couldn't see much usefulness in updating CLandmarksMap to handle this kind of observation (markers).

Instead, it's more natural to approach localization or mapping with these sensors with frameworks like EKF-SLAM or graph-SLAM.

I think I already gave you pointers to the range-bearing EKF-SLAM source code in MRPT. Creating a new class, "strongly inspired" by those, for marker-based localization and mapping would be the ideal approach for a time-limited project like yours.

It should be accompanied by a new stand-alone application. See apps/kf-slam for an example very similar to what you should aim for.

The point is: despite what one might expect, CLandmarksMap has nothing to do with EKF-SLAM classes even if they also represent maps of discrete landmarks (!!).

shivangag commented 7 years ago

I have pushed some new commits here with the changes suggested by @bergercookie and @jlblancoc. I also created a crude external test file (I couldn't figure out how to link the aruco libs with a test file in the mrpt/samples directory) that reads images from disk, converts them into CObservationImage objects, and then calls the CArucoMarkerDetection functions to output the CDetectableMarker vector. It seems to have some issues with images containing small or cluttered tags.

jlblancoc commented 7 years ago

Well done!

The use of the pimpl macros seems correct.

I don't see why you would need to link an example file, which only uses the new MRPT class wrapping aruco, against aruco itself: only the mrpt lib needs to be linked against it. That's the advantage of dynamically linked libs (.so files).

You could add a new example to mrpt as explained here: https://github.com/MRPT/mrpt/blob/master/samples/HOW_TO_ADD_EXAMPLES.txt, adding the detectors lib as a dependency.

shivangag commented 7 years ago

Hey @jlblancoc, thanks for your review. I missed linking the detector module with aruco and hence was getting undefined reference errors before. I have added the script_aruco.cmake file and a Findaruco.cmake file (it was generated by the aruco library, but only supports Linux; I'll add support for other platforms later) and linked the detectors module. Finally, I have added the markerTest sample file.

jlblancoc commented 7 years ago

Hi, Good progress!

But I don't feel comfortable with a couple of details... I'll explain why and I'm sure you'll agree:

bergercookie commented 7 years ago

Hi @shivangag,

You are making progress :-)

All the points raised by Jose are valid; MRPT is a pretty large project, and these are some of the fundamental rules that keep it maintainable.

For now, I'd concentrate on creating a fully fledged example that demonstrates the capabilities of aruco marker tracking. Ideally, the end product of this coding period should be an example or application (I would prefer the latter) that:

  1. Takes a video file as input.
  2. Detects aruco markers in that video and keeps track of them. Since each marker has a unique ID, it should be fairly straightforward to get the transformation of each marker in consecutive frames. Using these, and assuming that the person holding the camera is moving while the aruco markers are static, we can compute a fairly accurate approximation of the camera's trajectory in space.
  3. Adds visualization capabilities by integrating with either wxWidgets or Qt. From your standpoint, the GUI calls you'll have to make will most likely be the same for either framework. In the GUI you can visualize the estimated marker positions, the camera trajectory, a viewport with the camera feed, and whatever else you think might help users.

Remarks

P.S. I realize that the given time may not suffice to implement all these. Focus on having a functional application, add unit tests, add some docs and deal with as many of the aforementioned points as possible.

Cheers, Nikos

shivangag commented 7 years ago

@jlblancoc I have updated the script_aruco.cmake file to use pkg-config and find modules to find aruco.

@bergercookie Thanks for the pointers. I am working on the unit tests and will follow up with pose estimation from single markers.

jlblancoc commented 7 years ago

This is a kind reminder to all GSoC students and mentors

According to the GSoC FAQ, students are supposed to devote 30+ hours per week to their assigned project. This amounts to 6 h/day (Mon-Fri) or about 4.3 h/day (Mon-Sun), or any equivalent distribution, given the flexibility to work on your own schedule. Clearly failing to meet such a commitment may be grounds for failing the monthly evaluations, since this information was clear at the time of applying to GSoC.

It's difficult to convert hours into GitHub commits, but as a rule of thumb there should be at least one substantial commit per working day, that is, roughly 5 substantial commits per week. This is a general rule, and mentors may be flexible depending on the difficulty and nature of each project.

The first evaluation will start June 26-30.

feroze commented 7 years ago

Hey @shivangag ,

Any progress over the weekend? Could you push the commits?

We have less than a week for the first evaluation and there is quite a lot of work to catch up on.

jlblancoc commented 7 years ago

For those of you creating new MRPT apps, remember to also create a basic manpage, a must-have for Debian packages. This includes:

shivangag commented 7 years ago

Hey @bergercookie and @feroze, I have almost completed the aruco integration, with the exception of the unit test, which I'll add soon. For the application, I've made myself comfortable with the GUI APIs and have started working with them. I have some doubts about calculating the frame poses from the markers. Here is the crude algorithm I plan to start with:

  1. Set the first frame pose as the global origin and calculate the relative pose of the markers in that frame.
  2. Get the next frame and calculate the relative pose of the markers in this frame. Now suppose multiple markers were visible in both the current and the previous frame; we can then obtain the pose of the current frame from each of those markers. How should the pose of the frame be calculated? Should the mean of the poses be taken, with a Gaussian distribution around it?

I found a paper on the topic using some sophisticated algorithms with impressive results. Should I try to implement it here?

shivangag commented 7 years ago

Hey @bergercookie and @feroze,

Here is a screenshot of the first phase of the GUI. It projects the markers into space using their poses as returned by the detection module.

Next Steps

  1. Using a set of images, create a map of the markers and estimate frame poses.
  2. Replace images with a video stream and use keyframes obtained at a predefined fps rate.

For now, I'll calculate the pose of the frame by taking the mean of the poses obtained from each marker. The trajectory is expected to drift over time due to error accumulation, but the application will be enough to show the capabilities of markers for mapping.

bergercookie commented 7 years ago

Hi @shivangag,

I took a look at what you have uploaded so far and I tried to compile and run the markerTest sample. While it compiled successfully, running it with the share/mrpt/datasets/markers/marker-on-cardboard.jpg file didn't produce any output. I suppose this is not the desired behavior. Am I running it correctly? Is there some other file I should be using instead?

Also the first evaluation of GSoC starts today, so please wrap up and upload the rest of your work so that we can review it.

shivangag commented 7 years ago

Hey @bergercookie, the problem is with the tag_family setting in samples/markerTest/MARKER_TEST.INI: please change it from ARUCO_MIP_36h12 to ARUCO. I was working with different tag families and must have changed the configuration file, which somehow crept into the remote repo.

shivangag commented 7 years ago

@bergercookie I have pushed the unit test file and an updated sample file. I'm using a test image with camera parameters provided by aruco, though I have not yet pushed those. I am still debugging the application to obtain a correct camera pose. Essentially, I am using the markers already added to the map and visible in the current frame to calculate the frame pose as PoseFrame = MarkerPose - MarkerPoseRelativetoFrame, and then getting the global pose of new markers as MarkerPose = PoseFrame + MarkerPoseRelativetoFrame. It doesn't seem to return even a good approximation of the new camera pose. Any pointers on this?
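For what it's worth, the composition above should be done with SE(3) transforms (matrix products and inverses), not element-wise addition or subtraction of pose vectors: `+`/`-` only coincide with true pose composition when there is no rotation involved. A minimal numpy sketch of the correct arithmetic (all numeric values are made up for illustration):

```python
import numpy as np

def make_pose(R, t):
    """Build a 4x4 homogeneous transform from rotation R (3x3) and translation t (3,)."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def invert_pose(T):
    """Closed-form inverse of a rigid transform: inv([R t]) = [R^T  -R^T t]."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ t
    return Ti

# A marker with known global pose: rotated 90 deg about z, 2 m along world x.
Rz90 = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
T_world_marker = make_pose(Rz90, np.array([2., 0., 0.]))

# The detector reports the marker's pose RELATIVE TO the camera frame:
T_cam_marker = make_pose(np.eye(3), np.array([0., 0., 1.]))  # 1 m in front

# Camera pose in the world: compose, don't subtract.
T_world_cam = T_world_marker @ invert_pose(T_cam_marker)

# And a newly seen marker's global pose from the now-known camera pose:
T_cam_newmarker = make_pose(np.eye(3), np.array([0.5, 0., 2.]))
T_world_newmarker = T_world_cam @ T_cam_newmarker

print(np.round(T_world_cam[:3, 3], 3))  # camera position in the world frame
```

In MRPT terms this corresponds to composing CPose3D objects rather than their (x, y, z, yaw, pitch, roll) components; subtracting the component vectors silently ignores the rotation applied to the translation, which would explain poses that are wrong whenever the camera or marker is rotated.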

feroze commented 7 years ago

@shivangag, could you upload a short video dataset (or image series) with aruco markers in different frames to mrpt/share/mrpt/datasets/markers/? There is only a single image, marker-on-cardboard.jpg, which we can't really use to test the example code. If you have a video you captured, could you push that and also the updated camera parameters for it?

The example should process the frames of the video individually and extract the pose. Right now, there is simply one test image in the repo.

Also, for your CMarkerDetection unit test at https://github.com/shivangag/mrpt/blob/markerdetection/libs/detectors/src/CMarkerDetection_unittest.cpp, you use the image at mrpt/share/mrpt/datasets/markers/. This isn't committed to the repo.

We'll look into the pose computations and try to figure out why the camera pose is off. Even if the calculations are off, please develop the code so the example runs fully, as @bergercookie explained above.

feroze commented 7 years ago

When running the markerTest sample, mrpt/share/mrpt/datasets/markers/CAMERA_PARAM.INI is missing: mrpt::utils::CStringList::loadFromFile fails the file-exists assertion and the program aborts. Could you please push that file?

jlblancoc commented 7 years ago

Now suppose multiple markers were visible in both the current and the previous frame; we can then obtain the pose of the current frame from each of those markers. How should the pose of the frame be calculated? Should the mean of the poses be taken, with a Gaussian distribution around it?

Nope!! Please, don't do this :-)

In theory, this is a full SE(3) optimization problem in its own right. Are you familiar with graph-SLAM, edge constraints of various types, etc.? First things first: what kind of metric information does the detection of a marker carry? Do you get a full 3D pose (x, y, z, yaw, pitch, roll) from a single observation of a marker with a monocular camera? Depending on what we have, we'll decide how to proceed with a proper theoretical framework...

Cheers

feroze commented 7 years ago

Given the camera parameters and marker configuration, Aruco should output the pose (XYZ and RPY) from a monocular image frame.

shivangag commented 7 years ago

@jlblancoc Yes, as @feroze already said, given the camera parameters and the marker size, aruco outputs the full 6D pose of the markers with respect to the current frame.

jlblancoc commented 7 years ago

Then this becomes a graph-SLAM problem, with the observations being SE(3) transformations between the robot and the "landmarks", whose poses are also unknown.
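In that formulation, each marker detection contributes an SE(3) constraint between a keyframe pose and a marker pose, and the map is recovered by nonlinear least squares over the graph. Schematically (the notation here is generic, not MRPT's):

```latex
\min_{\{T_i\},\,\{M_j\}} \;\;
\sum_{(i,j)\in\mathcal{Z}}
\left\lVert \log\!\bigl( Z_{ij}^{-1}\, T_i^{-1} M_j \bigr)^{\vee} \right\rVert^{2}_{\Sigma_{ij}^{-1}}
```

where $T_i \in SE(3)$ are the camera/keyframe poses, $M_j \in SE(3)$ the marker poses, $Z_{ij}$ the measured camera-to-marker transform from the detector, $\Sigma_{ij}$ its covariance, and $\log(\cdot)^{\vee}$ maps the residual transform to a 6-vector in the Lie algebra. Averaging per-marker pose estimates, as proposed above, discards exactly the coupling between constraints that this optimization exploits.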

It's quite telling that this critical discussion, which is at the core of the project, was not raised until now, one month after its start (!!). This reveals a clear problem with the planning of the project...

shivangag commented 7 years ago

@jlblancoc, as stated in my proposal, the project was to use stereo camera observations and bundle-adjust the marker points in the last n frames to get an online SLAM solution. But later, as suggested by @bergercookie, I thought it would be better to provide both monocular and stereo solutions in the application.

shivangag commented 7 years ago

I had a thorough read of the graphslam module, and this is how I plan to add support for marker observations:

  1. Node registration: a new node registration decider class that adds a new node when the robot pose has changed by more than a threshold.
  2. Edge registration: a new edge registration decider class that adds edges between frames with common markers, using the measurement constraints. As we won't have motion constraints, these will be the only edge constraints in the graph.

The CNetworkOfPoses3DInf graph should be right for our purpose. Since the optimizers are indifferent to how the graph was created, the optimizers already defined for range-scan measurements should suffice for this case too. One doubt I have: are landmark nodes added to the graph in the current range-scan implementation? I will start working on this today. For the current marker detection code, I'm working with the datasets provided by aruco; I'll upload them soon, once I get accurate pose estimation from a small enough image resolution. Currently, images at 3264x2448 resolution give quite accurate results, so I am now decimating them to find an optimal size.
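The node-registration rule in step 1 can be sketched as follows (illustrative Python, not MRPT's actual decider API; the class name and thresholds are made up, and the pose is simplified to 2D for brevity):

```python
import math

class DistanceNodeRegistrationDecider:
    """Register a new graph node whenever the pose has moved or turned enough."""
    def __init__(self, trans_thresh=0.5, rot_thresh=math.radians(10)):
        self.trans_thresh = trans_thresh   # meters
        self.rot_thresh = rot_thresh       # radians
        self.last = None                   # last registered (x, y, yaw)
        self.nodes = []

    def update(self, pose):
        """pose = (x, y, yaw); returns True if a new node was registered."""
        if self.last is None:              # first pose always registers
            self.last = pose
            self.nodes.append(pose)
            return True
        dx, dy = pose[0] - self.last[0], pose[1] - self.last[1]
        # Wrap the yaw difference to (-pi, pi] before thresholding
        dyaw = abs(math.atan2(math.sin(pose[2] - self.last[2]),
                              math.cos(pose[2] - self.last[2])))
        if math.hypot(dx, dy) > self.trans_thresh or dyaw > self.rot_thresh:
            self.last = pose
            self.nodes.append(pose)
            return True
        return False

d = DistanceNodeRegistrationDecider()
print(d.update((0.0, 0.0, 0.0)))   # True: first pose always registers
print(d.update((0.1, 0.0, 0.0)))   # False: moved only 10 cm
print(d.update((0.7, 0.0, 0.0)))   # True: 0.7 m from the last registered node
```

The edge-registration decider in step 2 would then, for each registered node, add a relative-pose constraint to every other node that observed a marker with the same ID.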

shivangag commented 7 years ago

Hey @bergercookie and @feroze, I have fixed the problem with relative pose estimation between two frames, and as you had guessed, it now gives pretty accurate poses for the camera frames. I have created a marker-slam application that takes images as input, creates a map of markers, and displays the frame poses alongside it in a CDisplayWindow3D. I have uploaded the dataset to Google Drive; it contains the images and the camera_param.ini file. Here is a screenshot of the window. I have also added a test dataset for the unit test to run on.

bergercookie commented 7 years ago

Hi @shivangag

Obviously this is not the way we had planned for this project to end. That aside, I hope you enjoyed the collaboration and learned something about C++ as well as about vision and SLAM.

I also hope that our feedback as mentors, as well as the GSoC experience overall, motivates you to further learn and improve as you go along.

P.S. We are going to close this issue; however, if you would like to further discuss anything related to this project, do let me know, either in this issue or directly via email.

With regards, Nikos