raulmur / ORB_SLAM2

Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

Emscripten Port #264

Open Introvertuous opened 7 years ago

Introvertuous commented 7 years ago

I was wondering if there is any interest in a javascript port of this library through emscripten, or if that is something you guys have possibly looked into.

AlejandroSilvestri commented 7 years ago

Well, what's the point of that?

It's easier to write code like this in MATLAB, but it must be in C++ because C and C++ compile to very efficient code. Javascript is the opposite: it is very slow. ORB-SLAM2 in javascript could process 30 frames per year :)

Introvertuous commented 7 years ago

I suppose javascript does not necessarily have to be the compile target: http://webassembly.org/

AlejandroSilvestri commented 7 years ago

@Introvertuous , thank you, now I see what you meant.

It would be great to create a repo devoted to this goal.

Right now I would first like to know whether that compiler can compile C++ with the STL and opencv.

I believe emscripten can't use dependencies; you have to add all dependencies to one big project (including opencv, for example). I don't know if that is even possible.

AlejandroSilvestri commented 6 years ago

@Introvertuous, opencv now has opencv.js


One step closer to port ORB-SLAM2 to webassembly :)

lukedukeus commented 3 years ago

@Introvertuous Any update / progress on this?

AlejandroSilvestri commented 3 years ago

@lukedukeus ,

As far as I know, no one is working on this.

nickw1 commented 2 years ago

Experimenting with getting ORB_SLAM3 working on emscripten.

See https://github.com/nickw1/ORB_SLAM3, this compiles on emscripten (the library, but not the examples which depend on opencv features not available in opencv.js)

Have started experimenting with creating an application using it, see https://github.com/nickw1/orb-slam-expts/

... it compiles, but loading the vocabulary is very slow, as in, more than 10 minutes (at which point I gave up). Maybe the file I/O in emscripten is too slow?

Would be interested if anyone has any hints for getting this working.

AlejandroSilvestri commented 2 years ago

@nickw1, that's great!

There are many binary implementations of the vocabulary: the binary voc file is smaller and loads 100x faster.

I don't know of any standalone project for a binary voc, but many orb-slam2 forks have implemented their own, very similar, binary voc.

My own is here, look for loadFromBinaryFile method to see how it is implemented on orb-slam2, and the binary voc file:

nickw1 commented 2 years ago

@AlejandroSilvestri thanks! As it happens that is precisely what I have been working on this morning, having discovered this late yesterday :-)

I found this one: https://github.com/surfii3z/ORB_SLAM3. Looks like this was originally done for ORB_SLAM2 and was ported to ORB_SLAM3.

Thanks for letting me know of your own implementation.

nickw1 commented 2 years ago

@AlejandroSilvestri to follow up... with the binary file the vocabulary now loads successfully in 2 seconds on emscripten. Haven't tested whether tracking works yet. Thanks again!

AlejandroSilvestri commented 2 years ago


I am very interested in following up on your project. Please continue posting here or tell me where.

nickw1 commented 2 years ago

@AlejandroSilvestri thanks. The github repos are: https://github.com/nickw1/orb-slam-expts/ (experiment to get ORB-SLAM working with emscripten)


https://github.com/nickw1/ORB_SLAM3 (my fork, the binvoc branch is the one to look at). If I have to update ORB_SLAM itself to get it to work, I will make commits here.

For both repos, I will keep a log of changes made, and experiments done, in the README.

nickw1 commented 2 years ago

Sorry, typo in first repo, note it's https://github.com/nickw1/orb-slam-expts

nickw1 commented 2 years ago

Making some progress on this. However, as soon as reconstruction with two views occurs, I get an unaligned memory access error. Emscripten can't handle unaligned memory access: is this something that ORB-SLAM3 is doing? @AlejandroSilvestri any ideas? Thanks.

Unfortunately the debugging from Emscripten can't pinpoint the actual line, so this will require more digging... but just wondered whether this was a known issue in ORB-SLAM3.

nickw1 commented 2 years ago

Output from log:

*** CreateInitialMapMonocular() orb_wasm.js:2695:8
First KF:0; Map init KF:0 orb_wasm.js:2695:8
New Map created with 55 points orb_wasm.js:2695:8
Point distribution in KeyFrame: left-> 55 --- right-> 0 orb_wasm.js:2695:8
RuntimeError: unaligned memory access

Squareys commented 2 years ago

@nickw1 WebAssembly does not allow misaligned memory read/writes -- Well, thanks, duh.

Here's the problem: if you have a type bigger than one byte (char is one byte), such as 2 bytes (short, unsigned short) or 4 bytes (int, float), you need to align it to memory addresses that are a multiple of that type's size. That means a short may only be read from or written to even addresses, and an int only at multiples of 4.

Most memory allocated with new or malloc will be 32-bit aligned anyway, but the following cases will lead to issues:

// The offsets below assume a packed layout with no padding between members.
struct MisalignedA {
    char something;
    short notAlignedProperly; // offset 1, which is an odd address
    char somethingElse;
    int alignedProperly; // its address will be the address of the struct instance + 4 bytes, which will again be a multiple of 4 bytes
};

struct MisalignedB {
    short alignedProperly;
    char something;
};

MisalignedB array[2];
array[1].alignedProperly = 0; // Error! This is misaligned in element 1, since the address will be array + 3 bytes, which is an odd address!

You can make use of alignas for the second case; for the first, it's best practice to sort the attributes by size, largest first.
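
As a side note on the two suggestions above (sorting members by size and `alignas`), here is a minimal sketch that checks the resulting layouts at compile time. The struct names and helper functions are hypothetical. It relies on the fact that, for ordinary non-packed structs, a C++ compiler inserts padding so each member sits at a multiple of its own alignment; misaligned members therefore only arise with packed layouts or with pointers cast into raw byte buffers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical structs for illustration only.
struct Unsorted {
    char a;
    short b;  // padding is inserted before this member where needed
    char c;
    int d;    // and before this one
};

struct Sorted {  // same members, largest first
    int d;
    short b;
    char a;
    char c;
};

struct alignas(4) Padded {  // alignas rounds the struct size up to a multiple of 4
    short s;
    char c;
};

// True if `offset` is a multiple of `alignment`.
constexpr bool isMultipleOf(std::size_t offset, std::size_t alignment) {
    return offset % alignment == 0;
}

// True if pointer `p` is suitably aligned for type T.
template <typename T>
bool isAlignedFor(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

static_assert(isMultipleOf(offsetof(Unsorted, b), alignof(short)), "b is aligned");
static_assert(isMultipleOf(offsetof(Unsorted, d), alignof(int)), "d is aligned");
static_assert(sizeof(Sorted) <= sizeof(Unsorted), "sorting does not add padding");
static_assert(sizeof(Padded) % 4 == 0, "array elements stay 4-byte aligned");
```

Sorting members mainly saves space; the alignment of each member is already guaranteed by the compiler in both layouts.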

Good luck with the rest of your porting efforts, it would be absolutely amazing to have SLAM accessible in the browser!

nickw1 commented 2 years ago

@Squareys thanks for your explanation... as it happened I had read up on this some months ago so was aware of the nature of misalignment, just wondering if any of the developers could give some hints on where it would be happening.

Moving on to a more general question, not sure if anyone can give any input on this. I am now able to test this on a real mobile device; before some recent updates, something was causing Chrome to crash (for all Emscripten/WASM code) when remote debugging was attempted. Now this has been fixed.

My main problem though is trying to initialise the tracking, it seems to be proving very difficult to find two key frames with enough corresponding points. I have tried translating and rotating the mobile device in a room with plenty of recognisable 'features', and good lighting but little success. I've used the camera parameters from this ORB-SLAM2 on Android project. Any tips from anyone?

Also (regarding the feasibility of mobile tracking) does anyone have any comment on ORB-SLAM3 vs ORB-SLAM2 for this? Wondering whether to try ORB-SLAM2, particularly given that @Martin20150405 has got a native Android app working with ORB-SLAM2.

Sorry if that's a lot of questions but it would be good to hear any tips from people who might have tried running ORB-SLAM on mobile devices with the live camera feed already. Thanks!

AlejandroSilvestri commented 2 years ago

Hi @nickw1

Glad to hear you are making progress!

Camera calibration

You can't initialize the map because you need your own camera parameters, both the intrinsic matrix and the distortion coefficients. The OpenCV tutorials have several camera calibration programs in C++ and Python; sadly they don't have one in JS.


One of my students did an online interactive camera calibration with opencv.js, sadly it's not online, you must install it on your own server to use it:


Aligned memory

C++ aligns memory by default. It would be great to pinpoint the exact line of code that is causing that misalignment error when tracking starts.

It would be even better if you posted the steps to get your orb-slam3 compiled to wasm with emscripten. For example, it would help to check whether your compiler is using at least C++11 and automatically aligning objects. (I believe it is.)
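
One way default alignment can still be bypassed is by casting a pointer into a byte buffer (for example, data read from a binary file) to a wider type and dereferencing it. Whether ORB-SLAM3 does this anywhere is not established in this thread; the sketch below only shows the usual safe alternative. `readUnaligned` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads a T from a byte buffer at any offset.
// std::memcpy has no alignment requirement on its arguments, unlike
// dereferencing reinterpret_cast<const T*>(p), which requires p to be
// suitably aligned for T.
template <typename T>
T readUnaligned(const unsigned char* p) {
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}
```

For example, `readUnaligned<std::uint32_t>(buffer + 1)` is well defined even though `buffer + 1` is generally not 4-byte aligned.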

nickw1 commented 2 years ago

Hi @AlejandroSilvestri thanks for your hints - the links you posted with calibration will help a great deal. Quite happy with converting C++ or Python code to JS but your student's work looks particularly interesting - will take a look.

Will try and put some more comprehensive instructions on how I'm compiling orb-slam3 but basically it's with an emscripten from last year so it should be using recent C++ standards.

nickw1 commented 2 years ago

To follow up on this, my modified orb-slam3, which builds with emscripten, is available at https://github.com/nickw1/ORB_SLAM3/tree/binvoc. Specifically it's the binvoc branch.

nickw1 commented 2 years ago

Bit of an update on this.

Unfortunately I have hit a bit of a brick wall with tracking, have tried both the calibrator suggested above by @AlejandroSilvestri and also the official Android OpenCV calibrator sample to try and get my device's parameters, but still no joy initialising tracking. It would be great if anyone had any hints on this.

In the meantime though I have tried to get ORB_SLAM3 working on Android, see https://github.com/UZ-SLAMLab/ORB_SLAM3/issues/77.

carlodek commented 2 years ago

Hi @nickw1, first of all, thanks a lot for your work. My personal challenge is to port the ORB_SLAM3 library to wasm so that it works on all devices just by opening a URL. My environment is an M1 Mac, so some of the tricks below may not be useful for most people. This is my current situation:

  1. The library is built.
  2. To build the wasm I first have to build with emmake; then I get an error because calibration.cpp.o (from opencv) can't be compiled with pthreads.
  3. Anyway, by running a really complex em++ command and passing it the orb_wasm.cpp.o that was generated anyway, I got my wasm built.

I've achieved the build, and I reach trackingState == 2 reasonably quickly on PC and on Android too. The problem is that, compared with the native C++ library, when the library loses tracking (so trackingState == 1) it is not fast at re-finding points from the atlas. The server is built with Kotlin and JetBrains' Ktor server (using a fake HTTPS to be able to use SharedArrayBuffer). Now, do you have any suggestions to speed up re-recognition?

nickw1 commented 2 years ago

@carlodek sounds good! Unfortunately I don't have any suggestions to speed-up re-recognition: I am not one of the authors of ORB-SLAM, it would be better to ask one of them (e.g. @AlejandroSilvestri).

I am just attempting to get it working in WASM/Emscripten and native Android environments.

My current problem is still to achieve tracking on a mobile device, I have tried different calibrators to get camera parameters but I still have no luck.

Which calibrator did you use to get the parameters? Did you have any problems here?

(I also have the memory alignment issue to solve - see above - did you have this problem? But that will probably be easier to solve once I get tracking working).

carlodek commented 2 years ago

Hi @nickw1, I think native Android development, i.e. with System.loadLibrary(), could be easier. Maybe the best way to do it (I'm saying this because I'm an Android developer) is to grab your image using CameraX or the deprecated android.hardware.Camera and pass it to the library (theoretically :D).

@AlejandroSilvestri Hi, Do you have any suggestion for my problem as Nick suggested me?

nickw1 commented 2 years ago

Hi @carlodek thanks. Thanks for your build options, will try out those as mine were slightly different. What about your camera parameters (camera intrinsics, as defined in the .yaml files)?

I am also experimenting with ORB-SLAM3 on Android, see https://github.com/nickw1/orb-slam3-android-expts. My approach has been slightly different: I have used the OpenCV Android SDK to create a Kotlin front end, loaded each camera frame into a Mat, and sent the address of that to the native back-end.

I have got it building and running but getting a strange crash on a method call, perhaps due to memory corruption of some kind causing the stack to be corrupted? (I am not sure, I have done extensive Android development but I am new to the NDK). Do you have any ideas on that? (This isn't really the place to discuss this though, it's best to move to https://github.com/UZ-SLAMLab/ORB_SLAM3/issues/77).

carlodek commented 2 years ago

Hi @nickw1, I really don't remember whether I've changed the yaml file. Anyway, here it is:

    %YAML:1.0

    # Camera Parameters. Adjust them!

    Camera.type: "PinHole"

    # Parameters from ptam, possibly from Thorsten
    # mgvvCameraParams[0] = 1.59328; - for focal length
    # mgvvCameraParams[1] = 2.11149;
    # mgvvCameraParams[2] = 0.512158; - for center
    # mgvvCameraParams[3] = 0.436717;
    # mgvvCameraParams[4] = 0.961982;

    # Calculations from ptam
    # // First: Focal length and image center in pixel coordinates
    # mvFocal[0] = mvImageSize[0] * mgvvCameraParams[0];
    # mvFocal[1] = mvImageSize[1] * mgvvCameraParams[1];
    # mvCenter[0] = mvImageSize[0] * mgvvCameraParams[2] - 0.5;
    # mvCenter[1] = mvImageSize[1] * mgvvCameraParams[3] - 0.5;

    # Camera calibration and distortion parameters (OpenCV)
    # Used calculations above with ptam parameters to work them out
    Camera.fx: 458.654
    Camera.fx: 1029.69920
    Camera.fx: 1034.0
    Camera.fy: 457.296
    Camera.fy: 1013.51520
    Camera.fy: 1034.0
    Camera.cx: 367.215
    Camera.cx: 327.281120
    Camera.cx : 640.0
    Camera.cy: 248.375
    Camera.cy: 209.124160
    Camera.cy : 384.0

    # Distortion parameters. Not sure about these, are not specified in ptam in the same way, leave for now
    Camera.k1: -0.28340811
    Camera.k2: 0.07395907
    Camera.p1: 0.00019359
    Camera.p2: 1.76187114e-05
    Camera.k1: 0.3164
    Camera.k2: -1.928
    Camera.p1: 0.0
    Camera.p2: 0.0
    Camera.k3: 3.909

    Camera.width: 640
    Camera.height: 480
    Camera.width: 640
    Camera.height: 480

    # Camera frames per second
    Camera.fps: 20.0
    Camera.fps: 10.0

    # Color order of the images (0: BGR, 1: RGB. It is ignored if images are grayscale)
    Camera.RGB: 1

    # ORB Parameters

    # ORB Extractor: Number of features per image
    ORBextractor.nFeatures: 1000

    # ORB Extractor: Scale factor between levels in the scale pyramid
    ORBextractor.scaleFactor: 1.2

    # ORB Extractor: Number of levels in the scale pyramid
    ORBextractor.nLevels: 8

    # ORB Extractor: Fast threshold
    # Image is divided in a grid. At each cell FAST are extracted imposing a minimum response.
    # Firstly we impose iniThFAST. If no corners are detected we impose a lower value minThFAST
    # You can lower these values if your images have low contrast
    ORBextractor.iniThFAST: 20
    ORBextractor.minThFAST: 7

    # Viewer Parameters

    Viewer.KeyFrameSize: 0.05
    Viewer.KeyFrameLineWidth: 1
    Viewer.GraphLineWidth: 0.9
    Viewer.PointSize: 2
    Viewer.CameraSize: 0.08
    Viewer.CameraLineWidth: 3
    Viewer.ViewpointX: 0
    Viewer.ViewpointY: -0.7
    Viewer.ViewpointZ: -1.8
    Viewer.ViewpointF: 500

About the Android porting problem: which method causes the crash? And secondly, do you get the same crash on a real Android device?

nickw1 commented 2 years ago

Hi @carlodek ok thanks, yes these look like mine unchanged.

The crash is on a real device. The stack trace is here: https://github.com/nickw1/orb-slam3-android-expts/blob/master/crashes.txt

It occurs when the MonocularInitialization() method is called from Tracking::Track().

However I cannot step into the MonocularInitialization() method, the crash seems to occur literally at the point the method is called. This is run from a secondary thread, not sure if that gives any clues?

carlodek commented 2 years ago

Hi @nickw1, what about running MonocularInitialization() on the UI thread? I don't know for sure, but it looks like an async call has not finished, or something similar. I don't know how Android reacts to C++ thread.join(), but it should work. Maybe you can add it there if you don't have it. Anyway, it looks like the C interpreter differs quite a lot between devices and Android versions. Maybe moving more code to native will solve the problem. As I said before, I think the best way to achieve the goal is to use native Android to capture the image and video, and pass them to ORB-SLAM "just" to process the image.

nickw1 commented 2 years ago

@carlodek ok thanks for that, you could be right about doing the whole lot in native Android.

AlejandroSilvestri commented 2 years ago

> I am not one of the authors of ORB-SLAM, it would be better to ask one of them (e.g. @AlejandroSilvestri).

No, I'm not! Glad to answer if I can, but I didn't work with Raul Mur, the main author of ORB-SLAM and ORB-SLAM2.

AlejandroSilvestri commented 2 years ago


All three ORB-SLAM versions aim for speed. It's impossible to speed them up by tweaking. Many visual SLAM alternatives have been based on ORB-SLAM, with variations, and none of them really improved speed.

AlejandroSilvestri commented 2 years ago


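Did you succeed at camera calibration? Here are some tips for a cellphone camera:

* calibrate only the k1 and k2 distortion parameters: keep p1, p2, k3 and onward at zero.

* check calibration results: the central point should be near the center of the image (resolution / 2)

* fx should be similar to fy, and roughly 0.7 of the x resolution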

AlejandroSilvestri commented 2 years ago

@nickw1 and @carlodek

For performance, keep image resolution at HD; don't use 50-megapixel images. I believe you already know this.

Native can be a lot better than wasm, and easier to migrate.

nickw1, how much lag do you get from sending images to the server?

Some students and I are extracting keypoints and descriptors on the cellphone and sending them to the server to reduce lag and bandwidth, and it works very well. But we aren't working with orb-slam3 on that one.
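
To give a rough sense of why sending features instead of frames can save bandwidth, here is a back-of-the-envelope sketch. The per-feature payload (a 32-byte ORB descriptor plus 8 bytes for an x, y position) is an assumption for illustration, not a description of the setup mentioned above, and the function names are hypothetical.

```cpp
#include <cstddef>

// Assumed per-feature payload: one 256-bit ORB descriptor (32 bytes)
// plus 8 bytes for the keypoint position (two 4-byte coordinates).
constexpr std::size_t kDescriptorBytes = 32;
constexpr std::size_t kKeypointBytes = 8;

// Bytes needed to send numFeatures keypoints with their descriptors.
constexpr std::size_t featurePayloadBytes(std::size_t numFeatures) {
    return numFeatures * (kDescriptorBytes + kKeypointBytes);
}

// Bytes in one uncompressed 8-bit grayscale frame.
constexpr std::size_t rawGrayFrameBytes(std::size_t width, std::size_t height) {
    return width * height;
}
```

With 1000 features per frame (the ORBextractor.nFeatures value in the configuration posted earlier), the payload is 40,000 bytes, against 307,200 bytes for a raw 640x480 grayscale frame. A compressed frame would be smaller than the raw one, so the real saving depends on the codec used.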

carlodek commented 2 years ago

Hi @AlejandroSilvestri ,

thanks a lot for the tips about the camera. wasm really is faster on my Mac. Now I will try the tips for Android.

nickw1 commented 2 years ago

> > I am not one of the authors of ORB-SLAM, it would be better to ask one of them (e.g. @AlejandroSilvestri).
>
> No, I'm not! Glad to answer if I can, but I didn't work with Raul Mur, the main author of ORB-SLAM and ORB-SLAM2.

OK sorry! I assumed you were, many apologies.

nickw1 commented 2 years ago


> Did you succeed at camera calibration? Here are some tips for a cellphone camera:
>
> * calibrate only the k1 and k2 distortion parameters: keep p1, p2, k3 and onward at zero.
>
> * check calibration results: the central point should be near the center of the image (resolution / 2)
>
> * fx should be similar to fy, and roughly 0.7 of the x resolution

@AlejandroSilvestri not yet, unfortunately. I tried both the calibrator you linked to above, and the "official" OpenCV Android calibrator sample using print-outs of the "official" OpenCV calibrator patterns. Thank you for the tips though! :-)
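
The calibration tips quoted above can be turned into a quick sanity check on a calibration result. This is only a sketch: the `Calibration` struct, the `looksPlausible` function and all tolerances are assumptions chosen for illustration, not values given in this thread.

```cpp
#include <cmath>

// Hypothetical struct; fields follow the Camera.* names used in the yaml files.
struct Calibration {
    double fx, fy, cx, cy;
    double p1, p2, k3;
};

// Returns true if the calibration satisfies the three tips, using assumed tolerances.
bool looksPlausible(const Calibration& c, double width, double height) {
    // Central point near the image center (assumed tolerance: 10% of each dimension).
    const bool centered = std::fabs(c.cx - width / 2.0) < 0.1 * width &&
                          std::fabs(c.cy - height / 2.0) < 0.1 * height;
    // fx similar to fy (assumed tolerance: 5% of fx).
    const bool similarFocal = std::fabs(c.fx - c.fy) < 0.05 * c.fx;
    // fx roughly 0.7 of the x resolution (assumed band: 0.5 to 1.0).
    const bool focalInRange = c.fx > 0.5 * width && c.fx < 1.0 * width;
    // p1, p2 and k3 were kept at zero during calibration.
    const bool simpleDistortion = c.p1 == 0.0 && c.p2 == 0.0 && c.k3 == 0.0;
    return centered && similarFocal && focalInRange && simpleDistortion;
}
```

For instance, a central point of (640, 384) on a 640x480 image fails the first check, which would be a hint that the parameters belong to a different resolution or camera.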

nickw1 commented 2 years ago

> @nickw1 and @carlodek
>
> For performance, keep image resolution at HD; don't use 50-megapixel images. I believe you already know this.
>
> Native can be a lot better than wasm, and easier to migrate.
>
> nickw1, how much lag do you get from sending images to the server?
>
> Some students and I are extracting keypoints and descriptors on the cellphone and sending them to the server to reduce lag and bandwidth, and it works very well. But we aren't working with orb-slam3 on that one.

Thanks for the tips. As I mentioned above, I am also experimenting with native. I don't get too much lag, not enough to worry about. Performance seems quite good, it's the tracking that's the issue.

I believe that there were some possible memory alignment issues in Eigen, could this be a potential problem on a different platform?

AlejandroSilvestri commented 2 years ago

> OK sorry! I assumed you were, many apologies.

Don't worry, on the contrary, I feel honored!

carlodek commented 2 years ago

Hi @nickw1, just an update about the emscripten command. I've decreased MAXIMUM_MEMORY to 1GB because 4GB was causing a crash on iPhone. Stupid iPhones :D. But it was a useful hint, because this wasm does not need 4GB to work.

nickw1 commented 2 years ago

@carlodek thanks!

nickw1 commented 2 years ago

I have now got some more detailed debug information on the memory alignment issue, this is on a desktop (still struggling to get any initialisation occurring on a mobile device).

It's a threading/synchronisation issue by the looks of things. The line throwing the alignment error appears to be a line in SetCurrentCameraPose() which creates a lock on the call to Tcw.clone() (but the error is thrown by the lock creation, not by the call to Tcw.clone()).

void MapDrawer::SetCurrentCameraPose(const cv::Mat &Tcw)
{
    unique_lock<mutex> lock(mMutexCamera); // THIS LINE
    mCameraPose = Tcw.clone();
}

Stack trace is:

RuntimeError: operation does not support unaligned accesses
    at a_cas.5 (atomic_arch.h:54:2)
    at __pthread_mutex_lock (pthread_mutex_lock.c:6:6)
    at std::__2::__libcpp_mutex_lock(pthread_mutex_t*) (__threading_support:410:10)
    at std::__2::mutex::lock() (mutex.cpp:33:14)
    at ORB_SLAM3::MapDrawer::SetCurrentCameraPose(cv::Mat const&) (MapDrawer.cc:406:29)

I'm a bit puzzled as to why creating a lock would cause an unaligned access issue. Anyone got any ideas? Thanks.

@carlodek is your code available (as you are not getting this error, just wondering if I could try yours and see if I get the same result). If it's closed-source / otherwise not available though, no worries!
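
For context, the pattern in SetCurrentCameraPose() is a mutex-guarded setter: the lock is taken first, and the pose is copied while the lock is held. Below is a self-contained sketch of the same pattern. `PoseHolder` is a hypothetical stand-in for MapDrawer, and the pose is stored as 16 doubles instead of a cv::Mat so that the sketch needs no OpenCV. It illustrates the locking pattern only and says nothing about the cause of the error above.

```cpp
#include <array>
#include <mutex>

// Hypothetical stand-in for MapDrawer; a 4x4 pose is stored as 16 doubles.
class PoseHolder {
public:
    using Pose = std::array<double, 16>;

    void setPose(const Pose& Tcw) {
        // The lock is released automatically when `lock` goes out of scope.
        std::unique_lock<std::mutex> lock(mMutexCamera);
        mCameraPose = Tcw;  // copy made while holding the lock
    }

    Pose getPose() const {
        std::unique_lock<std::mutex> lock(mMutexCamera);
        return mCameraPose;  // returned by value, so callers never see a partial update
    }

private:
    mutable std::mutex mMutexCamera;  // mutable so const getters can lock it
    Pose mCameraPose{};
};
```

A tracking thread would call `setPose` while a viewer thread calls `getPose`; the mutex ensures the two never touch the pose at the same time.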

carlodek commented 2 years ago

Hi @nickw1, my code is not available right now, but I would like to help you. Remember that I'm working on an M1 Mac and you, I suppose, are working on Linux; I don't know how many differences there could be.

This is the comment on the tracking initialization:

//Initialize the Tracking thread //(it will live in the main thread of execution, the one that called this constructor)

Maybe you forgot this parameter: -s PROXY_TO_PTHREAD? From what I read on the Emscripten website, it keeps your main alive, which is where you are instantiating the System class.

nickw1 commented 2 years ago

@carlodek OK thanks, will give that a try.

nickw1 commented 2 years ago

@carlodek have just tried that flag. Seems to have fixed it! Many thanks! :-)

Puzzled as to why I got an "unaligned access" error, though; from my understanding this flag just makes the browser more responsive by running main() on a separate thread.

carlodek commented 2 years ago

Hi @nickw1, sounds great!! Yes, my understanding of why is the same.

nickw1 commented 1 year ago

How many people are aware of this?


Looks like it is a working project - it has included ORB-SLAM plus other projects to produce working web AR.

AlejandroSilvestri commented 1 year ago


Thank you for the link. AlvaAR uses OV2SLAM, I'll read it.