dkogan / mrcal

Next-generation camera-modeling toolkit
http://mrcal.secretsauce.net
Apache License 2.0
190 stars 15 forks source link

Building for Windows with MSVC #12

Closed mcm001 closed 3 months ago

mcm001 commented 10 months ago

I realize this is a pretty huge ask but putting it out there anyways. Since a lot of our users run Windows it would be awesome to be able to support mrcal and all the related visualization tools on that platform too. We compile all our other libraries with MSVC right now, so I'm not sure if mingw-gcc would play particularly nicely. I started hacking away at things just to see how hard it would be to make work (link below + some other changes needed to libdogleg), but I'm curious to hear your thoughts.

https://github.com/dkogan/mrcal/compare/master...mcm001:mrcal:master

dkogan commented 10 months ago

Hi. I'll help with whatever porting issues you encounter, but you're going to have to do all the actual work on this one. What if you target wsl? Wouldn't it make most of the problems disappear?

For the visualization, gnuplot is the backend, and it DOES support windows. The wrapper libraries had multiple bug reports over the years saying that they are broken on windows, but it sounds like all that needs to happen is to find the win32 flavor of select(). I'm sure that's not difficult

dkogan commented 10 months ago

I just looked at your work tree. It looks like you're effectively trying to re-architect big chunks of this: moving the build system to cmake, moving the main sources from C to C++ and so on. I wouldn't take that route because

But this all doesn't seem necessary, though. You can get most of the dependencies (make, gcc, suitesparse) working natively on windows. Even if you want to stick with the msvc compiler for whatever reason, surely it can handle C99 code? This feels like not the most expedient way to get where you want to go.

mcm001 commented 10 months ago

A lot of mrcal and libdogleg use GCC-specific compiler extensions that are unsupported by MSVC. Most problematically, variable length arrays and nested functions. It seems like Clang might be able to handle this, and generate MSVC-abi-compatible DLLS and such as well?

And yes WSL is a good option. It works great for me for developing code. But for distributing tools to users, it's a lot lower friction for us to be able to distribute an exe instead of asking people to install WSL and configure USB camera passthrough (which doesn't even always work? My luck with V4L and WSL hasn't been great) and such.

So in conclusion:

mcm001 commented 10 months ago

Regarding build systems. I haven't tried but I suspect Mrbuild wouldn't Just Work in windows without some effort?

dkogan commented 10 months ago

OK. You know far better than me about the details here. But I do think it's worth it to spend the time to figure out the least-diverging way to do this. If you have to make lots of changes to get this working, then it will be effortful for you to update your code to use newer mrcal releases, when they happen.

I'm all for merging stuff to make this more portable, but moving to c++ in the name of portability is just perverse: if msvc truly cannot compile C code, then I'm fine not supporting it. But I find that really hard to believe. There isn't some drop-down list in some project-preferences dialog somewhere to select the language? Where you can select C99? I think I use some nonstandard extensions in a few places, but I think everything builds with gcc and clang (haven't tried in a while with clang, admittedly). I'd be open to removing those nonstandard usages potentially, where "standard" is defined as C99.

Actually, I just tried it again. Apparently clang doesn't have the nested-function extension. I can remove those usages.

A lot of mrcal and libdogleg use GCC-specific compiler extensions that are unsupported by MSVC. Most problematically, variable length arrays and nested functions.

VLAs are standard C99 I believe. Nested functions are not. I'm open to removing the nested functions, but let's agree on the rest first.

Actually, some quick searching tells me that MSVC truly isn't a C compiler, and that they truly don't support C99. If that's the case, then I strongly suggest that you use gcc or clang.

It seems like Clang might be able to handle this, and generate MSVC-abi-compatible DLLS and such as well?

I don't know nearly enough here. When you say you wan to use MSVC, are you specifically talking about the compiler? The IDE? Both? If all you care about is that you produce a native binary in the end, I'm certain you can use gcc and clang (both are available for windows). And I'm sure both support the windows-native ABIs. Maybe you can even plug them into the MSVC IDE, if you really want that?

  • We use other libraries compiled using MSVC, so need to produce MSVC-compatible DLLs for libdogleg and mrcal

OK. What about their dependencies? Everything is available somewhere? Where are you getting the suitesparse binaries?

  • Clang claims to support those?

Are you certain that you need some special ABI that's not supported by gcc? What ABI does the gcc-for-windows use then?

Regarding build systems. I haven't tried but I suspect Mrbuild wouldn't Just Work in windows without some effort?

mrbuild is just a bit of GNU make. You can get builds of GNU make for windows. Can you try it? I can imagine some things breaking, but 99% of it should be fine.

So in summary: if I were you I'd start out by trying to get it building native binaries with GNU make and mrbuild and gcc, with the code as it is. If that cannot work, I'd want to get 100% clarity about why not. Maybe you're there already.

If the problem is that gcc doesn't support something you need (like some special ABI), but clang does, then tell me. I can patch stuff to make it build with clang, and you (and everyone else!) will use that tree from that point on.

If you need to convert stuff to C++ or cmake, then I'm just not likely to accept those patches, and then you'll have a fork. I'd like to avoid that, if possible.

Let me know how I can help

mcm001 commented 10 months ago

Sorry, I wasn't 100% precise. VLAs are part of the C99 standard, but MSVC isn't fully standards compliant. I briefly tried installing Clang through Visual Studio, and compiling a test program using a VLA, but looks like that doesn't work either? Which would limit us to GCC through mingw or similar.

PS D:\Documents\GitHub\libdogleg> & "C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.33.31629/bin/Hostx64/x64/cl.exe" .\test.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.33.31630 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

test.c
.\test.c(2): error C2057: expected constant expression
.\test.c(2): error C2466: cannot allocate an array of constant size 0
.\test.c(2): error C2133: 'arr': unknown size

I'm using right now these random pre-compiled binaries, but it seems possible to also compile using VCPKG too.

I don't have any docs to back me up, but I've read online and have anecdotes from friends about ABI incompatabilities between binaries compiled using GCC and MSVC (specifically they were trying to work with ipopt/casadi, which need fortran compilers? Not 100% on the details there). And yeah, by MSVC I specifically mean the compiler.

I can give mrbuild a try, but my main goal here was to figure out how painful the build system setup will be on Windows, and CMake was the lowest friction path to getting that to happen for me.

dkogan commented 10 months ago

OK. If MSVC is primarily a C++ compiler, and doesn't do C99 (which is the sense I'm getting), then I would use that here only as a last resort. I would only undertake MSVC support with the upstream maintainer's cooperation, and this upstream maintainer is cantankerous :)

Today it looks like I don't build with clang on any platform. But that's something I'm happy to fix. I'm not familiar with mingw. Is clang more "native" on windows than gcc in some significant way? If today you can get away with gcc (thus not requiring me to fix clang support right now) that would be great. But if not, tell me, and I'll fix it.

In the example above you're running cl.exe, which isn't actually clang. clang should support VLAs, since those are C99.

As for the build, since you're not building the Python stuff, the build system will be trivial, regardless of the tooling. You shouldn't see problems with mrbuild, but whatever you do is fine. Just don't ask me to merge your cmake stuff :)

mcm001 commented 10 months ago

I'm not sure I would agree that MSVC is primarily a C++ compiler? Anyways, I spent some time over the weekend playing with vcpkg and clang. Vcpkg at least gets me a fairly clean way to install suitesparse and BLAS/LAPACK.

I confirmed Clang supports VLAs and empty structs, but not inner functions. I'll continue trying to figure out my local build setup -- you're right, at the end of the day there isn't a whole lot going on in the build system, it's mostly just fighting with Windows.

If you're open to additionally patching mrcal and libdogleg to be buildable with MSVC by moving away from the two missing features from above, that would be incredible and ideal from my side. I need to confirm that VLAs and empty structs are the only two blockers that would prevent us from being able to build with MSVC, and we would be able to avoid moving to C++.

FYA, I did get decently far with my fork. I was able to replace nearly all VLAs with C++ vectors, but haven't yet nailed down a weird bug that seems to be corrupting memory inside of the callback that projects points. Not expecting any support there from you obviously, just wanted to note I was trying to follow the pointer chain and figure out where my patch went wrong.

dkogan commented 10 months ago

I'm not sure I would agree that MSVC is primarily a C++ compiler?

That's just the sense I get from searching the web. Might be wrong. But if it can do C, but not C99, then it's a half-assed C compiler at best.

I confirmed Clang supports VLAs and empty structs, but not inner functions.

Yep. I'm fine making this work with clang (I thought I already did this, to be honest). So let's say I'll do that in the near future.

If you're open to additionally patching mrcal and libdogleg to be buildable with MSVC by moving away from the two missing features from above, that would be incredible and ideal from my side.

Nested functions are mostly a readability thing, although they retain the context of the parent, making it really convenient to never have to pass "void* cookie" to any functions. I don't recall how much of that I'm actually doing in in this project. Possibly not at all.

Variable-length arrays, however, allow you to quickly and easily get small buffers of memory without any dynamic allocation. Since getting rid of this would require medium amounts of typing, and would produce objectively worse code AND since it's standard C99, I don't want to get rid of them. Once again, I don't recall how pervasive these are in this project, but I'll look. I still think that you should use a different compiler, though.

I need to confirm that VLAs and empty structs are the only two blockers that would prevent us from being able to build with MSVC, and we would be able to avoid moving to C++.

I'm not sure what you mean with "empty structs". Can you point to a chunk of code I'm using that has this problem.

Also, how will you get rid of VLAs if I don't do it? You'll malloc()/free() each time?

FYA, I did get decently far with my fork. I was able to replace nearly all VLAs with C++ vectors, but haven't yet nailed down a weird bug that seems to be corrupting memory inside of the callback that projects points. Not expecting any support there from you obviously, just wanted to note I was trying to follow the pointer chain and figure out where my patch went wrong.

Cool. Sounds like you're close. In linux land there's this, which would dramatically reduce your debugging time: https://www.rr-project.org There's probably a windows equivalent?

dkogan commented 10 months ago

Hello. I made a branch for the 2.4 release: https://github.com/dkogan/mrcal/tree/release-2.4

This will have some fixes, in particular, it will build with clang, and it will fix the memory leak you found. I just un-nested the functions in the C library, so libmrcal.so now builds with clang. The python-wrapping needs more work still, but is this enough to fix your problem? I think you were saying that building with clang would be enough for your use case, or did I misunderstand, and you can only use MSVC specifically? If so, then you still will need to deal with the other issues (empty structs and VLAs)

mcm001 commented 9 months ago

Awesome, thank you for doing that! And yes, no python is fine for us right now for Windows. I'll LYK if that changes.

I was also able to get mrcal to build using Clang-cl from Visual Studio 2022! Some notes:

#ifdef __GNUC__
#define MRCAL_DLLEXPORT __attribute__((dllexport))
#else
#define MRCAL_DLLEXPORT __declspec(dllexport)
#endif

MRCAL int some_mrcal_thing();

I used this bat script to build:

clang-cl /LD /MT -v -I D:\Documents\GitHub\vcpkg\installed\x64-windows\include\suitesparse -I ../libdogleg mrcal.c cahvore.cc poseutils-opencv.c poseutils-uses-autodiff.cc poseutils.c mrcal-opencv.c D:\Documents\GitHub\vcpkg\installed\x64-windows\lib\libcholmod.lib D:\Documents\GitHub\vcpkg\installed\x64-windows\lib\suitesparseconfig.lib D:\Documents\GitHub\vcpkg\installed\x64-windows\lib\lapack.lib ../libdogleg/dogleg.lib

clang-cl -o mrcal_test.exe test_mrcal.c mrcal.lib

SET PATH=%PATH%;D:\Documents\GitHub\vcpkg\installed\x64-windows\bin
start mrcal_test.exe
mcm001 commented 9 months ago

Here's the commit with all my changes: https://github.com/dkogan/mrcal/commit/98cd1f245f8ca9657d0c428d6957b3021dcd1a50

In terms of what I'd like upstreamed, i think it's just finding a fix for M_PI and adding DLLEXPORTs to mrcal.h?

dkogan commented 9 months ago

OK, so for M_PI, I think you just need one extra define:

https://godbolt.org/z/1jY8TP515

It hurts nothing, so I'm happy to add it. Let me add it (to libdogleg master branch and mrcal release-2.4 branch) right after this. For the library export business, you need that extra thing for mrcal only, or for libdogleg also?

dkogan commented 9 months ago

I didn't remember how I did this in the two projects (mrcal, libdogleg), but I just looked. In both I use the default visibility settings: the symbols are exported by default, unless explicitly marked otherwise. It looks like there's exactly one function across both projects that does something other than the default: project_cahvore_internals() in cahvore.cc. Maybe it can be modified to not do that (msvc probably won't like the __attribute__), but let's first get all the other functions working properly.

Clearly the default setting is inverted in msvc: they hide the symbols by default. You should poke around the project settings to flip the visibility default to what the GNU system does. I can't find any obvious notes about what to do, specifically, but you have the thing in front of you, so you can go poke around the dialogs. Look at the linker options; stuff about "visibility" or "exports" or something.

dkogan commented 9 months ago

There's a cmake property to do the magic thing: https://cmake.org/cmake/help/latest/prop_tgt/WINDOWS_EXPORT_ALL_SYMBOLS.html

You can either use cmake, or you can make a test cmake project with that option, see what it does, and apply it to your visual studio project. I really don't want to be manually annotating each one of my symbols.

mcm001 commented 9 months ago

Looks like it auto-generates a .def file, and uses that when building to export symbols? I agree that it's pretty annoying for you.

Since I'm happy to DIY my own build system, I think it's actually probably easier for me to just add mrcal and libdogleg as submodules and build that + my JNI at the same time so we never even have to worry about symbol exporting? Speaking of, mrcal is alive on Windows now! Haven't compared results to Linux, but it's passing my unit tests still so that's good image

dkogan commented 9 months ago

Nice! Maybe I'm being needlessly stubborn. Sometimes it IS useful to have an export whitelist instead of blacklist, and I don't have all that many exported symbols anyway. Let me think about it.

How did you end up dealing with the other issues? Are you using clang now, or msvc + patches?

mcm001 commented 9 months ago

That screenshot is using clang-cl from Visual Studio. I pulled the latest from the mrcal's release-2.4 (244ef97ba) and libdogleg's master (1f88d01) to get the m_pi patches. I'm building using this little script, which is on the master branch of my mrcal-java repo

"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\Llvm\x64\bin\clang-cl.exe" /EHsc /LD /MD /std:c++20 -I ../mrcal -I D:\Documents\GitHub\vcpkg\installed\x64-windows\include\ -I D:\Documents\GitHub\vcpkg\installed\x64-windows\include\suitesparse -I build\_deps\opencv_header-src -I C:/Users/matth/.jdks/liberica-11.0.7/include -I ../libdogleg -I C:/Users/matth/.jdks/liberica-11.0.7/include/win32 mrcal_jni.cpp mrcal_wrapper.cpp  D:\Documents\GitHub\vcpkg\installed\x64-windows\lib\libcholmod.lib D:\Documents\GitHub\vcpkg\installed\x64-windows\lib\suitesparseconfig.lib D:\Documents\GitHub\vcpkg\installed\x64-windows\lib\lapack.lib ../libdogleg/dogleg.lib build\_deps\opencv_lib-src\windows\x86-64\shared\opencv_core460.lib build\_deps\opencv_lib-src\windows\x86-64\shared\opencv_calib3d460.lib ../mrcal/mrcal.c ../mrcal/cahvore.cc ../mrcal/poseutils-opencv.c ../mrcal/poseutils-uses-autodiff.cc ../mrcal/poseutils.c ../mrcal/mrcal-opencv.c 

@REM "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\Llvm\x64\bin\clang-cl.exe" /MT -o mrcal_test.exe ../mrcal/test_mrcal.c mrcal_jni.lib

I'm installing suitesparse using vcpkg and downloading opencv libraries from a wpilib build of opencv. It's all pretty cursed right now, tis but a proof of concept. I end up needing to put these DLLs onto my PATH as well for Java to be able to load the library:

matth@LAPTOP-F96B9E90 MINGW64 /d/Documents/GitHub/mrcal/allthedlls (release-2.4-windows)
$ ls
libamd.dll*      libgfortran-5.dll*      opencv_core460.dll*
libcamd.dll*     liblapack.dll*          opencv_features2d460.dll*
libccolamd.dll*  libquadmath-0.dll*      opencv_flann460.dll*
libcholmod.dll*  openblas.dll*           opencv_imgproc460.dll*
libcolamd.dll*   opencv_calib3d460.dll*
mcm001 commented 9 months ago

I am also curious to see if I can avoid loading opencv. We load it separately already, so if there's an opencv_core460.dll loaded and we try to load mrcal_jni.dll, maybe it'll be happy to use the one already in memory? Would cut down the size of stuff I gotta ship a fair bit.

I'm also curious to see if we can statically link to lapack/blas and friends on Linux. It would reduce the number of libraries people need to install on their system.

And now that this at least sort of works, I'm curious to see some proper speed benchmarks for Linux vs windows since I think I'm using openblas above

dkogan commented 9 months ago

Good news! You don't actually need opencv. Look at the Makefiles more closely. None of them us it. You definitely do need blas and lapack and suitesparse (cholmod specifically, and whatever it needs). Static linking could work. You might lose some performance, compared to a hardware optimized BLAS, but it might be worth a try.

dkogan commented 9 months ago

Isn't there some form of package manager on Windows? Chocolatey? Could you get dependencies from that?

mcm001 commented 9 months ago

Good news! You don't actually need opencv.

I know mrcal doesn't, but my JNI code uses cv::solvePNP as the initial board orientation guess. If you have a function in mrcal to estimate that for me though it would be awesome to delete that dependency.

Chocoltey or something?

Yeah, that and Vcpkg both claim to be package managers for Windows. Best I can tell, Vcpkg has good cmake integration, but not much else? It's worth looking closer into chocolatey though I think.

dkogan commented 9 months ago

Matt @.***> writes:

Good news! You don't actually need opencv.

I know mrcal doesn't, but my JNI code uses cv::solvePNP as the initial board orientation guess. If you have a function in mrcal to estimate that for me though it would be awesome to delete that dependency.

Oh, then you need it. The mrcal python layer calls opencv in a few places, for stuff like this.

mcm001 commented 9 months ago

Quick update: looks like Chocolatey does NOT have cholmod or suitesparse? Vcpkg does though. And I know how to make Vcpkg work the Right Way, I think.

dkogan commented 9 months ago

An update you may/may not care about. I just released numpysane 0.40. This removes nested functions from the code it generates, which means that all of mrcal (including the Python layer) builds with clang.

mcm001 commented 9 months ago

Oh awesome! That's huge. That should mean we should be able to build/ship all of mrcal for windows now. I'm focusing on other stuff at the present, but would love to revisit this to try building Windows wheels at some poitn

mcm001 commented 9 months ago

FYA, mrcal has made it into Photon! We also have some docs on our website, still working on building those out. Thanks again for all your help! Happy to close or leave open for ongoing windows python stuff

dkogan commented 9 months ago

Excellent! Thanks for doing all that!

If you're still working on the windows stuff, you can keep that open, and ask me stuff as you encounter problems. I believe the current state is that you think full support for clang is sufficient to get this working in Windows, but you need to fully test that still; yes?

I'm hoping to get mcal 2.4 out in a month or so. This will have the various fixes we've been talking about, such as the clang support. If you'd like anything else to make it into that release, tell me.

mcm001 commented 9 months ago

Yes, I think clang is sufficient for building everything + python wheels on Windows, but I haven't yet had time to dig into it (and probably won't for a while).

Re: 2.4: how much mrcal-side work is it to expose what I'd need to make projection uncertainty graphs in the C API? It looks like there's a bunch of math that lives in Python right now.

mcm001 commented 9 months ago

Oh and while I have you, curious to hear any feedback you have on our docs. They're intentionally pretty sparse (since your walk-though is a much better resource! And once people have a cameramodel, they can go investigate using the usual mrcal tooling), but I want people to have a good idea of what they should be taking away from looking at graphs & how they can actually improve their calibration images.

dkogan commented 9 months ago

Re: 2.4: how much mrcal-side work is it to expose what I'd need to make projection uncertainty graphs in the C API? It looks like there's a bunch of math that lives in Python right now

That's a lot of work, and I'd need to be convinced that it's worth doing at all. DEFINITELY will not happen for 2.4

dkogan commented 9 months ago

Re: docs. I will definitely look at them, but not just at this moment. Thanks again for working on this.

mcm001 commented 9 months ago

My main motivation for projection uncertainty graphs was that I wanted a way to display them in Photon, which would mean they'd need to be able to be generated in some mix of our C->Java->Javascript stack. But to take a step back, all I really want is to help visualize results to help users tell if their calibration actually makes sense; if projection uncertainty isn't something that you think is helpful to show, or if only supporting visualization through mrcal's python tools are reasonable asks of users then it's fine to drop.

Otherwise, no asks from me on features for 2.4! We're working next on implementing splined stereographic models, but everything I need to make that happen is already exposed by mrcal's C API.

dkogan commented 8 months ago

I'm not against supporting everything in C, in principle. It's just a lot of work to do in the first place, and makes maintaining and extending the code in the future much harder. Bit by bit I AM migrating the internals to C (dense stereo processing is now possible in C only, for instance), but only for stuff that is useful to have in C and that isn't likely to change in the future. mrcal 3.0 will have a very different uncertainty quantification algorithm, so that isn't fully stable yet, and I don't want to move it to C just yet.

I think you should call the Python stuff from C. You can either talk to a child Python process, or use the Python C API (which is quite good). This sounds weird, but you won't do this very much, so the performance cost should be low, and it should be relatively straightforward to do.

dkogan commented 8 months ago

I just looked at the calibration docs. They're good. The biggest complaint I have is that it's a LOT of information, with only some of it being applicable, and it's the reader's job to figure out what is important. It would look intimidating to a student, I suspect.

Since there's already a LOT of material out there describing the calibration process (in the mrcal docs and elsewhere), I would simplify the photonvision docs as much as possible, with links to outside docs to provide detail.

So, for instance, when you talk about interpreting the mrcal results, I would say ONLY that it can provide detailed feedback, and I would show the uncertainty plot, and I'd say "for more info, click here".

I think you're talking about 3 different methods of computing a calibration: calibdb.net, photonvision, mrcal. I think this document would be more useful, if there was only ONE method described; in this case the photonvision implementation is probably the one to focus on.

And so on. Furthermore, the mrcal documentation isn't set in stone. If you think some of it could be clearer, I'm happy to change it.

dkogan commented 8 months ago

Hello! I just released 2.4. FYI.

dkogan commented 3 months ago

I think this is all done. I'm going to close this. Open a new report if more issues arise. And thank you for pushing all this through!

mcm001 commented 3 months ago

And thank you for helping me push this through as well! We unfortunately don't have usage reporting statistics from FIRST themselves this season, but our GitHub download stats on our last in-season release are sitting at north of 2,000 not necessarily unique downloads, and mrcal was enabled as the default calibration backend all season. Super fantastic to be able to bring mrcal to so many teams.

We just additionally made some small tweaks to unlock charuco support through our JNI as well :)

dkogan commented 3 months ago

Good to hear! I have server stats too. Definitely saw a big bump when the new photonvision was released.