Float ALL the doubles - Githubissues

ayberkozgur commented 9 years ago

Solves issue #57.

There are unsolvable cases though, e.g cv::arcLength() returning always double.

One serious issue that I wasn't able to solve is that cv::solvePnP() returns double (i.e CV_64F typed cv::Mats) in output arguments rotation and translation. When forced a cv::Mat of floats, it throws:

OpenCV Error: Assertion failed (mtype == type0 || (CV_MAT_CN(mtype) == 1 && ((1 << type0) & fixedDepthMask) != 0)) in create, file /home/equilibrium/src/opencv/modules/core/src/matrix.cpp, line 2243

It seems that they really didn't think some parts through. So for now, I'm just converting to float during transform matrix filling.

severin-lemaignan commented 9 years ago

Were you able to spot any performance gain on Android?

ayberkozgur commented 9 years ago

I wouldn't expect any from such few changes.

severin-lemaignan commented 9 years ago

It is then worth the change?

ayberkozgur commented 9 years ago

Yes, we don't gain anything from using doubles (AFAIK) and if in the future e.g cv::solvePnP() is separated into cv::solvePnPd() and cv::solvePnPf() we would be left behind.

ayberkozgur commented 9 years ago

This PR at least makes everything float instead of using double in some places and float in others. In the future we could go for #ifdef ANDROID typedef creal float #else typedef creal double #endif.

ayberkozgur commented 9 years ago

I've confirmed Qimchi demos and NCCR code to work just as before on desktop and Android.

qbonnard commented 9 years ago

Of course it works on the demos, if it passes your beloved tests ;)

A possible regression concerns the precision performance. That's reason number 39 to do the reprojection-based precision tests... It shouldn't take so long, I hope I'll be able to finally do that this week end. I suppose your PR can wait a bit ?

Regarding the speed performance, what about the NEON liking floats so much better, as promised by @severin-lemaignan in #57 ?

I'm not so convinced by hypothetically future-proofing cv::solvePnP, but the lack of consistence between usage of floats and double does bother me. However, I'm under the impression that OpenCV solves this with doubles... doubles everywhere. No ? My feeling from that was that floats are more likely to impact precision than speed...

severin-lemaignan commented 9 years ago

@qbonnard I've updated issue #57 to actually list the relevant flags. Neon vectorization only takes place when using floats. But indeed, this matters mostly for algorithms like canny (ie, within OpenCV). I need to check again if and how OpenCV performs Neon-like optimization.

Another simple yet possibly important optimization: we want to ensure that the image width and height are always multiple of 4. Otherwise, a whole class of optimizations are not possible. I'm going to run a quick check in chilitags to see if enforce multiple at 4 at our level allows for nice optimization at inside OpenCV functions.

ayberkozgur commented 9 years ago

OK here are some benchmark results. This is the code:

    float varf = 1237998.334f;
    double vard = 1237998.334;

    EZP_ENABLE;

    for(int k=0;k<100;k++){
        EZP_START_OFFLINE("FLT");
        for(int i=0;i<1000000;i++){
            varf += 24134.38567f;
            varf /= 432.534f;
            varf = sinf(varf);
        }
        EZP_END_OFFLINE("FLT");
        printf("%f\n",varf);

        EZP_START_OFFLINE("DBL");
        for(int i=0;i<1000000;i++){
            vard += 24134.38567;
            vard /= 432.534;
            vard = sin(vard);
        }
        EZP_END_OFFLINE("DBL");
        printf("%f\n",vard);
    }

    EZP_PRINT_OFFLINE;

On my x86_64 PC, compiled with g++ v4.8.2 with -O3, gives the following results:

EZP: -------------------------------------------------------------------------------
EZP: Name    Average(ms)         Total(ms)           Calls
EZP: -------------------------------------------------------------------------------
EZP:  DBL    38.30               3830.29             100       
EZP:  FLT    32.95               3294.62             100

On the tablets with Exynos 5420 quad-core 32-bit ARM processor, compiled with g++ v4.8 with -O3, gives the following results:

11-14 10:39:27.179: I/EZP(4125): EZP: -------------------------------------------------------------------------------
11-14 10:39:27.179: I/EZP(4125): EZP: Name    Average(ms)         Total(ms)           Calls
11-14 10:39:27.179: I/EZP(4125): EZP: -------------------------------------------------------------------------------
11-14 10:39:27.179: I/EZP(4125): EZP:  DBL    320.35              32034.73            100       
11-14 10:39:27.179: I/EZP(4125): EZP:  FLT    137.05              13704.60            100

On the tablets again, compiled with -ffast-math -mcpu=cortex-a15 -mfpu=neon-vfpv4 -mfloat-abi=softfp -O3, gives the following results:

11-14 10:42:02.139: I/EZP(4157): EZP: -------------------------------------------------------------------------------
11-14 10:42:02.139: I/EZP(4157): EZP: Name    Average(ms)         Total(ms)           Calls
11-14 10:42:02.139: I/EZP(4157): EZP: -------------------------------------------------------------------------------
11-14 10:42:02.139: I/EZP(4157): EZP:  DBL    92.49               9248.80             100       
11-14 10:42:02.139: I/EZP(4157): EZP:  FLT    60.01               6001.03             100

Important thing to note: On my PC, there is actually a difference in the results (-0.683532 vs. -0.683529) when using double and float but on the tablet, the result is always the same (-0.683529). My guess here is that on the tablet, somewhere along the line, double is cast into float and we're just losing time with memory copies and casts while there is no actual benefit in terms of precision.

ayberkozgur commented 9 years ago

By the way, OpenCV seems to use templates in most places but "double everywhere" in others. These others tend to be critical algorithms like solvePnP. This definitely isn't a good idea, for as you see above, it only causes double to float cast/memory copy overhead and does not bring more precision on 32-bit platforms. They should have implemented solvePnP as a template or as a solvePnP, solvePnPf pair.

ayberkozgur commented 9 years ago

Apparently not all CV API is "double everywhere", check this out:

void KalmanFilter::init(int dynamParams, int measureParams, int controlParams=0, int type=CV_32F)

That's the way solvePnP should have been.

ayberkozgur commented 9 years ago

I made Chilitags3D a template class which takes the float precision as parameter. Why didn't I do the same with Chilitags? Two related reasons:

cv methods (like cv::boundingRect) enforces float.
Even if they weren't enforced in such places, there is still no place to use double precision with any benefit: There are no exponentiation, no division by small values, no multiplication and addition with unusually large values etc. The only exception to this might be Filter's weighted addition but we can worry about that later. Note: We should also be careful about the perspective transforms in ReadBits.

A sample will follow soon.

ayberkozgur commented 9 years ago

Added a float vs. double test. Runs for each test image and fails if transform matrices of float/double have different elements (up to 10^-5%). Currently, it passes without problems. If you want to make it fail, change the threshold to something impossible for float like 10^-8.

qbonnard commented 9 years ago

Added a [...] test. :heart: :heart: :heart:

So I finally took the time to review this PR and make the reprojection based test on Chilitags::find. The latter shows no difference before and after this PR, so I guess it's all fine. Do you want me to push the reprojection thingie?

I just have two more questions:

Is there a reason for not setting float as default value of the template, so we don't break existing code ? And actually, is there a reason to keep the possibility of having a double ?

-Is your keyboard lacking vowels, or did you mistakenly type your password in the typedefs ? I'm so funny, ain't I. I'm not big fan of TagID2QuadMap, TfMat, Str2TfMap... How about :

TagID2QuadMap -> TagMap ? CornerMap ? ChilitagMap ?
Str2TfMap -> TagMap3D ? TransformMap ? Chilitag3DMap ?
TfMat -> TransformMatrix ? Transform ?
RealT -> RealType ?

ayberkozgur commented 9 years ago

I don't really know what's your reprojection test; can you PR my branch so I can see?

I agree that's a good idea to set float as the default RealT but even then, the existing code will break since you will have to reference the class as Chilitags<> and Chilitags won't work anymore. See http://stackoverflow.com/questions/16014736/avoid-angle-brackets-in-default-template. I don't know why this is so and it sucks.

Ok I can see how you dislike _2_ kind of naming but what's wrong with TfMat and RealT :) ? What TfMat is really clear and easy to read and won't be confused with anything else while T in RealT is a very common notation and will be understood easily.

qbonnard commented 9 years ago

Here is a branch where the reprojection test is added before the float/double thingie https://github.com/qbonnard/chilitags/tree/precisionbefore

And here is one with the commit happening after your changes: https://github.com/qbonnard/chilitags/tree/precisionafter

I didn't find the button to PR your PR with them, I hope that's convenient enough.

For the Chiltags3D<>... arg. Well then, we could either have a Chilitags3D_<>, and a typedef Chilitags3D, or we could just forget about doubles for Chilitags3D. I there a reason not to (beside the time you spent making a template) ?

I just realise the backward compatility argument is quite moot since the return type changes too in Mat44d... hum.

what's wrong with TfMat and RealT :) O boy, there goes my Monday morning again ;)

Let's start with TfMat. Let's take the very likely case that's I'm working on a chilitags-powered Team Fortress mod to add Augmented Reality mattresses. TfMat is exactly the name of my main class !

Seriously though, I'm not sure TfMat is so clear. Ask around you at the lab, maybe I'm wrong. Likert scale and anova or it didn't happen ;)

Also, the Mat part looks like you're trying to put the type of the variable in it's nam, which is evil (I don't remember the name of this practice, which makes namePtr, iWidth, fthreshold). The next step is to have Transform4x4Matrix, or Transform4x4FloatRowFirstMatrix, i.e. Tf44frfMat which happens to be my facebook password.

My turn: what's wrong with fullnames instead of abbreviations ? ;)

ayberkozgur commented 9 years ago

Well I'm probably giving off a wrong impression of myself here: I'm actually one that supports full naming and avoiding unnecessary abbreviations. @severin-lemaignan gave this example as a not unnecessary abbreviation: numErrors. But I guess transform matrices are not as common with everyone as numbers :)

I'm still against changing RealT though, because it would be awkward to have it as RealType or RealTemplate or RealTypename, all of which are very long and painful to type and would clutter the code for increasing readability just a little.

Also, @severin-lemaignan had the valid argument that people are not going to (want to) use the template and we should provide typedefs. For this, I'm making Chilitags3D template Chilitags3D_ so that we can have typedefs without underscores.

About keeping the double precision issue: I really don't want to enforce float in 3D stuff because there's stuff like Kalman filtering coming where there are things like matrix inversions where precision is dangerous.

qbonnard commented 9 years ago

Of course I would usenames like id, HttpRequest or bit instead of identifier, HyperTextTransferProtocol or binary digital number ;) Google's C++ guideline explain it well : https://google-styleguide.googlecode.com/svn/trunk/cppguide.html#General_Naming_Rules

OK, it seems all good, merging in progress ! Thanks again for your patience !

qbonnard commented 9 years ago

Actually, one more thing ;)

Any reason for TransformMatrix and TagPoseMap to be in the chilitags::Chilitags3D_ namespace, while Quad and TagCornerMap are directly in the chilitags namespace ?

I can move all of them in the chilitags::namespace, if it's OK with you.

qbonnard commented 9 years ago

Ah, just got my answer: it needs the RealT parameter...

qbonnard commented 9 years ago

Then I'd go for something like

template <typename RealT = float>
typedef cv::Matx<RealT, 4, 4> TransformMatrix_;

template <typename RealT = float>
typedef std::map<std::string, TransformMatrix> TagPoseMap_;

typedef TransformMatrix_<float> TransformMatrix;
typedef TransformMatrix_<float> TransformMatrixf;
typedef TransformMatrix_<double> TransformMatrixd;

typedef TagPoseMap_<float> TagPoseMap;
typedef TagPoseMap_<float> TagPoseMapf;
typedef TagPoseMap_<double> TagPoseMapd;

for coherency's sake. Does it make sense ?

qbonnard commented 9 years ago

Or maybe just

typedef Chilitags3D_<>::TransformMatrix TransformMatrix;
typedef Chilitags3D_<>::TagPoseMap TagPoseMap;

?

ayberkozgur commented 9 years ago

Ok, after looking at your branch, I finally understand what you mean by reprojection test. But in that code, Chilitags seems to be doing nothing? Anyway, I think this could be useful somehow (in parallel with your test):

Generate artificial tag image, know exact corner locations
Pass image to Chilitags3D<float>, then reproject back using projection matrix
Compare exact corner locations with backprojected corner locations

Note that above, Chilitags3D<double> doesn't make sense because if float is close enough to actual values, double will certainly be (unless float is masking a crazy one in a billion bug).

qbonnard commented 9 years ago

(finally) merged ! Thanks again and again ! I made a new PR to discuss the reprojection test : https://github.com/chili-epfl/chilitags/pull/77

chili-epfl / chilitags

Float ALL the doubles #67