IBMDecisionOptimization / docplex-examples

These samples demonstrate how to use the DOcplex library to model and solve optimization problems.
https://ibmdecisionoptimization.github.io/
Apache License 2.0

Float precision issue #46

Closed AndreMaz closed 3 years ago

AndreMaz commented 3 years ago

Hi

I'm facing Python-related floating-point issues. I'll try to explain what I mean.

I have the following constraint

c3: 0.2 B_0_0 + 0.3 B_1_0 <= 0.5

which would be satisfied if both B_0_0 and B_1_0 were equal to 1, because 0.2 + 0.3 <= 0.5 holds.

However, due to Python's floating-point arithmetic, docplex generates the following constraint

# generated with model.export_as_lp()

c3: 0.200000002980 B_0_0 + 0.300000011921 B_1_0 <= 0.500000000000

which makes it impossible for both B_0_0 and B_1_0 to be 1, because 0.200000002980 + 0.300000011921 <= 0.500000000000 does not hold.

Is there a way to control float precision during constraint declaration, or should I just re-scale the constants?

PhilippeCouronne commented 3 years ago

Hello,

This is not related to Python:

This simple program:

from docplex.mp.model import Model
mm = Model()
b1, b2 = mm.binary_var_list(keys=['b1', 'b2'], name=str)
# with 64-bit floats, 0.2 + 0.3 evaluates to exactly 0.5, so b1 = b2 = 1 is feasible
mm.add(0.2 * b1 + 0.3 * b2 <= 0.5)
mm.maximize(b1 + b2)
mm.solve(log_output=True)
mm.print_solution()

solves with b1 = b2 = 1 with no issue. Your problem is a serialization problem: it is well known that LP serialization may lose precision. To serialize your model with no loss of precision, use the SAV format (binary); see Model.export_as_sav, or Model.export_as_savgz for larger models (same format, compressed).
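For instance, a minimal sketch with the toy model mm from the snippet above (file names are just placeholders):

# LP is a text format, so coefficients are rendered with limited precision;
# SAV is binary and keeps the exact double values that were entered.
mm.export_as_lp("precision_demo.lp")    # human-readable, may round coefficients
mm.export_as_sav("precision_demo.sav")  # exact, not human-readable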

That said, on my Windows machine, printing the LP string yields the correct text (these precision issues are platform-dependent).

AndreMaz commented 3 years ago

Thank you for the quick response and for the explanation.

Your problem is a serialization problem: it is well-known that LP serialization may lose precision.

I'm only using model.export_as_lp() for debugging purposes, i.e., to check that all the constraints are OK. For solving I'm using model.solve(). I don't know what format docplex uses to send the problem to the CPLEX solver, but the precision issue is definitely there.

This said, on my Windows machine, printing the LP string yields a correct text (those precision issues are platform dependent)

I'm using a Linux machine, so that might be the cause.

Anyway, I'm going to try re-scaling the constants to avoid this kind of issue.

PhilippeCouronne commented 3 years ago

What you see from Model.export_as_lp() has been processed by Python print instructions; it may not be an exact copy of what was entered and sent to the CPLEX engine. Nevertheless, the DOcplex API guarantees that what is sent to CPLEX is exactly the numbers you entered, even though the LP serialization might differ. In other words, serializing a model and then reading it back (with docplex.mp.model_reader.read_model) might not give the same model you started with; only the SAV format guarantees that.

Note that you can print the string representation of any constraint; call the str function on the constraint to get it.
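For instance, with the toy model above (the constraint name "c3" here is purely illustrative):

ct = mm.add_constraint(0.2 * b1 + 0.3 * b2 <= 0.5, ctname="c3")
print(str(ct))  # prints the constraint with the coefficients DOcplex actually holds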

My question is: does the Python model (0.2 x + 0.3 y <= 0.5) solve OK? If not, this would be strange and worth investigating.

AndreMaz commented 3 years ago

My question is: does the Python model (0.2 x + 0.3 y <= 0.5) solve OK? If not, this would be strange and worth investigating.

The example that you've provided works as expected.

I'm using tensorflow to generate the constants, which in my problem lie in [0, 1) with 2-digit precision. Here's an example:

import tensorflow as tf

data = tf.cast(
    tf.random.uniform((2, 5, 3), minval=0, maxval=100, dtype="int32") / 100,
    dtype="float32"
)

This gives me the following output:

<tf.Tensor: shape=(2, 5, 3), dtype=float32, numpy=
array([[[0.78, 0.99, 0.07],
        [0.61, 0.43, 0.48],
        [0.09, 0.25, 0.04],
        [0.03, 0.39, 0.64],
        [0.64, 0.08, 0.66]],

       [[0.92, 0.69, 0.86],
        [0.42, 0.71, 0.59],
        [0.08, 0.04, 0.58],
        [0.57, 0.28, 0.53],
        [0.5 , 0.92, 0.36]]], dtype=float32)>

However, when I use these values in docplex I get the precision issue that I've mentioned.

Anyway, I've managed to solve the issue by re-scaling the constants to [0, 100), and now all of my unit tests are green.
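A minimal sketch of that workaround (stand-in values; the real constants come from the tensor above):

from docplex.mp.model import Model

# With 2-digit constants, multiplying by 100 gives exact integer coefficients,
# so no rounding error can creep into the constraint.
m = Model(name="rescaled")
b = m.binary_var_list(2, name="B")
coeffs = [0.2, 0.3]                        # stand-ins for tensor entries
scaled = [round(c * 100) for c in coeffs]  # [20, 30]
m.add_constraint(scaled[0] * b[0] + scaled[1] * b[1] <= 50)
m.maximize(m.sum(b))
m.solve()
m.print_solution()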

PhilippeCouronne commented 3 years ago

From the code above, I guess your random numbers are actually generated via numpy. I also see you are specifying dtype=float32. CPLEX expects 64-bit Python floats, so DOcplex converts all its numeric arguments to Python float (otherwise CPLEX would crash badly). I suspect your issues come from this conversion and would disappear if you specified dtype=float64.
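A quick way to see this widening effect, outside docplex (plain numpy illustration):

import numpy as np

# Widening the float32-rounded value back to a Python float exposes the error
# seen in the LP output above.
a = float(np.float32(0.2))   # 0.20000000298023224
b = float(np.float32(0.3))   # 0.30000001192092896
print(a + b <= 0.5)          # False: the rounded values sum to just over 0.5
print(0.2 + 0.3 <= 0.5)      # True with 64-bit literals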

AndreMaz commented 3 years ago

Yeah, under the hood tensorflow uses numpy for this kind of op. I do tf.cast() to dtype=float32 because my neural nets work with float32.

I suspect your issues come from this conversion, and would disappear if you specified dtype=float64.

You are right. Tested with float64 and got the following constraint:

c3: 0.200000000000 B_0_0 + 0.300000000000 B_1_0 <= 0.500000000000
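For reference, the only change was the dtype in the cast (same generator as before):

import tensorflow as tf

# Cast the 2-digit values to float64 before handing them to docplex;
# the neural nets can keep working on a separate float32 copy.
data = tf.cast(
    tf.random.uniform((2, 5, 3), minval=0, maxval=100, dtype="int32") / 100,
    dtype="float64"
)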

The results are all good. Thanks for your help :+1:

BTW great job on the lib. It's really awesome.

You can close the issue.

PhilippeCouronne commented 3 years ago

You're welcome!