Crypto-TII / CryptographicEstimators

This project gathers and standardizes command-line scripts to estimate the difficulty of solving hard mathematical problems related to cryptography.
https://estimators.crypto.tii.ae/
GNU General Public License v3.0

Refactor/kat for le estimator #188

Closed Dioprz closed 2 weeks ago

Dioprz commented 1 month ago

Description

Help required

The migration of the LEEstimator has multiple issues for which I need help: @Javierverbel @Memphisd @FloydZ.

I will describe them in the order I suggest you look at them: commit by commit.

Note: Keep in mind that the LEEstimator has two test files to migrate: test_le_beullens.sage and test_le_bbps.sage.

First commit: Migration of test_le_beullens.sage. Here we have two issues:

  1. I had to soften the checks used at https://github.com/Crypto-TII/CryptographicEstimators/blob/2c97a0ed6c926af8e3481872c46ceb034cad87d4/tests/external_estimators/LEEstimator/test_le_beullens.sage#L37-L41 because in the new framework we can't check whether the internal estimator (called t1 there) is inf. For this reason I only check the external estimator, as can be seen here: https://github.com/Crypto-TII/CryptographicEstimators/blob/611a0b9b7fbcedd898e48c4defbee30f659705f8/tests/external_estimators/ext_le.sage#L33-L41 While this doesn't break anything, it is still a change, so let me know if there is a better fix.
  2. Serialized outputs aren't consistent, so in some cases our serialized values don't match the ones calculated by the internal estimator. If you click on the commit, you will see that I sent three .yaml files: good_kat.yaml, failing_kat.yaml and failing_kat_2.yaml. As expected, the first has values matching the ones calculated by the internal estimator, but the second and third don't. (Possibly related: https://github.com/Crypto-TII/CryptographicEstimators/issues/152). If you want to test them, just rename any of them to kat.yaml and run make docker-run followed by pytest tests/test_kat.py.

Second commit: Migration of test_le_bbps.sage. Here we have another two issues:

1.

This means that attack_cost.sage defines some function, used by one of the KAT generators for the bbps functions, that isn't defined in cost.sage and that overwrites the default implementation the bbps functions expect.

Probably the easiest solution is to find that function in attack_cost.sage, rename it (e.g. log -> log_) and propagate the change where appropriate.

2. At https://github.com/Crypto-TII/CryptographicEstimators/blob/2c97a0ed6c926af8e3481872c46ceb034cad87d4/tests/external_estimators/LEEstimator/test_le_bbps.sage#L43-L63 we are correcting the value produced by the external estimator using the internal one. This is not supported at the moment, so we need to solve that too.
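To illustrate the name clash from item 1: the following is a minimal Python reproduction, not the project's code. The function bodies are hypothetical stand-ins (this thread does not show the real definitions in cost.sage or attack_cost.sage). It only shows how a later top-level definition replaces a helper that an earlier function looks up by name.

```python
import math


def log(x):
    # Stand-in for the default helper the first script relies on (assumption).
    return math.log(x)


def log2(x):
    # `log` is resolved in the global namespace at call time,
    # not when log2 is defined.
    return log(x) / log(2)


before = log2(8)  # still uses the original helper


def log(x):
    # Stand-in for a second script redefining the same global name (assumption).
    return 0.0


try:
    after = log2(8)
except ZeroDivisionError:
    # log2 now calls the replacement, so log(2) is 0.0 and the division fails.
    after = "ZeroDivisionError"

print(before, after)
```

Renaming the second definition (for example to log_) would leave log2 bound to the original helper, which is the idea behind the rename proposed above.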

Testing tips

Review process

Not ready yet.

Pre-approval checklist

Memphisd commented 3 weeks ago

For the first issue I responded in #152. Essentially, let's restrict both cases to a single execution of the estimators (removing the min) and increase the tolerance to a higher value (I tested with 1.2, which worked in 100 trials). We should include a comment that the higher tolerance compensates for the variance caused by the probabilistic nature of Beullens' runtime computation.

For the second issue: For 1., maybe @FloydZ can give some feedback on this? I am not sure I understand the issue correctly yet. For 2., I again suggest removing the correction factor and increasing the tolerance instead. I tested with a tolerance of 1 and had no issues.
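As a rough sketch of what a tolerance-based comparison with an explanatory comment could look like: the helper name, the constant name, and the choice of an absolute difference are assumptions for illustration, since the thread does not show how tests/test_kat.py actually compares values.

```python
import math

# Hypothetical constant: the higher tolerance compensates for the variance
# caused by the probabilistic nature of Beullens' runtime computation.
BEULLENS_TOLERANCE = 1.2


def within_tolerance(expected, actual, tolerance):
    """Return True if two complexities agree up to `tolerance`.

    Assumes an absolute difference; infinities only match each other.
    """
    if math.isinf(expected) or math.isinf(actual):
        return expected == actual
    return abs(expected - actual) <= tolerance


# One bbps_1 record reported later in this thread fails at tolerance 0.1.
print(within_tolerance(88.01232020398258, 85.98059676215709, 0.1))
print(within_tolerance(math.inf, 62.9695081119894, BEULLENS_TOLERANCE))
```

Treating inf explicitly keeps a finite tolerance from ever accepting an infinite estimate against a finite one.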

FloydZ commented 2 weeks ago

first commit: essentially what @Memphisd wrote. Increasing the tolerance is fine, as long as we leave a comment explaining the problem.

second commit: I think the problem is the log2 function in cost.sage, which uses log(), which itself gets overwritten by attack_cost.sage. That is where the divide-by-zero exception comes from. So you can add an if/else or catch block in the log2 function to return math.inf in this case.
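A minimal sketch of such a guard, assuming the failure shows up as a ZeroDivisionError inside log2. The function name guarded_log2 and the injectable log parameter are hypothetical and only there to make the example self-contained; the real log2 in cost.sage is not shown in this thread.

```python
import math


def guarded_log2(x, log=math.log):
    """Base-2 logarithm that returns math.inf instead of raising.

    `log` stands for whatever function the name `log` currently resolves to;
    it is a parameter here only so the overwritten case can be simulated.
    """
    try:
        return log(x) / log(2)
    except ZeroDivisionError:
        # An overwritten log() that returns 0 for log(2) ends up here.
        return math.inf


print(guarded_log2(8))                        # normal case
print(guarded_log2(8, log=lambda v: 0.0))     # simulated overwritten log()
```

Returning math.inf keeps the KAT generation running, at the cost of hiding the clash, so a comment next to the guard is worth having.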

Dioprz commented 2 weeks ago

Thanks for your comments @Memphisd and @FloydZ .

Now new problems have arisen:

Once I remove the correction factor from the test and apply the second change mentioned above, I'm able to run make docker-generate-kat successfully. However, I'm getting the following results when running make docker-pytest-kat:

Output with beullens things in ext_le commented

```
FAILED TEST!!! Input: (200, 100, 17), estimator: bbps_1 Expected: 88.01232020398258, actual: 85.98059676215709, tolerance: 0.1
FAILED TEST!!! Input: (200, 100, 31), estimator: bbps_1 Expected: 97.6016190581889, actual: 95.29228537749462, tolerance: 0.1
FAILED TEST!!! Input: (100, 50, 17), estimator: bbps_range Expected: 50.5492234157521, actual: 64.15127942203596, tolerance: 1
FAILED TEST!!! Input: (100, 50, 31), estimator: bbps_range Expected: 54.9297768841915, actual: 62.916248185548454, tolerance: 1
FAILED TEST!!! Input: (100, 52, 11), estimator: bbps_range Expected: 47.7273094143296, actual: 61.20412310801501, tolerance: 1
FAILED TEST!!! Input: (100, 52, 17), estimator: bbps_range Expected: 50.1438546973854, actual: 64.26858480987197, tolerance: 1
FAILED TEST!!! Input: (100, 52, 31), estimator: bbps_range Expected: 54.8735900962522, actual: 62.993721687450474, tolerance: 1
FAILED TEST!!! Input: (100, 54, 11), estimator: bbps_range Expected: 48.4492902156340, actual: 61.26452020280075, tolerance: 1
FAILED TEST!!! Input: (100, 54, 17), estimator: bbps_range Expected: 49.7482057110599, actual: 64.29502987338061, tolerance: 1
FAILED TEST!!! Input: (100, 54, 31), estimator: bbps_range Expected: 54.7275915865044, actual: 62.9695081119894, tolerance: 1
FAILED TEST!!! Input: (100, 56, 11), estimator: bbps_range Expected: 47.3204667024712, actual: 61.2410428294115, tolerance: 1
FAILED TEST!!! Input: (100, 56, 17), estimator: bbps_range Expected: 49.4885208964825, actual: 64.23134146330354, tolerance: 1
FAILED TEST!!! Input: (100, 56, 31), estimator: bbps_range Expected: 54.0562693722277, actual: 64.8228575835149, tolerance: 1
FAILED TEST!!! Input: (100, 58, 11), estimator: bbps_range Expected: 46.9195479794463, actual: 61.13486999210458, tolerance: 1
FAILED TEST!!! Input: (100, 58, 17), estimator: bbps_range Expected: 48.7358506307938, actual: 62.33448729434238, tolerance: 1
FAILED TEST!!! Input: (100, 58, 31), estimator: bbps_range Expected: 53.6413586233888, actual: 64.6117650452043, tolerance: 1
FAILED TEST!!! Input: (105, 52, 17), estimator: bbps_range Expected: 52.4185199553221, actual: 66.0139155778872, tolerance: 1
FAILED TEST!!! Input: (105, 52, 31), estimator: bbps_range Expected: 57.0927422516923, actual: 66.95427828058989, tolerance: 1
FAILED TEST!!! Input: (105, 54, 11), estimator: bbps_range Expected: 49.9329126089316, actual: 64.33857762258022, tolerance: 1
FAILED TEST!!! Input: (105, 54, 17), estimator: bbps_range Expected: 52.0794634358903, actual: 66.14477029710844, tolerance: 1
FAILED TEST!!! Input: (105, 54, 31), estimator: bbps_range Expected: 57.0905897171837, actual: 67.05818567358023, tolerance: 1
FAILED TEST!!! Input: (105, 56, 17), estimator: bbps_range Expected: 51.6816549508525, actual: 66.18764062658205, tolerance: 1
FAILED TEST!!! Input: (105, 56, 31), estimator: bbps_range Expected: 56.6797880843981, actual: 67.0668392271396, tolerance: 1
FAILED TEST!!! Input: (105, 58, 11), estimator: bbps_range Expected: 48.7672288271424, actual: 62.98211166955163, tolerance: 1
FAILED TEST!!! Input: (105, 58, 17), estimator: bbps_range Expected: 51.8512710173995, actual: 66.14312709937562, tolerance: 1
FAILED TEST!!! Input: (105, 58, 31), estimator: bbps_range Expected: 56.3579726798541, actual: 66.98049856252631, tolerance: 1
FAILED TEST!!! Input: (105, 60, 11), estimator: bbps_range Expected: 49.9786778381918, actual: 62.89118576385145, tolerance: 1
FAILED TEST!!! Input: (105, 60, 17), estimator: bbps_range Expected: 50.7458063777579, actual: 66.01175504443484, tolerance: 1
FAILED TEST!!! Input: (105, 60, 31), estimator: bbps_range Expected: 55.8270531377049, actual: 66.7992549121764, tolerance: 1
FAILED TEST!!! Input: (110, 55, 11), estimator: bbps_range Expected: 52.4071565410762, actual: 65.93678156440399, tolerance: 1
FAILED TEST!!! Input: (110, 55, 17), estimator: bbps_range Expected: 54.3478771638952, actual: 69.67722965151522, tolerance: 1
FAILED TEST!!! Input: (110, 55, 31), estimator: bbps_range Expected: 59.2978264381220, actual: 69.13296938075463, tolerance: 1
FAILED TEST!!! Input: (110, 57, 11), estimator: bbps_range Expected: 52.0478429743149, actual: 66.0742188168121, tolerance: 1
FAILED TEST!!! Input: (110, 57, 17), estimator: bbps_range Expected: 53.9474763783379, actual: 69.78945516121496, tolerance: 1
FAILED TEST!!! Input: (110, 57, 31), estimator: bbps_range Expected: 59.0606927626171, actual: 69.20824256251929, tolerance: 1
FAILED TEST!!! Input: (110, 59, 11), estimator: bbps_range Expected: 53.0235762309153, actual: 66.13381093916678, tolerance: 1
FAILED TEST!!! Input: (110, 59, 17), estimator: bbps_range Expected: 53.5844874463945, actual: 69.81904846844341, tolerance: 1
FAILED TEST!!! Input: (110, 59, 31), estimator: bbps_range Expected: 58.5793264749438, actual: 69.19196459239335, tolerance: 1
FAILED TEST!!! Input: (110, 61, 11), estimator: bbps_range Expected: 50.2123317078967, actual: 66.11644114121549, tolerance: 1
FAILED TEST!!! Input: (110, 61, 17), estimator: bbps_range Expected: 53.3223294914203, actual: 69.7666007614908, tolerance: 1
FAILED TEST!!! Input: (110, 61, 31), estimator: bbps_range Expected: 58.1859152949306, actual: 69.08430632760614, tolerance: 1
FAILED TEST!!! Input: (110, 63, 11), estimator: bbps_range Expected: 51.0710255128329, actual: 66.02301370167294, tolerance: 1
FAILED TEST!!! Input: (110, 63, 17), estimator: bbps_range Expected: 52.5937475014904, actual: 67.88510294186437, tolerance: 1
FAILED TEST!!! Input: (110, 63, 31), estimator: bbps_range Expected: 57.9262254573543, actual: 68.88528569423761, tolerance: 1
FAILED TEST!!! Input: (115, 57, 11), estimator: bbps_range Expected: 53.2218968707865, actual: 69.03211958762971, tolerance: 1
FAILED TEST!!! Input: (115, 57, 17), estimator: bbps_range Expected: 56.2199058600962, actual: 71.52962523703223, tolerance: 1
FAILED TEST!!! Input: (115, 57, 31), estimator: bbps_range Expected: 61.4625216994242, actual: 71.26148327750333, tolerance: 1
FAILED TEST!!! Input: (115, 59, 11), estimator: bbps_range Expected: 55.5748263680322, actual: 69.19129215839045, tolerance: 1
FAILED TEST!!! Input: (115, 59, 17), estimator: bbps_range Expected: 56.1468379204806, actual: 71.65440086539225, tolerance: 1
FAILED TEST!!! Input: (115, 59, 31), estimator: bbps_range Expected: 61.0453940208672, actual: 73.32248459927294, tolerance: 1
FAILED TEST!!! Input: (115, 61, 11), estimator: bbps_range Expected: 52.3298372396839, actual: 69.27668433458228, tolerance: 1
FAILED TEST!!! Input: (115, 61, 17), estimator: bbps_range Expected: 55.5156294625955, actual: 71.69893589685341, tolerance: 1
FAILED TEST!!! Input: (115, 61, 31), estimator: bbps_range Expected: 60.6848362813686, actual: 73.33574011077893, tolerance: 1
FAILED TEST!!! Input: (115, 63, 11), estimator: bbps_range Expected: 53.2480052964781, actual: 69.28915281499005, tolerance: 1
FAILED TEST!!! Input: (115, 63, 17), estimator: bbps_range Expected: 55.6576324035605, actual: 71.66372843490979, tolerance: 1
FAILED TEST!!! Input: (115, 63, 31), estimator: bbps_range Expected: 60.4173041874188, actual: 73.26294591303567, tolerance: 1
FAILED TEST!!! Input: (115, 65, 11), estimator: bbps_range Expected: 52.0632288281409, actual: 67.78063127741946, tolerance: 1
FAILED TEST!!! Input: (115, 65, 17), estimator: bbps_range Expected: 54.6369316974536, actual: 71.54921777503041, tolerance: 1
FAILED TEST!!! Input: (115, 65, 31), estimator: bbps_range Expected: 60.1941068177734, actual: 73.10421822608339, tolerance: 1
```
Output with beullens things in ext_le UNcommented (first run)

```
FAILED TEST!!! Input: (200, 100, 17), estimator: bbps_1 Expected: 88.01232020398258, actual: 85.98059676215709, tolerance: 0.1
FAILED TEST!!! Input: (200, 100, 31), estimator: bbps_1 Expected: 97.6016190581889, actual: 95.29228537749462, tolerance: 0.1
FAILED TEST!!! Input: (100, 50, 17), estimator: bbps_range Expected: 50.5492234157521, actual: 64.15127942203596, tolerance: 1
FAILED TEST!!! Input: (100, 50, 31), estimator: bbps_range Expected: 54.9297768841915, actual: 62.916248185548454, tolerance: 1
FAILED TEST!!! Input: (100, 52, 11), estimator: bbps_range Expected: 47.7273094143296, actual: 61.20412310801501, tolerance: 1
FAILED TEST!!! Input: (100, 52, 17), estimator: bbps_range Expected: 50.1438546973854, actual: 64.26858480987197, tolerance: 1
FAILED TEST!!! Input: (100, 52, 31), estimator: bbps_range Expected: 54.8735900962522, actual: 62.993721687450474, tolerance: 1
FAILED TEST!!! Input: (100, 54, 11), estimator: bbps_range Expected: 48.4492902156340, actual: 61.26452020280075, tolerance: 1
FAILED TEST!!! Input: (100, 54, 17), estimator: bbps_range Expected: 49.7482057110599, actual: 64.29502987338061, tolerance: 1
FAILED TEST!!! Input: (100, 54, 31), estimator: bbps_range Expected: 54.7275915865044, actual: inf, tolerance: 1
FAILED TEST!!! Input: (100, 56, 11), estimator: bbps_range Expected: 47.3204667024712, actual: 61.2410428294115, tolerance: 1
FAILED TEST!!! Input: (100, 56, 17), estimator: bbps_range Expected: 49.4885208964825, actual: 64.23134146330354, tolerance: 1
FAILED TEST!!! Input: (100, 56, 31), estimator: bbps_range Expected: 54.0562693722277, actual: 64.8228575835149, tolerance: 1
FAILED TEST!!! Input: (100, 58, 11), estimator: bbps_range Expected: 46.9195479794463, actual: 61.13486999210458, tolerance: 1
FAILED TEST!!! Input: (100, 58, 17), estimator: bbps_range Expected: 48.7358506307938, actual: 62.33448729434238, tolerance: 1
FAILED TEST!!! Input: (100, 58, 31), estimator: bbps_range Expected: 53.6413586233888, actual: 64.6117650452043, tolerance: 1
FAILED TEST!!! Input: (105, 52, 17), estimator: bbps_range Expected: 52.4185199553221, actual: 66.0139155778872, tolerance: 1
FAILED TEST!!! Input: (105, 52, 31), estimator: bbps_range Expected: 57.0927422516923, actual: 66.95427828058989, tolerance: 1
FAILED TEST!!! Input: (105, 54, 11), estimator: bbps_range Expected: 49.9329126089316, actual: 64.33857762258022, tolerance: 1
FAILED TEST!!! Input: (105, 54, 17), estimator: bbps_range Expected: 52.0794634358903, actual: 66.14477029710844, tolerance: 1
FAILED TEST!!! Input: (105, 54, 31), estimator: bbps_range Expected: 57.0905897171837, actual: 67.05818567358023, tolerance: 1
FAILED TEST!!! Input: (105, 56, 17), estimator: bbps_range Expected: 51.6816549508525, actual: 66.18764062658205, tolerance: 1
FAILED TEST!!! Input: (105, 56, 31), estimator: bbps_range Expected: 56.6797880843981, actual: 67.0668392271396, tolerance: 1
FAILED TEST!!! Input: (105, 58, 11), estimator: bbps_range Expected: 48.7672288271424, actual: 62.98211166955163, tolerance: 1
FAILED TEST!!! Input: (105, 58, 17), estimator: bbps_range Expected: 51.8512710173995, actual: 66.14312709937562, tolerance: 1
FAILED TEST!!! Input: (105, 58, 31), estimator: bbps_range Expected: 56.3579726798541, actual: 66.98049856252631, tolerance: 1
FAILED TEST!!! Input: (105, 60, 11), estimator: bbps_range Expected: 49.9786778381918, actual: 62.89118576385145, tolerance: 1
FAILED TEST!!! Input: (105, 60, 17), estimator: bbps_range Expected: 50.7458063777579, actual: 66.01175504443484, tolerance: 1
FAILED TEST!!! Input: (105, 60, 31), estimator: bbps_range Expected: 55.8270531377049, actual: 66.7992549121764, tolerance: 1
FAILED TEST!!! Input: (110, 55, 11), estimator: bbps_range Expected: 52.4071565410762, actual: 65.93678156440399, tolerance: 1
FAILED TEST!!! Input: (110, 55, 17), estimator: bbps_range Expected: 54.3478771638952, actual: 69.67722965151522, tolerance: 1
FAILED TEST!!! Input: (110, 55, 31), estimator: bbps_range Expected: 59.2978264381220, actual: 69.13296938075463, tolerance: 1
FAILED TEST!!! Input: (110, 57, 11), estimator: bbps_range Expected: 52.0478429743149, actual: 66.0742188168121, tolerance: 1
FAILED TEST!!! Input: (110, 57, 17), estimator: bbps_range Expected: 53.9474763783379, actual: 69.78945516121496, tolerance: 1
FAILED TEST!!! Input: (110, 57, 31), estimator: bbps_range Expected: 59.0606927626171, actual: 69.20824256251929, tolerance: 1
FAILED TEST!!! Input: (110, 59, 11), estimator: bbps_range Expected: 53.0235762309153, actual: 66.13381093916678, tolerance: 1
FAILED TEST!!! Input: (110, 59, 17), estimator: bbps_range Expected: 53.5844874463945, actual: 69.81904846844341, tolerance: 1
FAILED TEST!!! Input: (110, 59, 31), estimator: bbps_range Expected: 58.5793264749438, actual: 69.19196459239335, tolerance: 1
FAILED TEST!!! Input: (110, 61, 11), estimator: bbps_range Expected: 50.2123317078967, actual: 66.11644114121549, tolerance: 1
FAILED TEST!!! Input: (110, 61, 17), estimator: bbps_range Expected: 53.3223294914203, actual: 69.7666007614908, tolerance: 1
FAILED TEST!!! Input: (110, 61, 31), estimator: bbps_range Expected: 58.1859152949306, actual: 69.08430632760614, tolerance: 1
FAILED TEST!!! Input: (110, 63, 11), estimator: bbps_range Expected: 51.0710255128329, actual: 66.02301370167294, tolerance: 1
FAILED TEST!!! Input: (110, 63, 17), estimator: bbps_range Expected: 52.5937475014904, actual: 67.88510294186437, tolerance: 1
FAILED TEST!!! Input: (110, 63, 31), estimator: bbps_range Expected: 57.9262254573543, actual: 68.88528569423761, tolerance: 1
FAILED TEST!!! Input: (115, 57, 11), estimator: bbps_range Expected: 53.2218968707865, actual: 69.03211958762971, tolerance: 1
FAILED TEST!!! Input: (115, 57, 17), estimator: bbps_range Expected: 56.2199058600962, actual: 71.52962523703223, tolerance: 1
FAILED TEST!!! Input: (115, 57, 31), estimator: bbps_range Expected: 61.4625216994242, actual: 71.26148327750333, tolerance: 1
FAILED TEST!!! Input: (115, 59, 11), estimator: bbps_range Expected: 55.5748263680322, actual: 69.19129215839045, tolerance: 1
FAILED TEST!!! Input: (115, 59, 17), estimator: bbps_range Expected: 56.1468379204806, actual: 71.65440086539225, tolerance: 1
FAILED TEST!!! Input: (115, 59, 31), estimator: bbps_range Expected: 61.0453940208672, actual: 73.32248459927294, tolerance: 1
FAILED TEST!!! Input: (115, 61, 11), estimator: bbps_range Expected: 52.3298372396839, actual: 69.27668433458228, tolerance: 1
FAILED TEST!!! Input: (115, 61, 17), estimator: bbps_range Expected: 55.5156294625955, actual: 71.69893589685341, tolerance: 1
FAILED TEST!!! Input: (115, 61, 31), estimator: bbps_range Expected: 60.6848362813686, actual: 73.33574011077893, tolerance: 1
FAILED TEST!!! Input: (115, 63, 11), estimator: bbps_range Expected: 53.2480052964781, actual: 69.28915281499005, tolerance: 1
FAILED TEST!!! Input: (115, 63, 17), estimator: bbps_range Expected: 55.6576324035605, actual: 71.66372843490979, tolerance: 1
FAILED TEST!!! Input: (115, 63, 31), estimator: bbps_range Expected: 60.4173041874188, actual: 73.26294591303567, tolerance: 1
FAILED TEST!!! Input: (115, 65, 11), estimator: bbps_range Expected: 52.0632288281409, actual: 69.22957539147615, tolerance: 1
FAILED TEST!!! Input: (115, 65, 17), estimator: bbps_range Expected: 54.6369316974536, actual: 71.54921777503041, tolerance: 1
FAILED TEST!!! Input: (115, 65, 31), estimator: bbps_range Expected: 60.1941068177734, actual: 73.10421822608339, tolerance: 1
```
Output with beullens things in ext_le UNcommented (second run)

```
FAILED TEST!!! Input: (200, 100, 17), estimator: bbps_1 Expected: 88.01232020398258, actual: 85.98059676215709, tolerance: 0.1
FAILED TEST!!! Input: (200, 100, 31), estimator: bbps_1 Expected: 97.6016190581889, actual: 95.29228537749462, tolerance: 0.1
FAILED TEST!!! Input: (100, 50, 17), estimator: bbps_range Expected: 50.5492234157521, actual: 64.15127942203596, tolerance: 1
FAILED TEST!!! Input: (100, 50, 31), estimator: bbps_range Expected: 54.9297768841915, actual: 62.916248185548454, tolerance: 1
FAILED TEST!!! Input: (100, 52, 11), estimator: bbps_range Expected: 47.7273094143296, actual: 61.20412310801501, tolerance: 1
FAILED TEST!!! Input: (100, 52, 17), estimator: bbps_range Expected: 50.1438546973854, actual: 64.26858480987197, tolerance: 1
FAILED TEST!!! Input: (100, 52, 31), estimator: bbps_range Expected: 54.8735900962522, actual: 62.993721687450474, tolerance: 1
FAILED TEST!!! Input: (100, 54, 11), estimator: bbps_range Expected: 48.4492902156340, actual: 61.26452020280075, tolerance: 1
FAILED TEST!!! Input: (100, 54, 17), estimator: bbps_range Expected: 49.7482057110599, actual: 64.29502987338061, tolerance: 1
FAILED TEST!!! Input: (100, 54, 31), estimator: bbps_range Expected: 54.7275915865044, actual: 62.9695081119894, tolerance: 1
FAILED TEST!!! Input: (100, 56, 11), estimator: bbps_range Expected: 47.3204667024712, actual: 61.2410428294115, tolerance: 1
FAILED TEST!!! Input: (100, 56, 17), estimator: bbps_range Expected: 49.4885208964825, actual: 64.23134146330354, tolerance: 1
FAILED TEST!!! Input: (100, 56, 31), estimator: bbps_range Expected: 54.0562693722277, actual: 64.8228575835149, tolerance: 1
FAILED TEST!!! Input: (100, 58, 11), estimator: bbps_range Expected: 46.9195479794463, actual: 61.13486999210458, tolerance: 1
FAILED TEST!!! Input: (100, 58, 17), estimator: bbps_range Expected: 48.7358506307938, actual: 64.07819088999162, tolerance: 1
FAILED TEST!!! Input: (100, 58, 31), estimator: bbps_range Expected: 53.6413586233888, actual: 64.6117650452043, tolerance: 1
FAILED TEST!!! Input: (105, 52, 17), estimator: bbps_range Expected: 52.4185199553221, actual: 66.0139155778872, tolerance: 1
FAILED TEST!!! Input: (105, 52, 31), estimator: bbps_range Expected: 57.0927422516923, actual: 66.95427828058989, tolerance: 1
FAILED TEST!!! Input: (105, 54, 11), estimator: bbps_range Expected: 49.9329126089316, actual: 64.33857762258022, tolerance: 1
FAILED TEST!!! Input: (105, 54, 17), estimator: bbps_range Expected: 52.0794634358903, actual: 66.14477029710844, tolerance: 1
FAILED TEST!!! Input: (105, 54, 31), estimator: bbps_range Expected: 57.0905897171837, actual: 67.05818567358023, tolerance: 1
FAILED TEST!!! Input: (105, 56, 17), estimator: bbps_range Expected: 51.6816549508525, actual: 66.18764062658205, tolerance: 1
FAILED TEST!!! Input: (105, 56, 31), estimator: bbps_range Expected: 56.6797880843981, actual: 67.0668392271396, tolerance: 1
FAILED TEST!!! Input: (105, 58, 11), estimator: bbps_range Expected: 48.7672288271424, actual: 62.98211166955163, tolerance: 1
FAILED TEST!!! Input: (105, 58, 17), estimator: bbps_range Expected: 51.8512710173995, actual: 66.14312709937562, tolerance: 1
FAILED TEST!!! Input: (105, 58, 31), estimator: bbps_range Expected: 56.3579726798541, actual: 66.98049856252631, tolerance: 1
FAILED TEST!!! Input: (105, 60, 11), estimator: bbps_range Expected: 49.9786778381918, actual: 62.89118576385145, tolerance: 1
FAILED TEST!!! Input: (105, 60, 17), estimator: bbps_range Expected: 50.7458063777579, actual: 66.01175504443484, tolerance: 1
FAILED TEST!!! Input: (105, 60, 31), estimator: bbps_range Expected: 55.8270531377049, actual: 66.7992549121764, tolerance: 1
FAILED TEST!!! Input: (110, 55, 11), estimator: bbps_range Expected: 52.4071565410762, actual: 65.93678156440399, tolerance: 1
FAILED TEST!!! Input: (110, 55, 17), estimator: bbps_range Expected: 54.3478771638952, actual: 69.67722965151522, tolerance: 1
FAILED TEST!!! Input: (110, 55, 31), estimator: bbps_range Expected: 59.2978264381220, actual: 69.13296938075463, tolerance: 1
FAILED TEST!!! Input: (110, 57, 11), estimator: bbps_range Expected: 52.0478429743149, actual: 66.0742188168121, tolerance: 1
FAILED TEST!!! Input: (110, 57, 17), estimator: bbps_range Expected: 53.9474763783379, actual: 69.78945516121496, tolerance: 1
FAILED TEST!!! Input: (110, 57, 31), estimator: bbps_range Expected: 59.0606927626171, actual: 69.20824256251929, tolerance: 1
FAILED TEST!!! Input: (110, 59, 11), estimator: bbps_range Expected: 53.0235762309153, actual: 66.13381093916678, tolerance: 1
FAILED TEST!!! Input: (110, 59, 17), estimator: bbps_range Expected: 53.5844874463945, actual: 69.81904846844341, tolerance: 1
FAILED TEST!!! Input: (110, 59, 31), estimator: bbps_range Expected: 58.5793264749438, actual: 69.19196459239335, tolerance: 1
FAILED TEST!!! Input: (110, 61, 11), estimator: bbps_range Expected: 50.2123317078967, actual: 66.11644114121549, tolerance: 1
FAILED TEST!!! Input: (110, 61, 17), estimator: bbps_range Expected: 53.3223294914203, actual: 68.03724143724017, tolerance: 1
FAILED TEST!!! Input: (110, 61, 31), estimator: bbps_range Expected: 58.1859152949306, actual: 69.08430632760614, tolerance: 1
FAILED TEST!!! Input: (110, 63, 11), estimator: bbps_range Expected: 51.0710255128329, actual: 66.02301370167294, tolerance: 1
FAILED TEST!!! Input: (110, 63, 17), estimator: bbps_range Expected: 52.5937475014904, actual: 67.88510294186437, tolerance: 1
FAILED TEST!!! Input: (110, 63, 31), estimator: bbps_range Expected: 57.9262254573543, actual: 68.88528569423761, tolerance: 1
FAILED TEST!!! Input: (115, 57, 11), estimator: bbps_range Expected: 53.2218968707865, actual: 69.03211958762971, tolerance: 1
FAILED TEST!!! Input: (115, 57, 17), estimator: bbps_range Expected: 56.2199058600962, actual: 71.52962523703223, tolerance: 1
FAILED TEST!!! Input: (115, 57, 31), estimator: bbps_range Expected: 61.4625216994242, actual: 71.26148327750333, tolerance: 1
FAILED TEST!!! Input: (115, 59, 11), estimator: bbps_range Expected: 55.5748263680322, actual: 69.19129215839045, tolerance: 1
FAILED TEST!!! Input: (115, 59, 17), estimator: bbps_range Expected: 56.1468379204806, actual: 71.65440086539225, tolerance: 1
FAILED TEST!!! Input: (115, 59, 31), estimator: bbps_range Expected: 61.0453940208672, actual: 73.32248459927294, tolerance: 1
FAILED TEST!!! Input: (115, 61, 11), estimator: bbps_range Expected: 52.3298372396839, actual: 69.27668433458228, tolerance: 1
FAILED TEST!!! Input: (115, 61, 17), estimator: bbps_range Expected: 55.5156294625955, actual: 71.69893589685341, tolerance: 1
FAILED TEST!!! Input: (115, 61, 31), estimator: bbps_range Expected: 60.6848362813686, actual: 73.33574011077893, tolerance: 1
FAILED TEST!!! Input: (115, 63, 11), estimator: bbps_range Expected: 53.2480052964781, actual: 69.28915281499005, tolerance: 1
FAILED TEST!!! Input: (115, 63, 17), estimator: bbps_range Expected: 55.6576324035605, actual: 71.66372843490979, tolerance: 1
FAILED TEST!!! Input: (115, 63, 31), estimator: bbps_range Expected: 60.4173041874188, actual: 73.26294591303567, tolerance: 1
FAILED TEST!!! Input: (115, 65, 11), estimator: bbps_range Expected: 52.0632288281409, actual: 67.78063127741946, tolerance: 1
FAILED TEST!!! Input: (115, 65, 17), estimator: bbps_range Expected: 54.6369316974536, actual: 71.54921777503041, tolerance: 1
FAILED TEST!!! Input: (115, 65, 31), estimator: bbps_range Expected: 60.1941068177734, actual: 73.10421822608339, tolerance: 1
```

Notice a few things here:

  1. The expected output is preserved across all three runs. That's why I said the change to cost.sage apparently works.
  2. The differences between the expected and actual complexities are really big in both bbps_1 and bbps_range (~3 for bbps_1 and ~15 for bbps_range).
  3. If you compare the first and second logs (or the second and third) at the item Input: (100, 54, 31), estimator: bbps_range, you will notice that the second log has inf as its actual output.

Please let me know if I have misunderstood something in your comments, but these look like new problems to me.
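For comparing runs like the ones above without reading them by eye, a small script can parse the failure records and list the inputs whose actual value changed. This is only a sketch: the function names are hypothetical, the regular expression assumes the record layout shown in the logs, and the two sample strings reuse records from the first and second "UNcommented" runs.

```python
import math
import re

# One failure record, in the layout printed in the logs above.
RECORD = re.compile(
    r"Input: \((?P<params>[^)]*)\), estimator: (?P<estimator>\w+)\s+"
    r"Expected: (?P<expected>[^,\s]+), actual: (?P<actual>[^,\s]+), "
    r"tolerance: (?P<tolerance>[^,\s]+)"
)


def parse_failures(log_text):
    """Map (params, estimator) to (expected, actual) for every record found."""
    failures = {}
    for match in RECORD.finditer(log_text):
        key = (match["params"], match["estimator"])
        failures[key] = (float(match["expected"]), float(match["actual"]))
    return failures


def differing_entries(run_a, run_b):
    """Keys present in both runs whose actual value is not the same."""
    shared = run_a.keys() & run_b.keys()
    return sorted(key for key in shared if run_a[key][1] != run_b[key][1])


first_run = parse_failures(
    "FAILED TEST!!! Input: (200, 100, 17), estimator: bbps_1 "
    "Expected: 88.01232020398258, actual: 85.98059676215709, tolerance: 0.1 "
    "FAILED TEST!!! Input: (100, 54, 31), estimator: bbps_range "
    "Expected: 54.7275915865044, actual: inf, tolerance: 1"
)
second_run = parse_failures(
    "FAILED TEST!!! Input: (200, 100, 17), estimator: bbps_1 "
    "Expected: 88.01232020398258, actual: 85.98059676215709, tolerance: 0.1 "
    "FAILED TEST!!! Input: (100, 54, 31), estimator: bbps_range "
    "Expected: 54.7275915865044, actual: 62.9695081119894, tolerance: 1"
)

changed = differing_entries(first_run, second_run)
print(changed)
print(math.isinf(first_run[("100, 54, 31", "bbps_range")][1]))
```

The same parsed dictionaries can also be used to compute the gap between expected and actual per estimator.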

FloydZ commented 2 weeks ago

ok, had a look into the new problem. The current values in KATS.yml are correct; I checked a few values against the online estimator. This means the output of the sage script is correct and the output of the new Python implementation is faulty. Will check it. The problem only affects BBPS.

FloydZ commented 2 weeks ago

ok, fixed it. There were a few typos in tests/internal_estimators/le.py: calls to the wrong function and with wrong arguments.

Dioprz commented 2 weeks ago

Thank you so much for looking at the problem, and for the solution commit @FloydZ. I will make a cleanup and merge develop, and this will be ready to merge @Javierverbel :rocket:.

sonarcloud[bot] commented 2 weeks ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
