I'd rather maintain it here if possible. On which network does the build fail?
It is currently failing on all three networks.
I guess my preference is to rollback.
My preference is to fix it here. I don’t consider the 3.x engine to be operational, and I have a pressing need for this feature in the 2.x engine. I’m thinking that there may be some platform differences also affecting the regression tests. I will investigate. I have a little time later this week.
We had a thread on testing - #16 (now closed). There we discussed ways to compare results of new versions, and I think that the way the tests are done now (using a fixed threshold of 1e-6 for all properties) is not what we intended. The last build failed on Travis CI because the fifth decimal place was different.
@eladsal Consider familiarizing yourself with the new regression testing framework taking shape in the tools directory of the feature-nrtest branch. It has been designed to allow easy development of alternate comparison routines, in part because of the discussion on issue #16 that you are referring to.
For clarification, it is the build in the feature-nrtest branch that is currently failing because the WQ results are different.
Will do. It wasn't clear to me which tests are relevant for this thread.
I had to run my VB test code to find that the main reason the tests are probably failing is that MAXMSG is defined differently in the two versions. Up to the dev branch MAXMSG was 79, and in nrtest it is 255. This probably shifted the entire binary output file. However, there are still some differences.
Funny, but this is what Brad was looking for :)
Wow, it completely did not occur to me that MAXMSG would be assumed in the binary format... but of course it is.
Well, it really shouldn't be, since messages are not written to the binary output, but the same constant (MAXMSG) is used for the title strings. We should change that and add MAXTITLE.
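Something along these lines, as a sketch (MAXTITLE is the new name proposed above; the values here are illustrative, not the actual source):

```c
/* messages.h (sketch) -- keep the status/error message buffer size
   independent of the title-line length, since the latter determines
   the layout of the binary output file. */
#define MAXMSG    79   /* max. length of a status/error message     */
#define MAXTITLE  79   /* max. length of a title line in the output */
```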
@eladsal Thanks for looking into this. A few points of clarification. The variable ERR_MAXMSG is defined in errormanager.h as 256. The errormanager is the only part of the code using this variable. The variable MAXMSG is defined in messages.h as 53 to reflect the actual maximum message length.
The outputapi is being compiled as a completely separate unit from EPANET, so there is absolutely no way for these definitions of ERR_MAXMSG or MAXMSG to affect how the binary file is being written. Therefore, I do not agree with your assessment. This is not the reason why the build is currently failing. My best guess is that the build in the feature-nrtest branch is failing because of changes made in quality.c.
I am currently working on unit tests that establish the outputapi is working correctly. I will check them in when they are ready.
@michaeltryby - I think @eladsal was referring to how the binary file is written in the "2.2" development version, in output.c. If the length of a string is written as 255 but read as 53 (or whatever) via the outputapi reader (which is different code), then a problem may occur.
@samhatchett yes, that is what I meant.
@michaeltryby it is my understanding that we are comparing the binary output created by the new EPANET code to the binary output created by the trusted version of EPANET. If the two versions write the title strings to the binary output with different sizes, they must be read accordingly.
Regarding the quality.c file, you are right. There was a bug there which I fixed over the weekend on feature-wrapper (6e2a4e1). Also see 2fb20b4 for the MAXMSG fix.
@eladsal I see what you mean now. Great catch!
Yes, if the number of bytes in the title section changes, that would affect the calculation of the output start position.
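To make the mechanism concrete, here is a minimal sketch (the field counts and lengths are simplified assumptions, not the actual output.c layout):

```c
#include <stdio.h>

/* Sketch: the binary output prolog stores fixed-width title lines,
   so changing the assumed line length shifts every later offset. */
#define NINTS        15   /* leading integer fields (simplified) */
#define NTITLELINES   3   /* title lines in the prolog           */

static long prolog_bytes(int titlelen)
{
    return NINTS * 4L + NTITLELINES * (long)(titlelen + 1);
}

int main(void)
{
    /* A writer using 255-char titles vs. a reader assuming 79 chars:
       everything after the title section is off by this much. */
    printf("shift = %ld bytes\n", prolog_bytes(255) - prolog_bytes(79));
    return 0;
}
```

With these numbers the reader would look for the results section 528 bytes too early and misdecode every value after the title block.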
Expanded the test suite in PR #166. The current build is failing for a wider range of examples now.
SWMM is currently building with a green badge on AppVeyor. It would be a significant milestone for EPANET to have a green badge as well.
Hi,
I don't know why the tests are failing, but I have done some checking.
The first network which fails is 2PRVs-aquis; see here on AppVeyor. I compared the original 2.0.12 version with the one I compiled from the dev branch. Apart from the version numbers, the two rpt files are identical, including the running times (my test script managed to run the two networks within the same second) - see attached files. Also, the values for all results are exactly the same. Compilation was done with SDK 7.1, which is Visual Studio 2010 Service Pack 1 (16.00.40219).
As far as I can see, the worst error, for all networks, is at the third digit after the decimal point.
There are several possibilities that may be causing the tests to fail:
- The current benchmark version of EPANET 2.0.12 was last compiled using Visual Studio 6. AppVeyor is configured to build OWA EPANET using Visual Studio 10 (VS 6 is not available on AppVeyor).
I recompiled EPANET 2.0.12 using Visual Studio 10 2010 and created a set of benchmarks with it (located here). We can use these to eliminate differences in the benchmarks due to the VS6 --> VS10 compiler port and focus solely on finding and fixing bugs.
@michaeltryby have you compared the VS6 binary files to the VS10 files?
Yeah. 27 tests pass and 18 fail. Here are the results I got: receipt.txt
I have seen many cases in the report files where one shows a value of 0.00 and the other shows -0.00. Do you consider this a failure?
@eladsal Yes, the way the report file comparison works, that would be flagged as a failure.
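For what it's worth, the -0.00 values are a formatting artifact rather than a numerical error: an IEEE signed zero, or a tiny negative round-off residual, prints with a minus sign under %.2f:

```c
#include <stdio.h>

int main(void)
{
    printf("%.2f\n",  0.0);    /* 0.00                       */
    printf("%.2f\n", -0.0);    /* -0.00 (IEEE signed zero)   */
    printf("%.2f\n", -1e-9);   /* -0.00 (round-off residual) */
    return 0;
}
```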
When I compile the current OWA/dev branch and compare it to EPANET 2.0.12, both compiled with Visual Studio 10 2010, I get 28 tests passing and 17 failing. receipt.txt
I was hoping to see a bigger difference. :pensive:
@michaeltryby can you please post the output and report files for the 2PRVs-aquis network?
@eladsal for what version? I am juggling three different versions on my machine.
So you've got 2.0.12 on VS 6 (the original), and then 2.0.12 and 2.1 on VS 10, right? I would start with the 2.1 version.
I just updated the epanet-example-networks release. It currently contains the following benchmarks:
I meant 2.2 and not 2.1.
Just checked the 3 versions of the report file for 2PRVs-aquis using WinMerge, and these files are exactly the same apart from the run times and version numbers.
I see that there is a report summary being generated for this network; might that be the problem? Also, the footer is 3 lines, since there is a blank line between the two run-time lines.
Ok I will take a look.
@eladsal Looks like some files are failing because Summary wasn't set to No. I am working on fixing it now. Thanks for catching that!
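For reference, the switch in question lives in the [REPORT] section of the .inp file (option names per the EPANET users manual):

```
[REPORT]
 Summary   No    ;suppress the summary table so report files compare cleanly
 Status    No
```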
Eliminated some false positives, but there are still differences to investigate.
REF: epanet 2.0.12 vs6, SUT: epanet 2.0.12 vs10
32 tests passing and 13 tests failing
receipt.txt
At this point I think the majority of the differences can be attributed to the compiler changing from VS6 to VS10. I will investigate more tomorrow.
@michaeltryby I'm now looking at the 6848-68nodes network. According to the receipt.txt file there are small differences in some values, in the range of 0.0001 (absolute). I'm getting these absolute differences (which are very small):
net.txt
However, as I understand it, the test which is running uses a relative diff value rtol=0.1 (the absolute tolerance is zero). This means that the diff between the test and base values would have to be larger than 10% of the absolute base value for the test to fail. Looking again at the values in the receipt.txt file (x and y arrays), I don't see differences of that size, so I'm not sure why this is failing.
@eladsal good point! I will try and run that to ground.
@eladsal and @samhatchett, I finally got a chance to look at this. I have some good news! When I run REF: epanet 2.0.12 vs10, SUT: epanet 2.2.0dev vs10, 45 tests pass and 0 tests fail (receipt.txt). This means that 100% of the failing tests can be attributed to the VS6 to VS10 compiler port. I double-checked my testing rig to confirm, so I'm reasonably certain that this is the case.
How would you like to proceed?
@michaeltryby and @samhatchett, if you were to skip VS2010 and port all the way to a newer version of VS, which compiler would you choose? This is going to be the same question on the SWMM side.
Does this also fix the Linux build regression?
@michaeltryby good news! It was failing before, right? What was the problem?
BTW, the receipt shows that the ref version is 2.2.0dev and the test one is 2.0.12.
@samhatchett No, that does not fix the Linux build regression. @eladsal Yeah, they are reversed, but it doesn't make a difference. The result is the same either way.
@bemcdonnell great question. We should discuss and come to a consensus decision.
CI is running on this repo to help us maintain quality control. The build is currently failing! Maintaining the build in a passing state should be a high priority of the project.
Closer examination of the regression tests indicates that the water quality results have changed relative to benchmark version 2.0.12. I believe this may be related to code modifications made for sequential hydraulic and water quality simulation - i.e., "LemonTiger".
Do you have any preferences on how this issue should be resolved? Sequential WQ simulation is already implemented in the epanet-dev repo. Should we roll back or fix and maintain it here?