IEDB / TCRMatch

Other
26 stars 12 forks source link

Buggy results when compiled with Intel oneAPI DPC++/C++ Compiler #43

Closed raphaeltrevizani closed 1 year ago

raphaeltrevizani commented 1 year ago

Changing the compiler to icpx (Intel's compiler, previously called icc) gave me different results, most with a score greater than 1 which shouldn't happen). The results obtained with clang++ and g++ look fine and match each other.

Steps to reproduce:

Compile with Intel's compiler icpx -fopenmp -O3 src/tcrmatch.cpp -o tcrmatch-icpx

Run it: ./tcrmatch-icpx -i test.txt -t1

To compare with the expected result: Compile with g++ g++ -fopenmp -O3 src/tcrmatch.cpp -o tcrmatch-g++

Run it: ./tcrmatch-g++ -i test.txt -t1

The outputs I get are attached for each case. The input file (test.txt) contains a peptide from the database so its score is 1 in the output files. Interestingly, the result obtained by icpx gives exactly 1 for this peptide, but it gives scores with values greater than 1 to others. Tested on Ubuntu server 2018 using Intel(R) oneAPI DPC++/C++ Compiler 2023.0.0

output.g++.txt output.icpx.txt test.txt

bpeters42 commented 1 year ago

WOW! That is crazy. Definitely needs to be investigated, and it does not seem like an easy thing to follow up on. One suggestion I would have is to google for other people reporting differences in the compilers. And to make sure that the processors we are running it on are indeed the intel processors you have. Not sure how helpful this is. Please also reach out to others (including Morten) that have more actual experience than me with different compilers in this century.

acrinklaw commented 1 year ago

This most likely has to do with the optimizations the compiler makes when passing the -O3 flag. These optimizations are highly specific to Intel architecture for their platform and they give the following advice: Intel produces compilers that produce highly optimized code for their CPUs. As with all compilers, programs compiled with optimization should have their output double-checked for accuracy. If the numeric output is incorrect or lacks the desired accuracy less-aggressive compile options should be tried.

Try seeing if removing -O3 produces valid results. You could also try dropping it to something less aggressive. I had originally included the O3 parameter for g++ as I wanted to squeeze as much savings on runtime as possible. This also raises a point that end-to-end testing might be useful with a hand calculated result as a baseline 😨

raphaeltrevizani commented 1 year ago

Thanks for the input, @acrinklaw! Removing the optimization was one of the first things I tested, but without any success. I double checked it now and noticed that compiling with icc also gives different results when -O3 is used vs when -O3 is not used. When g++ is used, the results are the same for both. I suspect some uninitialized variable or I/O operations from unnalocated areas, as g++ usually deals with those pretty well.

raphaeltrevizani commented 1 year ago

I located where the bug is, but I'm not able to fix it because there doesn't seem to be anything wrong with the code. Henryk's comment in this link matches what I observed. I'll report more later after running more tests, but I'm convinced there's nothing wrong with TCRmatch. This seems to be a problem with Intel's compiler and how it handles std::array.

danielmarrama commented 1 year ago

May I suggest we add unit tests to the TCRMatch repo to capture stuff like this? You can run multiple compilers and check the expected output within GitHub.

acrinklaw commented 1 year ago

May I suggest we add unit tests to the TCRMatch repo to capture stuff like this? You can run multiple compilers and check the expected output within GitHub.

yeah I had meant to do this 2 years back but you know how things go...

I just made a branch "feature/unit_tests" that incorporates catch2 and a GitHub action to get some basic CI/CD going.

I made a dummy unit test that should fail just as an example image

I'd be more than happy to keep working on this in my free time. Just let me know if you want to coordinate @raphaeltrevizani I don't want to overstep any bounds or make the code harder for you to work on

Also I'd love to improve the code...start breaking it out in a more OOP way and making it more readable/maintainable. I did not have a fun time trying to read my old code

raphaeltrevizani commented 1 year ago

Replacing the std::array<std:array<float,20>, 20> by basic c-arrays takes care of the problem and the outputs now match when the code is compiled with icpx and g++. It is indeed a problem with Intel's compiler as we suspected. I uploaded the code with c-arrays to the c_array branch and I'm closing this issue. Thanks for the help, all!