vprover / vampire

The Vampire Theorem Prover
https://vprover.github.io/
Other
283 stars 49 forks source link

UWA test fails on linux build #590

Open Angular-Angel opened 2 weeks ago

Angular-Angel commented 2 weeks ago

I built vampire, and was trying to run it on the intro problems from ASE 2017, and it failed on the Fahrenheit to Celsius conversion - so I built a dev version and ran the tests, and sure enough, the UWA test failed. I had the ASE into problem fail with both the z3 and non z3 versions, but I only tested the z3 version of vampire.

98% tests passed, 1 tests failed out of 40

Total Test time (real) =  23.34 sec

The following tests FAILED:
      3 - UnificationWithAbstraction (Failed)
Errors while running CTest

UWATestErrors.txt

MichaelRawson commented 2 weeks ago

Interesting. We have a build script run on pull requests to do a Z3 build on a Linux server and run the unit tests, so this shouldn't happen. Any unusual configuration - hardware, Z3 version, etc? I'll do a CI run on master now to check.

MichaelRawson commented 2 weeks ago

https://github.com/vprover/vampire/actions/runs/10611771902

Angular-Angel commented 2 weeks ago

Shouldn't be anything unusual. I just built the thing using the plain instructions, as provided, on a fairly fresh install of Xubuntu. Might be I have a package somewhere that's on a newer version than vampire expects, or something?

angle@angle-machine:~/Desktop$ neofetch
           `-/osyhddddhyso/-`              angle@angle-machine 
        .+yddddddddddddddddddy+.           ------------------- 
      :yddddddddddddddddddddddddy:         OS: Xubuntu 24.04 LTS x86_64 
    -yddddddddddddddddddddhdddddddy-       Kernel: 6.8.0-41-generic 
   odddddddddddyshdddddddh`dddd+ydddo      Uptime: 5 hours, 5 mins 
 `yddddddhshdd-   ydddddd+`ddh.:dddddy`    Packages: 1996 (dpkg), 13 (snap) 
 sddddddy   /d.   :dddddd-:dy`-ddddddds    Shell: bash 5.2.21 
:ddddddds    /+   .dddddd`yy`:ddddddddd:   Resolution: 1920x1080 
sdddddddd`    .    .-:/+ssdyodddddddddds   DE: Xfce 4.18 
ddddddddy                  `:ohddddddddd   WM: Xfwm4 
dddddddd.                      +dddddddd   WM Theme: Greybird 
sddddddy                        ydddddds   Theme: Greybird [GTK2/3] 
:dddddd+                      .oddddddd:   Icons: elementary-xfce-dark [GTK2/3] 
 sdddddo                   ./ydddddddds    Terminal: xfce4-terminal 
 `yddddd.              `:ohddddddddddy`    Terminal Font: Monospace 12 
   oddddh/`      `.:+shdddddddddddddo      CPU: Intel i5-6500 (4) @ 3.600GHz 
    -ydddddhyssyhdddddddddddddddddy-       GPU: NVIDIA GeForce GTX 1050 Ti 
      :yddddddddddddddddddddddddy:         Memory: 2968MiB / 23990MiB 
        .+yddddddddddddddddddy+.
           `-/osyhddddhyso/-`                                      
MichaelRawson commented 2 weeks ago

Thanks for the details. That's really odd, you're slap-bang in the middle of what we test on. Could you run vampire --version for me?

MichaelRawson commented 2 weeks ago

If anything you're on the newer side: maybe with 24.04 they're shipping a new compiler or something that's breaking us. I sure hope not, though.

Angular-Angel commented 2 weeks ago
angle@angle-machine:~/Documents/Programming/C++/vampire$ ./build/bin/vampire_z3_rel_master_8371 --version
Vampire 4.9 (commit e77b9e52b on 2024-08-19 11:35:10 +0200)
Linked with Z3 4.12.3.0 79bbbf76d0c123481c8ca05cd3a98939270074d3 z3-4.8.4-7980-g79bbbf76d
% VDEBUG=0
% subsat features: VMTF VMTF_BUMP LIMITS

It's possible? It seems weird that only this one thing would break, but I don't know my C++ that well.

Angular-Angel commented 2 weeks ago

Also see the text file in the first post - it has the terminal output for the failures, including a bunch of errors that popped up.

Angular-Angel commented 2 weeks ago

And in case it's relevant, my GCC version:

gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
quickbeam123 commented 2 weeks ago

@joe-hauns , would be able to glean what could be going on? (Especially from the UWATestErrors.txt mentioned earlier?)

joe-hauns commented 1 week ago

I'm trying to reproduce the bug. Tried on macos with gcc 13.3 (13.2 is not available via a my package manager), and the bug doesn't show up. @Angular-Angel can you maybe try updating to 13.3? I'm also building the current version of gcc 13 on the cluster rn, and will see whether i can reproduce the bug there.

From the error logs i cannot tell what the issue is, but I can see that it's only triggered for higher order problems, which our current master isn't super stable for anyways. So if you're running on non-higher order there should probably not be any issues. not sure whether that helps.. If you wanna run on higher order problems there is a different branch than master, @ibnyusuf probably knows which one was the last to enter TPTP...?

joe-hauns commented 1 week ago

I also don't run into any issues when running gcc 13.3.1 (built from source) on linux.

Angular-Angel commented 1 week ago

Hmm. Lemme try and see if I can upgrade gcc, then. :/

ibnyusuf commented 1 week ago

I am not surprised that we are hitting some issues with higher-order reasoning on master branch. As @joe-hauns mentioned, higher-order on current master is not very well supported.

@Angular-Angel , if you are interested in running on higher-order problems, please use this branch for the time being.

We are in the process of merging it into master, but for the time being it is the most stable branch to support higher-order I think.

ibnyusuf commented 1 week ago

@Angular-Angel could you tell me a little more about the "intro problems from ASE 2017"? How can this problem set be accessed?

Angular-Angel commented 1 week ago

It's right here!

https://github.com/vprover/ase17tutorial/tree/master/intro

ibnyusuf commented 1 week ago

It's right here!

https://github.com/vprover/ase17tutorial/tree/master/intro

Somehow I wasn't even aware of these! These are certainly not higher-order, so we shoudn't be failing on them.

Angular-Angel commented 1 week ago

This is the one it fails on - I think there's some higher order logic? :/

https://github.com/vprover/ase17tutorial/blob/master/intro/MSC023%3D2.p