dafny-lang / dafny

Dafny is a verification-aware programming language
https://dafny.org
Other
2.87k stars 255 forks source link

Unstable test: System.NullReferenceException at Microsoft.Boogie.ReadOnlyVisitor.VisitVariable #4387

Open keyboardDrummer opened 1 year ago

keyboardDrummer commented 1 year ago

Run: https://github.com/dafny-lang/dafny/actions/runs/5760643103/job/15617339950?pr=4357

[xUnit.net 00:19:12.55]   Finished:    IntegrationTests
  Failed VSI-Benchmarks/b6.dfy [14 s]
  Error Message:
   System.Exception : Command returned non-zero exit code (134): dotnet /home/runner/work/dafny/dafny/dafny/Source/IntegrationTests/bin/Debug/net6.0/Dafny.dll /vcsCores:2 /useBaseNameForFileName /compileVerbose:0 /timeLimit:300 /showSnippets:0 /compile:0 TestFiles/LitTests/LitTest/VSI-Benchmarks/b6.dfy > /home/runner/work/dafny/dafny/dafny/Source/IntegrationTests/bin/Debug/net6.0/TestFiles/LitTests/LitTest/VSI-Benchmarks/Output/b6.dfy.tmp
Output:

Error:
Unhandled exception. System.AggregateException: One or more errors occurred. (One or more errors occurred. (Object reference not set to an instance of an object.))
 ---> System.AggregateException: One or more errors occurred. (Object reference not set to an instance of an object.)
 ---> System.NullReferenceException: Object reference not set to an instance of an object.
   at Microsoft.Boogie.ReadOnlyVisitor.VisitVariable(Variable node)
   at Microsoft.Boogie.ReadOnlyVisitor.VisitVariableSeq(List`1 variableSeq)
   at Microsoft.Boogie.ReadOnlyVisitor.VisitImplementation(Implementation node)
   at Microsoft.Boogie.Implementation.StdDispatch(StandardVisitor visitor)
   at Microsoft.Boogie.ReadOnlyVisitor.Visit(Absy node)
   at Microsoft.Boogie.ReadOnlyVisitor.VisitDeclarationList(List`1 declarationList)
   at Microsoft.Boogie.ReadOnlyVisitor.VisitProgram(Program node)
   at Microsoft.Boogie.Program.StdDispatch(StandardVisitor visitor)
   at Microsoft.Boogie.ReadOnlyVisitor.Visit(Absy node)
   at Microsoft.Boogie.BasicTypeVisitor..ctor(Absy absy)
   at Microsoft.Boogie.Checker.Target(Program prog, ProverContext ctx, Split split)
   at VC.CheckerPool.PrepareChecker(Program program, Split split, Checker checker)
   at VC.CheckerPool.FindCheckerFor(ConditionGeneration vcgen, Split split, CancellationToken cancellationToken)
   at VC.SplitAndVerifyWorker.DoWork(Int32 iteration, Split split, CancellationToken cancellationToken)
   at VC.SplitAndVerifyWorker.DoWorkForMultipleIterations(Split split, CancellationToken cancellationToken)
   at VC.SplitAndVerifyWorker.WorkUntilDone(CancellationToken cancellationToken)
   at VC.VCGen.VerifyImplementation(ImplementationRun run, VerifierCallback callback, CancellationToken cancellationToken, IObserver`1 batchCompletedObserver)
   at VC.ConditionGeneration.VerifyImplementation(ImplementationRun run, IObserver`1 batchCompletedObserver, CancellationToken cancellationToken)
   at Microsoft.Boogie.ExecutionEngine.<>c__DisplayClass47_0.<<VerifyImplementationWithoutCaching>b__0>d.MoveNext()
keyboardDrummer commented 4 months ago

The crash seems to occur because an element of node.LocVars contains a null. This could happen even if no actual null is added, if the collection is accessed while an element is being added, given that the definition of Add updates _size before adding the item:

        public void Add(T item)
        {
            _version++;
            T[] array = _items;
            int size = _size;
            if ((uint)size < (uint)array.Length)
            {
                _size = size + 1;
                array[size] = item;
            }
            else
            {
                AddWithResize(item);
            }
        }

We could add some debugging code to verify whether this is the issue, but then the next question would be, where is the concurrent access coming from? I would expect a program that is being targeted (by Microsoft.Boogie.Checker.Target) to already have finished being processed, and not having any changes be made to it.

keyboardDrummer commented 3 months ago

Next step could be for node.LocVars to use a custom IList that can detect a concurrent access and then throws an exception in the thread that's modifying the list (as opposed to the accessing thread for which we already have the stacktrace), so then we know where the modification is coming from.

Problem would still be triggering this instrumentation since the problem occurs very rarely in Dafny's CI tests. I think we might have to put the instrumentation code behind a toggleable option that we turn on in CI, and leave there for a long time.

MikaelMayer commented 3 months ago

As @keyboardDrummer referenced it, this issue reappeared recently in another test: https://github.com/dafny-lang/dafny/issues/5396