verilator / verilator

Verilator open-source SystemVerilog simulator and lint system
https://verilator.org
GNU Lesser General Public License v3.0

V3Hashed.cpp Called isIdentical on non-hashed nodes, from Gate dedupe() #1475

Closed veripoolbot closed 5 years ago

veripoolbot commented 5 years ago

Author Name: Øyvind Harboe Original Redmine Issue: 1475 from https://www.veripool.org

Original Assignee: Wilson Snyder (@wsnyder)


I'm getting the error below in BOLD when I invoke Verilator. DressRehearsalTestBench.v is generated by Chisel: https://chisel.eecs.berkeley.edu/

I'm not sure how to proceed with this bug-report as I haven't been able to reduce DressRehearsalTestBench.v to the point where I can include it in the bug-report.

$ verilator --cc DressRehearsalTestBench.v --assert -Wno-fatal -Wno-WIDTH -Wno-STMTDLY -O1 --top-module DressRehearsalTestBench +define+TOP_TYPE=VDressRehearsalTestBench +define+PRINTF_COND=\!DressRehearsalTestBench.reset +define+STOP_COND=\!DressRehearsalTestBench.reset -CFLAGS "-Wno-undefined-bool-conversion -O1 -DTOP_TYPE=VDressRehearsalTestBench -DVL_USER_FINISH -include VDressRehearsalTestBench.h" -Mdir /[deleted]/test_run_dir/dressrehearsal -f /[deleted]/dressrehearsal/black_box_verilog_files.f --exe /[deleted]/test_run_dir/dressrehearsal/DressRehearsalTestBench-harness.cpp --trace
%Error: Internal Error: DressRehearsalTestBench.v:18956: *../V3Hashed.cpp:134: Called isIdentical on non-hashed nodes*
%Error: Internal Error: See the manual and http://www.veripool.org/verilator for more assistance.
%Error: Command Failed /usr/local/bin/verilator_bin --cc DressRehearsalTestBench.v --assert -Wno-fatal -Wno-WIDTH -Wno-STMTDLY -O1 --top-module DressRehearsalTestBench \+define\+TOP_TYPE\=VDressRehearsalTestBench \+define\+PRINTF_COND\=\!DressRehearsalTestBench.reset \+define\+STOP_COND\=\!DressRehearsalTestBench.reset -CFLAGS -Wno-undefined-bool-conversion\ -O1\ -DTOP_TYPE\=VDressRehearsalTestBench\ -DVL_USER_FINISH\ -include\ VDressRehearsalTestBench.h -Mdir /[deleted]/test_run_dir/dressrehearsal -f /[deleted]//test_run_dir/dressrehearsal/black_box_verilog_files.f --exe /[deleted]/test_run_dir/dressrehearsal/DressRehearsalTestBench-harness.cpp --trace
veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-06-27T18:21:11Z


This is with 4.016 as well as 4.014:

$ verilator --version
Verilator 4.014 2019-05-08 rev UNKNOWN_REV

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Wilson Snyder (@wsnyder) Original Date: 2019-06-27T22:04:15Z


A similar error was recently traced to parameters. Are there parameters in DressRehearsalTestBench.v?

What is near DressRehearsalTestBench.v line 18956?

If reducing the code makes the bug go away, try the -Oi flag to disable inlining. (The smaller code might then get inlined, hiding the bug.)

Try using --debug, and post roughly the last 10 lines of the output.

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-06-27T22:13:40Z


     if (_T_40457) begin
       routeOut_1 <= io_leArray_routeOut_1;
     end // line 18956
veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-06-27T22:17:00Z


The parameterization happens in Chisel, which emits unparameterized Verilog, so the only parameterization I have is in a blackbox Verilog file for an inferred RAM. It's nowhere near the failure location.

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-06-27T22:21:37Z


This is the only other file in the design in addition to DressRehearsalTestBench.v

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-06-27T22:42:51Z


I was able to run a bisection to find the first failing version of Verilator:

$ git bisect bad
597d28b505ec16f571685347a0050b279ad5781f is the first bad commit
commit 597d28b505ec16f571685347a0050b279ad5781f
Author: Wilson Snyder <wsnyder@wsnyder.org>
Date:   Thu Feb 1 21:32:58 2018 -0500

     Fix internals to make null-pointer-check clean. Also add more const's. No functional change intended, but likely something will break.

:100644 100644 13dc1c06c4f317112f803c1d862fb3be413cf9fc cf9ec6f382c4290fc0014026d2d60edd44754dee M      Changes
:100644 100644 bdab2c727715268b14c46d18df3d11d125868b20 e6e079c4417b9fd6c7c195a29761ee9440faecfe M      configure.ac
:040000 040000 15ec2fa9ddf78dd976b305ce718e9efac2a97d83 4af235d50eb572bd88ce50a2e04ecafcf13bca4b M      src
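
For readers unfamiliar with what the bisection above does: conceptually it is a binary search over the commit history for the first commit at which the failure appears. The following is only a minimal sketch of that idea, assuming a strictly linear history. The names `first_bad_commit` and `is_bad` are hypothetical; in practice `is_bad` would stand for "build Verilator at this commit, run it on the failing design, and check for the internal error".

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the oldest is good,
    the newest is bad, and once a commit is bad all later ones are too.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return commits[hi]
```

Each step halves the remaining range, so even a history of thousands of commits needs only a dozen or so builds. The real `git bisect` also copes with merges and skipped commits, which this sketch does not.
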
veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Wilson Snyder (@wsnyder) Original Date: 2019-06-27T22:54:24Z


Bisecting was a good idea. Unfortunately, that was a very large change, though I wouldn't have expected those edits to result in this error.

What does "verilator --debug --gdbbt" give as a backtrace?

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-06-27T23:28:58Z


Is it possible to split that commit into a sequence of commits, purely for the purpose of bisecting it further, or is it necessarily a single big commit?

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-06-27T23:34:31Z


Here are the last lines when running with "--debug --gdbbt":

- V3Ast.cpp:1050:     Dumping /home/[deleted]/test_run_dir/dressrehearsal/VDressRehearsalTestBench_083_subst.tree
- V3Const.cpp:2625:   constifyCpp: 
- V3Ast.cpp:1050:     Dumping /home/[deleted]/test_run_dir/dressrehearsal/VDressRehearsalTestBench_084_const_cpp.tree
- V3Dead.cpp:451:     deadifyAll: 
- V3Ast.cpp:1050:     Dumping /home/[deleted]/test_run_dir/dressrehearsal/VDressRehearsalTestBench_085_deadAll.tree
- V3Reloop.cpp:260:   reloopAll: 
- V3Depth.cpp:177:    depthAll: 
- V3Branch.cpp:138:   branchAll: 
- V3Cast.cpp:184:     castAll: 
- V3Ast.cpp:1050:     Dumping /home/[deleted]/test_run_dir/dressrehearsal/VDressRehearsalTestBench_088_cast.tree
- V3CCtors.cpp:148:   cctorsAll: 
- V3EmitCInlines.cpp:94:emitcInlines: 
- V3EmitCSyms.cpp:664:emitcSyms: 
- V3EmitC.cpp:3207:   emitcTrace: 
- V3EmitC.cpp:3193:   emitc: 
- V3EmitXml.cpp:334:  emitxml: 
- V3StatsReport.cpp:241:statsReport: 
- V3EmitMk.cpp:247:   emitmk: 
- V3Os.cpp:57:        export VERILATOR_ROOT=/usr/local/share/verilator # Hardcoded at build time
- V3Ast.cpp:1050:     Dumping /home/[deleted]/test_run_dir/dressrehearsal/VDressRehearsalTestBench_990_final.tree
- Verilator.cpp:664:  Done, Exiting...
[Inferior 1 (process 9814) exited normally]
No stack.
veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Wilson Snyder (@wsnyder) Original Date: 2019-06-28T00:06:53Z


The logfile you attached showed no errors. Perhaps you're still in the passing bisect?

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-06-28T07:36:23Z


I'm afraid I only see the crash without "--debug --gdbbt"; I checked.

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-06-28T07:36:56Z


I tried with valgrind: the crash was reproducible, but valgrind didn't report anything.

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Wilson Snyder (@wsnyder) Original Date: 2019-06-28T09:59:53Z


Presumably it's --debug that is hiding the issue as the code compiles differently.

Try --debugi 3. Also edit line 135 of V3Hashed.cpp with attached patch.

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-06-28T10:33:36Z


Output w/patch:

FIXME1: VARREF 0x56443a087580 <e2076892#> {e18755} u4=0x1651682 @dt=0x5644384fc430@(G/w1)  __PVT__leArrayScanChain_io_leArray_enable [RV] <- VARSCOPE 0x56443d41a5a0 <e1041076> {e27500} u1=0x56443aa6a9d0 u2=0x1 @dt=0x5644384fc430@(G/w1)  TOP.DressRehearsalTestBench.aptos->__PVT__leArrayScanChain_io_leArray_enable -> VAR 0x564439ea87e0 <e621691> {e27500} @dt=0x5644384fc430@(G/w1)  __PVT__leArrayScanChain_io_leArray_enable WIRE
FIXME2: VARREF 0x564439c1b350 <e2077116#> {e18749} @dt=0x5644384fc430@(G/w1)  __PVT__leArrayScanChain_io_leArray_enable [RV] <- VARSCOPE 0x56443d41a5a0 <e1041076> {e27500} u1=0x56443aa6a9d0 u2=0x1 @dt=0x5644384fc430@(G/w1)  TOP.DressRehearsalTestBench.aptos->__PVT__leArrayScanChain_io_leArray_enable -> VAR 0x564439ea87e0 <e621691> {e27500} @dt=0x5644384fc430@(G/w1)  __PVT__leArrayScanChain_io_leArray_enable WIRE
%Error: Internal Error: fail.v:18749: ../V3Hashed.cpp:143: Called isIdentical on non-hashed nodes
                         ... See the manual and http://www.veripool.org/verilator for more assistance.

Output with "--debugi 3":

- V3Active.cpp:440:   activeAll: 
- V3Ast.cpp:1050:     Dumping /home/oyvind/ascenium/aptos/test_run_dir/dressrehearsal/VDressRehearsalTestBench_045_active.tree
- V3Split.cpp:1015:   splitAlwaysAll: 
- V3Ast.cpp:1050:     Dumping /home/oyvind/ascenium/aptos/test_run_dir/dressrehearsal/VDressRehearsalTestBench_046_split.tree
- V3SplitAs.cpp:219:  splitAsAll: 
- V3Ast.cpp:1050:     Dumping /home/oyvind/ascenium/aptos/test_run_dir/dressrehearsal/VDressRehearsalTestBench_047_splitas.tree
- V3TraceDecl.cpp:333:traceDeclAll: 
- V3Ast.cpp:1050:     Dumping /home/oyvind/ascenium/aptos/test_run_dir/dressrehearsal/VDressRehearsalTestBench_048_tracedecl.tree
- V3Gate.cpp:1540:    gateAll: 
dot -Tpdf -o ~/a.pdf /home/oyvind/ascenium/aptos/test_run_dir/dressrehearsal/VDressRehearsalTestBench_049_gate_simp.dot
FIXME1: VARREF 0x560c8d9408b0 <e2076892#> {e18755} u4=0x1aa5e3a @dt=0x560c8c948c20@(G/w1)  __PVT__leArrayScanChain_io_leArray_enable [RV] <- VARSCOPE 0x560c9195dcd0 <e1041076> {e27500} u1=0x560c8f10b780 u2=0x1 @dt=0x560c8c948c20@(G/w1)  TOP.DressRehearsalTestBench.aptos->__PVT__leArrayScanChain_io_leArray_enable -> VAR 0x560c8e2f5480 <e621691> {e27500} @dt=0x560c8c948c20@(G/w1)  __PVT__leArrayScanChain_io_leArray_enable WIRE
FIXME2: VARREF 0x560c8d8ed170 <e2077116#> {e18749} @dt=0x560c8c948c20@(G/w1)  __PVT__leArrayScanChain_io_leArray_enable [RV] <- VARSCOPE 0x560c9195dcd0 <e1041076> {e27500} u1=0x560c8f10b780 u2=0x1 @dt=0x560c8c948c20@(G/w1)  TOP.DressRehearsalTestBench.aptos->__PVT__leArrayScanChain_io_leArray_enable -> VAR 0x560c8e2f5480 <e621691> {e27500} @dt=0x560c8c948c20@(G/w1)  __PVT__leArrayScanChain_io_leArray_enable WIRE
%Error: Internal Error: fail.v:18749: ../V3Hashed.cpp:143: Called isIdentical on non-hashed nodes
-node: VARREF 0x560c8d8ed170 <e2077116#> {e18749} @dt=0x560c8c948c20@(G/w1)  __PVT__leArrayScanChain_io_leArray_enable [RV] <- VARSCOPE 0x560c9195dcd0 <e1041076> {e27500} u1=0x560c8f10b780 u2=0x1 @dt=0x560c8c948c20@(G/w1)  TOP.DressRehearsalTestBench.aptos->__PVT__leArrayScanChain_io_leArray_enable -> VAR 0x560c8e2f5480 <e621691> {e27500} @dt=0x560c8c948c20@(G/w1)  __PVT__leArrayScanChain_io_leArray_enable WIRE
                         ... See the manual and http://www.veripool.org/verilator for more assistance.
- V3Ast.cpp:1050:     Dumping /home/oyvind/ascenium/aptos/test_run_dir/dressrehearsal/VDressRehearsalTestBench_990_final.tree
- V3StatsReport.cpp:241:statsReport: 
%Error: Internal Error: Aborting since under --debug
%Error: Verilator aborted.  Consider trying --debug --gdbbt
%Error: Command Failed /usr/local/bin/verilator_bin --debugi 3 --cc fail.v --assert -Wno-fatal -Wno-WIDTH -Wno-STMTDLY -O1 --top-module DressRehearsalTestBench \+define\+TOP_TYPE\=VDressRehearsalTestBench \+define\+PRINTF_COND\=\!DressRehearsalTestBench.reset \+define\+STOP_COND\=\!DressRehearsalTestBench.reset -CFLAGS -Wno-undefined-bool-conversion\ -O1\ -DTOP_TYPE\=VDressRehearsalTestBench\ -DVL_USER_FINISH\ -include\ VDressRehearsalTestBench.h -Mdir /home/oyvind/ascenium/aptos/test_run_dir/dressrehearsal -f /home/oyvind/ascenium/aptos/test_run_dir/dressrehearsal/black_box_verilog_files.f --exe /home/oyvind/ascenium/aptos/test_run_dir/dressrehearsal/DressRehearsalTestBench-harness.cpp --trace
veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Wilson Snyder (@wsnyder) Original Date: 2019-06-28T16:45:18Z


That helps, but not enough to suggest what to fix, so I still think we need a test case. Basically, Verilator suspects that leArrayScanChain_io_leArray_enable feeds some logic that is probably duplicated and can be simplified, and it hits this error while trying to prove it.

Please see if you can make a test case that involves the signal leArrayScanChain_io_leArray_enable (probably feeding _T_40457). Another method, if you haven't tried it, is to see it fail, rip out some code, and if it still fails, keep ripping. Back up if it starts passing.
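
The rip-out-and-retry method described here can be automated. The following is a rough sketch only, not a tool shipped with Verilator: `reduce_lines` and `still_fails` are hypothetical names, and `still_fails` is assumed to write the candidate lines to a file, run Verilator on it, and return true only when the same internal error is reported.

```python
from typing import Callable, List


def reduce_lines(lines: List[str], still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily remove chunks of lines while the failure keeps reproducing."""
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and still_fails(candidate):
                lines = candidate  # removal kept: the failure still reproduces
            else:
                i += chunk  # back up: keep this chunk and try the next one
        chunk //= 2  # retry with smaller chunks
    return lines
```

Removing arbitrary lines will often produce Verilog that no longer parses, so the predicate should match the specific message ("Called isIdentical on non-hashed nodes") rather than treat any non-zero exit as a reproduction. As noted later in the thread, a failure that disappears under small perturbations may resist this kind of reduction.
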

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-06-28T17:10:17Z


I've tried to reduce it along the lines you describe and failed. I'm skeptical of this approach because simply enabling debug output is enough to make the problem go away, which I assume means there's some sort of dangling reference or uninitialized data. Also, I've had a great many variations of this dress rehearsal test, and only the exact version I have now gives Verilator constipation.

The failing Verilog is 2.2 MB uncompressed and 37,000 lines of code. It's a scaled-down version of what I'm building, which is going to be ~500k lines of Verilog.

I can investigate the possibility of emailing you the Verilog file privately. You could then use it to debug the problem, extract a test case for the test suite once you understand the problem, if that is possible, and delete the Verilog afterwards.

Would that be a meaningful way forward?

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Wilson Snyder (@wsnyder) Original Date: 2019-07-22T12:47:42Z


Got a test case which shows a replacement is going into a flop which itself is then subject to replacement. This is inside the "Gate dedupe() outputs" stage of optimization.

I have ugly code which can detect this case and disable further optimizations on the substituted block, but I would prefer to commit something that can continue optimizing. Unfortunately, this requires rework of the data structures.

As a temporary workaround, please try commenting out the block of code after the "Gate dedupe() outputs" print in V3Gate.cpp.
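
To make the failure mode easier to picture, here is a toy model in Python. It is not Verilator's code, and every name in it (`Node`, `hash_tree`, `is_identical`, `substitute`) is invented for illustration. It only assumes what the error message suggests: a dedupe pass hashes nodes up front, and the identity check insists that both nodes carry a hash. In the FIXME1/FIXME2 output earlier in the thread, the first VARREF prints a `u4=` value and the second does not, which appears consistent with one side being a freshly created, never-hashed node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy expression node; cached_hash stays None until hash_tree() runs."""
    op: str
    kids: List["Node"] = field(default_factory=list)
    cached_hash: Optional[int] = None


def hash_tree(node: Node) -> int:
    """Hash a node and its children bottom-up, caching each result."""
    node.cached_hash = hash((node.op, tuple(hash_tree(k) for k in node.kids)))
    return node.cached_hash


def is_identical(a: Node, b: Node) -> bool:
    """Structural comparison that insists both sides were hashed first."""
    if a.cached_hash is None or b.cached_hash is None:
        raise RuntimeError("Called is_identical on non-hashed nodes")
    return (a.cached_hash == b.cached_hash and a.op == b.op
            and len(a.kids) == len(b.kids)
            and all(is_identical(x, y) for x, y in zip(a.kids, b.kids)))


def substitute(parent: Node, index: int, replacement_op: str) -> None:
    """Swap in a freshly built child; the new node has no cached hash."""
    parent.kids[index] = Node(replacement_op)


if __name__ == "__main__":
    flop_a = Node("flop", [Node("and", [Node("x"), Node("y")])])
    flop_b = Node("flop", [Node("and", [Node("x"), Node("y")])])
    hash_tree(flop_a)
    hash_tree(flop_b)
    substitute(flop_a, 0, "t")  # a replacement lands inside flop_a
    try:
        is_identical(flop_a, flop_b)  # flop_a is itself a dedupe candidate
    except RuntimeError as err:
        print(err)
```

In this toy, calling `hash_tree(flop_a)` again after the substitution makes the comparison well defined. Whether the actual fix works along those lines is not stated in the thread.
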

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Wilson Snyder (@wsnyder) Original Date: 2019-08-04T02:01:47Z


Fixed in git towards 4.017. Thanks for the test case & patience.

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Øyvind Harboe Original Date: 2019-08-04T10:10:02Z


Thanks for fixing this. This must have been an obscure edge case, I only ran into it that once. Given the Verilator test-suite size, I would expect new tests to be more and more obscure. I've been running with quite an old version of Verilator for a year or so, and when I was finally prompted to upgrade, this was the only snag.

I'm glad it's in the automated test-suite now.

veripoolbot commented 5 years ago

Original Redmine Comment Author Name: Wilson Snyder (@wsnyder) Original Date: 2019-08-29T23:14:45Z


In 4.018.