rdaly525 / coreir

BSD 3-Clause "New" or "Revised" License
100 stars 24 forks source link

Segfault in verilog generation for camera pipeline #932

Closed dillonhuff closed 4 years ago

dillonhuff commented 4 years ago

@rdaly525 I'm generating coreir for the cgra implementation of camera pipeline (Jeff gave me basic versions of the compute units), but when I try to create verilog for the coreir implementation to test it before PnR I get an error:

dhuff@kiwi:~/clockwork$ ${COREIR_PATH}/bin/coreir --load_libs commonlib --input camera_pipeline.json --output camera_pipeline.v
Aborted (core dumped)

The error backtrace is:

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff65868b1 in __GI_abort () at abort.c:79
#2  0x00007ffff799bb75 in CoreIR::convert_value(CoreIR::Value*) () from /home/dhuff/clockwork/lib/libcoreir.so
#3  0x00007ffff79a2c0d in CoreIR::compile_module_body(CoreIR::RecordType*, CoreIR::ModuleDef*, bool, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&) () from /home/dhuff/clockwork/lib/libcoreir.so
#4  0x00007ffff79a5991 in CoreIR::Passes::Verilog::compileModule(CoreIR::Module*) ()
   from /home/dhuff/clockwork/lib/libcoreir.so
#5  0x00007ffff79a6515 in CoreIR::Passes::Verilog::runOnInstanceGraphNode(CoreIR::InstanceGraphNode&) ()
   from /home/dhuff/clockwork/lib/libcoreir.so
#6  0x00007ffff7879d91 in CoreIR::PassManager::runInstanceGraphPass(CoreIR::Pass*) ()
   from /home/dhuff/clockwork/lib/libcoreir.so
#7  0x00007ffff787a1c0 in CoreIR::PassManager::runPass(CoreIR::Pass*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&) () from /home/dhuff/clockwork/lib/libcoreir.so
#8  0x00007ffff787b162 in CoreIR::PassManager::run(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >) () from /home/dhuff/clockwork/lib/libcoreir.so
#9  0x00007ffff77b9f46 in CoreIR::Context::runPasses(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >) () from /home/dhuff/clockwork/lib/libcoreir.so
#10 0x000055555568111c in main ()

And the camera_pipeline.json file is here: https://github.com/dillonhuff/clockwork/tree/master/coreir_compute/defective

LMK if I'm doing something wrong.

rdaly525 commented 4 years ago

Confirmed, this is causing a segfault. No obvious issue with the command nor file. @leonardt looks like it is segfaulting on the verilog coreir pass. Do you mind taking a look?

rsetaluri commented 4 years ago

@dillonhuff i'm trying to repro and get seeing a different segfault actually. What branch/commit of coreir are you on? (also I'm on macos not linux)

dillonhuff commented 4 years ago

@rsetaluri I'm on branch master at:

dhuff@kiwi:~/clockwork/coreir$ git log -1
commit f9190bcfb6ce2121b6af8ecf3448a6e539a0fc86 (HEAD -> master, origin/master, origin/HEAD)
Author: Leonard (Lenny) Truong <lenny@stanford.edu>
Date:   Wed Jun 10 15:20:41 2020 -0700

    Add fix for inlining slices (#897)

    * Add fix for inlining slices

    * use sortedconnections for deterministic output

    * Revert "use sortedconnections for deterministic output"

    This reverts commit 9dfb53f87f55b15cee4bd7a0a2dd852e3253e981.

    * Revert "Revert "use sortedconnections for deterministic output""

    This reverts commit 5104e5eb548a207068308ee309c45ed0ae0dabb4.

    * Sort connection members

    * Add cacheing logic/test

    * Move slice logic into PTTraverse

    * Remove extraneous comments

    * Minor style

    * Update wire inlining logic to handle mantle wires
rsetaluri commented 4 years ago

Very weird, I'm unable to build coreir (from a clean repo checked out at f9190bcfb6ce2121b6af8ecf3448a6e539a0fc86).

In file included from /tmp/coreir/src/passes/analysis/verilog.cpp:1:
In file included from /tmp/coreir/src/../include/coreir/passes/analysis/verilog.h:4:
/Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/memory:3131:32: error: no matching constructor for initialization of 'verilogAST::Index'
    return unique_ptr<_Tp>(new _Tp(_VSTD::forward<_Args>(__args)...));
                               ^   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/coreir/src/passes/analysis/verilog.cpp:710:19: note: in instantiation of function template specialization 'std::__1::make_unique<verilogAST::Index,
      std::__1::variant<std::__1::unique_ptr<verilogAST::Identifier, std::__1::default_delete<verilogAST::Identifier> >,
      std::__1::unique_ptr<verilogAST::Attribute, std::__1::default_delete<verilogAST::Attribute> >, std::__1::unique_ptr<verilogAST::Slice,
      std::__1::default_delete<verilogAST::Slice> > >, std::__1::unique_ptr<verilogAST::NumericLiteral,
      std::__1::default_delete<verilogAST::NumericLiteral> > >' requested here
      return std::make_unique<vAST::Index>(
                  ^
/tmp/coreir/build/verilogAST-src/include/verilogAST.hpp:236:3: note: candidate constructor not viable: no known conversion from
      'variant<[3 * ...], (no argument)>' to 'variant<[3 * ...], std::unique_ptr<Index>>' for 1st argument
  Index(std::variant<std::unique_ptr<Identifier>, std::unique_ptr<Attribute>,
  ^
/tmp/coreir/build/verilogAST-src/include/verilogAST.hpp:242:3: note: candidate constructor not viable: requires single argument 'rhs', but 2 arguments
      were provided
  Index(const Index& rhs)
  ^
1 error generated.
make[2]: *** [src/CMakeFiles/coreir.dir/passes/analysis/verilog.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [src/CMakeFiles/coreir.dir/all] Error 2
make: *** [all] Error 2
rsetaluri commented 4 years ago

Let me move to unix and try there.

rsetaluri commented 4 years ago

Yeah this seems to be a broken commit of coreir. Do you have any local changes/fixes?

dillonhuff commented 4 years ago

@rsetaluri no, is this already fixed in a more recent commit?

rsetaluri commented 4 years ago

presumably, latest master is building but has a different segfault backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x000055555569c1c3 in nlohmann::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::adl_serializer>::basic_json(nlohmann::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::adl_serializer> const&) ()
(gdb) bt
#0  0x000055555569c1c3 in nlohmann::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::adl_serializer>::basic_json(nlohmann::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::adl_serializer> const&) ()
#1  0x00007ffff796f9a2 in CoreIR::convert_mem_init_param_value(CoreIR::Value*, int) () from /home/setaluri/coreir/build/lib/libcoreir.so
#2  0x00007ffff7979551 in CoreIR::Passes::Verilog::compileModuleBody(CoreIR::RecordType*, CoreIR::ModuleDef*, bool, bool, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&) ()
   from /home/setaluri/coreir/build/lib/libcoreir.so
#3  0x00007ffff797c66a in CoreIR::Passes::Verilog::compileModule(CoreIR::Module*) () from /home/setaluri/coreir/build/lib/libcoreir.so
#4  0x00007ffff797d3d2 in CoreIR::Passes::Verilog::runOnInstanceGraphNode(CoreIR::InstanceGraphNode&) ()
   from /home/setaluri/coreir/build/lib/libcoreir.so
#5  0x00007ffff784e6e7 in CoreIR::PassManager::runInstanceGraphPass(CoreIR::Pass*) () from /home/setaluri/coreir/build/lib/libcoreir.so
#6  0x00007ffff784eb16 in CoreIR::PassManager::runPass(CoreIR::Pass*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&) ()
   from /home/setaluri/coreir/build/lib/libcoreir.so
#7  0x00007ffff784fab8 in CoreIR::PassManager::run(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >) ()
   from /home/setaluri/coreir/build/lib/libcoreir.so
#8  0x00007ffff778f3b0 in CoreIR::Context::runPasses(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >) ()
   from /home/setaluri/coreir/build/lib/libcoreir.so
#9  0x0000555555682f27 in main ()
dillonhuff commented 4 years ago

@dillonhuff so it still segfaults on the latest master but with a different failure?

dillonhuff commented 4 years ago

@rsetaluri just tried it and I see that segfault.

rsetaluri commented 4 years ago

Yes that was correct. Ok that is good. Btw, where did this json come from? I want to make sure the format is compatible (unfortunately backwards compatibility is not robustly enforced for CoreIR json).

dillonhuff commented 4 years ago

@rsetaluri that file came from the earlier version of coreir. I just regenerated coreir for the app and got the same problem:

(gdb) bt
#0  0x000055555569c1c3 in nlohmann::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::adl_serializer>::basic_json(nlohmann::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::adl_serializer> const&) ()
#1  0x00007ffff796f9b2 in CoreIR::convert_mem_init_param_value(CoreIR::Value*, int) ()
   from /home/dhuff/clockwork/lib/libcoreir.so
#2  0x00007ffff7979561 in CoreIR::Passes::Verilog::compileModuleBody(CoreIR::RecordType*, CoreIR::ModuleDef*, bool, bool, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&) () from /home/dhuff/clockwork/lib/libcoreir.so
#3  0x00007ffff797c67a in CoreIR::Passes::Verilog::compileModule(CoreIR::Module*) ()
   from /home/dhuff/clockwork/lib/libcoreir.so
#4  0x00007ffff797d3e2 in CoreIR::Passes::Verilog::runOnInstanceGraphNode(CoreIR::InstanceGraphNode&) ()
   from /home/dhuff/clockwork/lib/libcoreir.so
#5  0x00007ffff784e6f7 in CoreIR::PassManager::runInstanceGraphPass(CoreIR::Pass*) ()
   from /home/dhuff/clockwork/lib/libcoreir.so
#6  0x00007ffff784eb26 in CoreIR::PassManager::runPass(CoreIR::Pass*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&) () from /home/dhuff/clockwork/lib/libcoreir.so
#7  0x00007ffff784fac8 in CoreIR::PassManager::run(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >) () from /home/dhuff/clockwork/lib/libcoreir.so
#8  0x00007ffff778f3c0 in CoreIR::Context::runPasses(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >) () from /home/dhuff/clockwork/lib/libcoreir.so
#9  0x0000555555682f27 in main ()
dillonhuff commented 4 years ago

Here is the new version: https://github.com/dillonhuff/clockwork/blob/master/coreir_compute/defective/camera_pipeline_dse1.json

rsetaluri commented 4 years ago

Got it. I'll debug this.

rsetaluri commented 4 years ago

Update: after some digging, looks like the issue is that we are trying to use a BitVector for coreir.mem "init" genarg. Unfortunately I wasn't able to track down exactly which instance it was coming from, as none of the coreir.mem instances have init...which is weird.

Continuing to debug.

rdaly525 commented 4 years ago

coreir.mem has a genarg called has_init which if true forces coreir.mem to have the modarg "init". I think there was a recentish update to the verilog for memory dealing with this initialization

rsetaluri commented 4 years ago

That is correct. The reason it's weird is because all the coreir.mem instances in the json have has_init = false.

Wondering if it's coming from a default or some external module or something.

rsetaluri commented 4 years ago

@rdaly525 looks like the issue is in the definition of rom2 in src/ir/headers/memories.hpp.

(edit: rom -> rom2)

leonardt commented 4 years ago

I can help with this, @rsetaluri any more info on the issue with that header? Here's the relevant recent change for the init parameter PR, maybe this will help: https://github.com/rdaly525/coreir/pull/900/files

IIRC, I don't think I ever tested with rom2, what's the difference between that and rom?

rsetaluri commented 4 years ago

@leonardt think we're narrowing in...keep you posted

rdaly525 commented 4 years ago

Diagnosed the issue. when generating verilog code for instantiating coreir.mem with init we are assuming we have access to the json directly. This is not the case in Roms where the jsom is passed to the Rom's init parameter. In verilog, when instantiating the memory, the verilog identifier 'init' just needs to be passed from the Rom module parameters to mem instantiation parameters.

leonardt commented 4 years ago

Makes sense, was just about to comment that. Is there a way to check the type of the parameter and see that it should just be symbolic? Right now the code assumes it's JSON, but maybe that symbolic parameter has a different type we can dispatch on?

rsetaluri commented 4 years ago

Yup exactly. In addition we need to do the json -> bit literal conversion for the parent (rom2) as well. Trying it out.

leonardt commented 4 years ago

Great, let me know if you need any help with it, thanks for looking into this.

rsetaluri commented 4 years ago

@dillonhuff is this snippet of the json correct?

          "curve$1":{
            "genref":"memory.rom2",
            "genargs":{"depth":["Int",256], "width":["Int",16]},
            "modargs":{"init":["Json",{"init":[0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,6,6,6,6,7,7,7,7,8,8,8,8,9,9,9,9,10,10,10,10,11,11,11,11,12,12,12,12,13,13,13,13,14,14,14,14,15,15,15,15,16,16,16,16,17,17,17,17,18,18,18,18,19,19,19,19,20,20,20,20,21,21,21,21,22,22,22,22,23,23,23,23,24,24,24,24,25,25,25,25,26,26,26,26,27,27,27,27,28,28,28,28,29,29,29,29,30,30,30,30,31,31,31,31,32,32,32,32,33,33,33,33,34,34,34,34,35,35,35,35,36,36,36,36,37,37,37,37,38,38,38,38,39,39,39,39,40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,44,44,44,44,45,45,45,45,46,46,46,46,47,47,47,47,48,48,48,48,49,49,49,49,50,50,50,50,51,51,51,51,52,52,52,52,53,53,53,53,54,54,54,54,55,55,55,55,56,56,56,56,57,57,57,57,58,58,58,58,59,59,59,59,60,60,60,60,61,61,61,61,62,62,62,62,63,63,63]}]}
          },

note that the json for "init" modarg, has "init" as a field. I don't think we're expecting that

rsetaluri commented 4 years ago

for example for coreir.mem we expect the json to be like:

             "modargs":{"init":["Json",[5,0,21,11]]}
rsetaluri commented 4 years ago

@dillonhuff Ok if I remove that double nesting (and with the required CoreIR changes), I'm able to get verilog (emailed it cuz too big for github).

dillonhuff commented 4 years ago

@rsetaluri aah I see. @jeffsetter actually generates the rom2 instances, so I am not actually in control of that part of the json formatting.

jeffsetter commented 4 years ago

Yea, the format I've used for creating the JSON argument has always had these double nesting. In the past I've had issues indexing the top-level json with an int, but it appears to work right now without the ["init"].

rsetaluri commented 4 years ago

Got it. Can we generate them without the double nesting now?

jeffsetter commented 4 years ago

Yes, it appears to work right now without the ["init"].