avadhpatel / marss

PTLsim and QEMU based Computer Architecture Research Simulator
http://www.marss86.org

Simulator aborting sometimes #67

Open b-saideepak opened 3 years ago

b-saideepak commented 3 years ago

Hi,

Whenever I run a program in the simulation, it sometimes runs fine and sometimes aborts because an assertion fails. There are two or three different errors that cause the aborts. It runs fine if I repeat the execution. Is this a common issue with the simulator, or is it caused by the modifications I made to the simulator? If it is the latter, why does it run fine some of the time? Any ideas on the reason for the aborts?

Thanks for your time and any comments are welcome. Regards, Saideepak.

fitzfitsahero commented 3 years ago

There are hundreds of asserts in the code base. It is next to impossible for us to help without more information.

b-saideepak commented 3 years ago

Hi,

Thank you for your reply. Here are the abort reasons that I get sometimes.

  1. ptlsim/build/core/ooo-core/ooo-pipe.cpp:2060: int ooo::ReorderBufferEntry::commit(): Assertion `ctx.get_cs_eip() == uop.rip' failed. Aborted (core dumped)

  2. ptlsim/build/core/ooo-core/ooo.cpp:938: bool ooo::OooCore::runcycle(void): Assertion `0' failed. Aborted (core dumped)

  3. It also aborts because of no commits; the message says the reason could be a deadlocked pipeline.
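To see which of these three failures occurs in a given run, the simulator's output can be matched against the messages listed above. The following is only a rough sketch: `classify_abort` is a hypothetical helper name, the log file is assumed to be wherever the simulator's stderr was redirected, and the `deadlock` pattern is a guess at the wording of the third message, which is not quoted in full here.

```shell
#!/bin/sh
# classify_abort LOGFILE
# Prints one label for the first known abort signature found in LOGFILE.
# The patterns for cases 1 and 2 follow the assertion messages quoted above;
# the "deadlock" pattern is an assumption, since that message is not quoted.
classify_abort() {
    log=$1
    if grep -q 'ooo-pipe\.cpp:.*ReorderBufferEntry::commit.*Assertion' "$log"; then
        echo "commit-rip-mismatch"
    elif grep -q 'ooo\.cpp:.*OooCore::runcycle.*Assertion' "$log"; then
        echo "runcycle-assert"
    elif grep -qi 'deadlock' "$log"; then
        echo "no-commit-deadlock"
    else
        echo "unknown"
    fi
}
```

For example, if a run's stderr was saved to `run.log` (an assumed file name), `classify_abort run.log` prints one label for that run, which makes it easier to report which failure appeared and how often.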

If it helps, I am simulating an out-of-order core with two threads. I use the following shell script to run the processes simultaneously:

    taskset -c 0 ./prog1 & taskset -c 1 ./prog2

Just to mention again, the script sometimes runs to completion, and sometimes the simulation aborts for one of the above reasons. Any ideas whether there is a mistake on my side?
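Because the failure is intermittent, it may help to report how often it happens. The sketch below assumes the whole simulation can be started non-interactively by a single command; `count_failures` and the `./run_sim.sh` wrapper mentioned afterwards are hypothetical names, not part of the simulator.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]
# Runs CMD N times and prints how many of those runs exited with a
# non-zero status. Output of CMD is discarded in this sketch.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # A process killed by SIGABRT (a failed assert) is reported by the
        # shell as a non-zero status, so it is counted as a failure too.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

With such a wrapper in place, `count_failures 20 ./run_sim.sh` would print the number of aborted runs out of 20, which can be compared between the modified and unmodified simulator.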

Thanks for your time and any comments are welcome. Regards, Saideepak.

fitzfitsahero commented 3 years ago

Have you run your simulations with unaltered code? And what parts of the code have you modified?

b-saideepak commented 3 years ago

Hi, I have modified the cache module in the simulator. However, I observed that these aborts also occur with the unmodified simulator (the default, downloaded from GitHub). With the unmodified simulator, the only addition I made was my own machine configuration. Here is the machine that I used:

mymachine:
    description: Out of order cores with 2 threads
    min_contexts: 2
    cores:
      - type: ooo
        name_prefix: ooo_
        option:
            threads: 2
    caches:
      - type: l1_32k
        name_prefix: L1_I_
        insts: $NUMCORES
        option:
            private: true
      - type: l1_32k
        name_prefix: L1_D_
        insts: $NUMCORES
        option:
            private: true
      - type: l2_256k
        name_prefix: L2_
        insts: $NUMCORES
        option:
            private: true
            last_private: true
      - type: l3_3M
        name_prefix: L3_
        insts: 1
    memory:
      - type: dram_cont
        name_prefix: MEM_
        insts: 1 # Single DRAM controller
        option:
            latency: 50
    interconnects:
      - type: p2p
        connections:
          - core_$: I
            L1_I_$: UPPER
          - core_$: D
            L1_D_$: UPPER
          - L1_I_$: LOWER
            L2_$: UPPER
          - L1_D_$: LOWER
            L2_$: UPPER2
          - L3_0: LOWER
            MEM_0: UPPER
      - type: split_bus
        connections:
          - L2_*: LOWER
            L3_0: UPPER 

I came up with this machine with the help of the sample machine given in the documentation. Could you kindly check whether there are any mistakes in the machine configuration?