bespoke-silicon-group / bsg_manycore

Tile based architecture designed for computing efficiency, scalability and generality

vanilla core invalid EVA access #34

Open tommydcjung opened 5 years ago

tommydcjung commented 5 years ago

What should the expected behavior be if a vanilla core accesses an invalid EVA (e.g. an address that does not map to local DMEM or to any remote address space (in-group, global))?

option1: treat it like a NOP. option2: the pipeline stalls. option3: send a fail packet to the host interface.

mrutt92 commented 5 years ago

I feel like this would usually happen because of a bug in a program? I think I’m in favor of a fail packet.

tommydcjung commented 5 years ago

option4: a CSR error register that starts as 0 and is set to 1 if an invalid address is accessed. definitely read, maybe write...

drichmond commented 4 years ago

Necrobump:

One thing I've done in some of my code is use bsg_print_hex to send an error. For example, in my circular buffer, I used bsg_print_hex(0xF1F0003). The last four digits corresponded to some unique location in the error code.
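A minimal sketch of how such an error code could be packed and unpacked, assuming a layout where the upper 16 bits tag the module and the last four hex digits identify the location. The layout and the names `err_code`, `err_module`, and `err_location` are hypothetical, not taken from bsg_manycore.

```c
#include <stdint.h>

/* Hypothetical layout: upper 16 bits tag the module, lower 16 bits
   (the last four hex digits) identify where the error was raised. */
#define ERR_TAG_SHIFT 16u
#define ERR_LOC_MASK  0xFFFFu

/* Pack a module tag and a location id into one 32-bit error code. */
static inline uint32_t err_code(uint16_t module_tag, uint16_t location) {
    return ((uint32_t)module_tag << ERR_TAG_SHIFT) | location;
}

/* Recover the module tag (upper 16 bits) from an error code. */
static inline uint16_t err_module(uint32_t code) {
    return (uint16_t)(code >> ERR_TAG_SHIFT);
}

/* Recover the location id (lower 16 bits) from an error code. */
static inline uint16_t err_location(uint32_t code) {
    return (uint16_t)(code & ERR_LOC_MASK);
}
```

On a tile, the packed value would then be passed to bsg_print_hex, and the host side could split it back into module and location when reading the log.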

mrutt92 commented 4 years ago

I still feel like option3 will make life easier for all parties. Having a sticky error register sounds OK too but on its own it would force the host to probe. A packet would alert the host where the problem occurred.

mrutt92 commented 4 years ago

Also, if we have the register maybe we should make it clear-on-read?
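To make the sticky, clear-on-read idea concrete, here is a small software model, assuming a single error bit per tile. This is only a sketch of the semantics being discussed; the type `err_csr_t` and the function names are hypothetical and do not describe any existing bsg_manycore register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of a hypothetical per-tile sticky error CSR. */
typedef struct {
    uint32_t invalid_eva; /* 0 after reset, 1 once an invalid access is seen */
} err_csr_t;

/* Reset: the register starts as 0. */
static inline void err_csr_reset(err_csr_t *csr) {
    csr->invalid_eva = 0;
}

/* Called for every modeled access; the bit is sticky, so a later
   valid access does not clear it. */
static inline void err_csr_record(err_csr_t *csr, bool access_valid) {
    if (!access_valid) {
        csr->invalid_eva = 1;
    }
}

/* Clear-on-read: return the current value and reset it in the same step. */
static inline uint32_t err_csr_read_clear(err_csr_t *csr) {
    uint32_t value = csr->invalid_eva;
    csr->invalid_eva = 0;
    return value;
}
```

With clear-on-read, a host that polls the register sees each error episode once and does not need a separate write to re-arm it. The host still has to probe every tile, which is the drawback raised above.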

tommydcjung commented 4 years ago

Not a real issue now. Just make sure that the programmer does not screw up.

mrutt92 commented 4 years ago

What's not a real issue? The programmer will screw up - whatever we come up with should make it easy for them to figure out why. I think a packet can achieve that pretty easily.

tommydcjung commented 4 years ago

It adds more hardware for something that can be easily fixed in software.

mrutt92 commented 4 years ago

Are you referring to clear-on-read or the packet?

tommydcjung commented 4 years ago

Either solution can be quite expensive. Say there is a 64x32 array; now you have 2048 of those. It also creates a slippery slope. Programmers will be like "now I want this. now I want that". It never ends.

mrutt92 commented 4 years ago

I don't think the slippery slope argument is a particularly strong one. "Just think - they might actually ask for something they can debug"

As for the expense... if we don't care to support debugging on anything but simulation then you have a point. And maybe we don't?

mrutt92 commented 4 years ago

but you know what? I agree that this doesn't seem like a priority.

dpetrisko commented 4 years ago

Is it always possible to know in hardware that an EVA is bad before it goes on the network? Is it always possible in software?

tommydcjung commented 4 years ago

For now, accessing invalid space will trigger an assertion error, and that is good enough.

dpetrisko commented 4 years ago

One thing to consider for the future: with a hardware EVA-NPA mechanism, there will need to be a mechanism for faults. Most likely a host packet, although I’m sure there are other solutions.

drichmond commented 4 years ago

Perhaps not relevant to this discussion, but I believe that an EVA access off-network does not currently trigger an assertion. Sasha is trying to confirm this.

Perhaps a solution might be to put a module on the edge that spits out an error packet to the origin when it gets a packet.

Smaller cost
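A rough behavioral model of that edge module, assuming a simplified packet with source and destination coordinates. The struct `pkt_t`, its fields, and `edge_bounce` are hypothetical and do not reflect the real bsg_manycore packet format or network protocol.

```c
#include <stdint.h>

/* Hypothetical packet fields; NOT the real bsg_manycore packet format. */
typedef struct {
    uint8_t  src_x, src_y; /* coordinates of the tile that sent the request */
    uint8_t  dst_x, dst_y; /* coordinates the request was routed toward */
    uint32_t addr;         /* address carried by the request */
    uint8_t  is_error;     /* 1 marks an error response */
} pkt_t;

/* Any request that reaches the edge is turned around: source and
   destination are swapped and the packet is flagged as an error. */
static inline pkt_t edge_bounce(pkt_t req) {
    pkt_t rsp = req;
    rsp.dst_x = req.src_x;
    rsp.dst_y = req.src_y;
    rsp.src_x = req.dst_x;
    rsp.src_y = req.dst_y;
    rsp.is_error = 1;
    return rsp;
}
```

Because the logic lives only on the edge of the array, the cost scales with the perimeter rather than with the number of tiles.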

drichmond commented 4 years ago

Admittedly, my current suggestion breaks the manycore network protocol

tommydcjung commented 4 years ago

https://github.com/bespoke-silicon-group/bsg_manycore/blob/master/v/bsg_manycore_link_sif_tieoff.v#L69 I don't believe that is true, but I see that it should be rewritten as an assert.