calyxir / calyx

Intermediate Language (IL) for Hardware Accelerator Generators
https://calyxir.org
MIT License

Support `sync` without `std_sync_reg` #1333

Closed rachitnigam closed 1 year ago

rachitnigam commented 1 year ago

I realized that the FSMs generated by TDCC are probably sufficient to implement the synchronization that we implement using std_sync_reg. The high-level idea is that each thread in a par block gets its own FSM. A @sync in multiple different threads means that the FSMs of the corresponding threads must synchronize before continuing execution. This can be done by generating a transition condition that waits for all FSMs to reach the corresponding state before allowing any one of them to move forward.
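To make the idea concrete, here is a minimal Python sketch of two per-thread FSMs whose transition out of a sync state is guarded by "every FSM has reached its sync state". This is a cycle-level toy model, not TDCC output: the `ThreadFSM` class, the `run` function, and the group latencies are all assumptions made for illustration, and the barrier itself is assumed to cost one cycle.

```python
from dataclasses import dataclass


@dataclass
class ThreadFSM:
    # Each step is (name, latency_in_cycles); the name "sync" marks a barrier state.
    steps: list
    state: int = 0    # index of the current step
    elapsed: int = 0  # cycles already spent in the current step

    def at_sync(self):
        return self.state < len(self.steps) and self.steps[self.state][0] == "sync"

    def done(self):
        return self.state >= len(self.steps)


def run(threads, max_cycles=100):
    """Advance all FSMs in lockstep; return one tuple of active step names per cycle."""
    trace = []
    for _ in range(max_cycles):
        if all(t.done() for t in threads):
            break
        # Barrier guard, computed from the current FSM states before anyone moves.
        all_at_sync = all(t.at_sync() for t in threads)
        row = []
        for t in threads:
            if t.done():
                row.append("-")
                continue
            name, latency = t.steps[t.state]
            row.append(name)
            if name == "sync":
                # Leave the sync state only when every FSM has reached it.
                if all_at_sync:
                    t.state += 1
            else:
                t.elapsed += 1
                if t.elapsed == latency:
                    t.state += 1
                    t.elapsed = 0
        trace.append(tuple(row))
    return trace


# Hypothetical groups: thread A is slow before the barrier, thread B is fast.
thread_a = ThreadFSM([("one", 3), ("sync", 1), ("two", 1)])
thread_b = ThreadFSM([("three", 1), ("sync", 1), ("four", 2)])
trace = run([thread_a, thread_b])
```

In this model thread B idles in its sync state until thread A arrives, and neither `two` nor `four` starts before both have passed the barrier. The sketch deliberately has no loops, so it says nothing about how the barrier is reset between iterations.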

@paili0628 I think this would be a good thing to implement before we write up a paper on this stuff.

The PDF document summarizes this idea visually: Sync using FSMs.pdf

sampsyo commented 1 year ago

This is a pretty interesting idea. To state a couple of probably-obvious observations:

rachitnigam commented 1 year ago

> But I kinda like the idea of hypothetically supporting separately-compiled synchronizing code using the heavyweight fallback of a synchronizing register.

Hm, I think that could be interesting, but I'm not sure it needs to be language-level. That is, if you want this kind of synchronization, you probably want to put a FIFO between the two components and perform fine-grained synchronization.

rachitnigam commented 1 year ago

> it would be rad in a fantasy world if there were some way where the TDCC pass could expose a generic interface

Yeah, I had the exact same thought–is there some way to implement this separately from the TDCC code itself? In this case, it'd require exposing the FSM transitions in some structured manner.

sampsyo commented 1 year ago

> Hm, I think that could be interesting, but I'm not sure it needs to be language-level. That is, if you want this kind of synchronization, you probably want to put a FIFO between the two components and perform fine-grained synchronization.

Fair enough! This kind of facility is purely hypothetical and doesn't really have a use case. :smiley: Just food for thought.

paili0628 commented 1 year ago

Could there be a case where we have a problem resetting the barrier in the proposed mechanism?

paili0628 commented 1 year ago

I think there are a couple of things we should pay attention to before doing this:

  1. For the following program:

par {
  // thread A
  while lt.out {
    @Node_id(1) one;
    @sync(1);
    @Node_id(2) two;
  }  -> F1

  // thread B
  while lt.out {
    @Node_id(1) three;
    @sync(1);
  }  -> F2
}

Assume the FSM we generate looks like the following (using the notation from the PDF): s1 <- sync_1.done; s1 = (F1 == 1) & (F2 == 1)

In this case, the following could happen at each time slot:

    1. one and three run together.
    2. Threads A and B sync.
    3. two starts running for the first time; thread B starts its second iteration.
    4. two is still running for the first time and F1.out is still 1; thread B finishes running three, sees that s1 is high, and starts its third iteration.

The problem, if my understanding is accurate, is that with this FSM design F1.out stays at 1 until two[done] is asserted. So if the sync signal relies solely on the FSM registers (here F1 and F2) and two runs for a long time, thread B can start another iteration before two finishes. We could end up running multiple iterations of thread B while two is still running for the first time.

  2. Since last semester's syntax change, the @sync attribute can only be attached to empty control. Can TDCC handle that if the empty control is not eliminated beforehand?

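The race described in point 1 can be reproduced with a small cycle-level Python model. This is only a sketch under stated assumptions: `simulate` is a hypothetical helper, `one` and `three` are assumed to take one cycle each, the loop condition `lt.out` is assumed to stay true, and F1 is assumed to hold the value 1 for the entire time `two` is running, as described above.

```python
def simulate(two_latency, cycles):
    """Count completed loop iterations per thread under the naive guard
    s1 = (F1 == 1) & (F2 == 1)."""
    f1 = f2 = 0            # FSM registers: 0 = before the sync point, 1 = at or past it
    passed_a = False       # has thread A already passed the barrier this iteration?
    two_left = 0           # remaining cycles of `two`
    iters_a = iters_b = 0

    for _ in range(cycles):
        # Guard is computed from the register values at the start of the cycle.
        s1 = (f1 == 1) and (f2 == 1)

        # Thread A: one -> sync -> two (F1 stays at 1 while `two` runs).
        if f1 == 0:
            f1 = 1                       # `one` ran during this cycle
        elif not passed_a:
            if s1:
                passed_a = True          # barrier passed, `two` starts next cycle
                two_left = two_latency
        else:
            two_left -= 1
            if two_left == 0:            # two[done]: finish the iteration, reset F1
                f1 = 0
                passed_a = False
                iters_a += 1

        # Thread B: three -> sync.
        if f2 == 0:
            f2 = 1                       # `three` ran during this cycle
        elif s1:
            f2 = 0                       # barrier passed, iteration complete
            iters_b += 1

    return iters_a, iters_b


slow_two = simulate(two_latency=4, cycles=6)
fast_two = simulate(two_latency=1, cycles=6)
```

Under these assumptions, `slow_two` shows thread A completing one iteration while thread B completes three, because the guard keeps seeing F1 == 1 while `two` is still running. With a one-cycle `two` (`fast_two`) the threads stay in lockstep, so the issue only shows up when the work after the sync point is long relative to the other thread's loop body.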
rachitnigam commented 1 year ago

Hey @paili0628, you can edit the comments by clicking the three dots on the top right. The code is really hard to read, so can you try reformatting it a bit and explaining what you expect to happen, event by event?

paili0628 commented 1 year ago

@rachitnigam I updated my comment, plz take a look.

rachitnigam commented 1 year ago

I cleaned up the comment a bit more. Usually, you can use the code block syntax (three backticks) and click on the preview link to see what the final comment is going to look like. Take a look at the raw markdown by clicking on the edit button, and try to use code blocks with indenting in the future.

Couple of things:

One way to proceed would be to sketch out exactly what a compiled group and the corresponding hardware should look like. Once we have some concrete code to stare at, we can decide whether it will work.