Overview
At a high level, our Shared Testing Harness works by processing a workload of pushes and pops as quickly as possible.
Benefits of the current setup.
We can use the resulting cycle count (total_time) to estimate the total time spent on our workload. A smaller total_time means a roughly faster queue!
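For instance, with made-up numbers (the clock period and cycle count below are purely illustrative assumptions, not measurements):

```python
# Hypothetical numbers, just to illustrate the estimate.
CLOCK_PERIOD_NS = 7          # assumed clock period, not a measured value
total_cycles = 2_000_000     # cycle count reported by the harness for one workload
total_time_ms = total_cycles * CLOCK_PERIOD_NS / 1e6   # ~14 ms for this run
```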
Drawbacks of the current setup.
This is an unrealistic depiction of the way switches process packets.
IRL, our queues can't look into the future and know the entire workload of pushes and pops at the start.
IRL, there may be times when our queue does nothing (when packets are in flight and it's not yet time to call pop).
However, since our test harness tries to process all pushes and pops as fast as possible, our tests have no idle time!
We remedy this by making a benchmarking harness that more closely models actual PCAPs. Broadly, we wish to do the following (a rough software sketch follows this list):
Fix a specific clock period for our queues.
Fix a specific line rate, i.e., the rate at which we call pop.
For each push in our workload, keep track of an "arrival time" for the associated packet.
Actually push a packet only once its arrival time has passed.
the hardware can do this by counting cycles, since we've fixed the clock period
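Here is a minimal software-only sketch of that gating logic. The names `run`, `CLOCK_PERIOD_NS`, and `POP_INTERVAL_CYCLES` are placeholders, and a plain `deque` stands in for whichever queue is under test; the real harness is a Calyx component that counts cycles in hardware.

```python
# Software model of the arrival-time-gated harness. CLOCK_PERIOD_NS and
# POP_INTERVAL_CYCLES are made-up placeholders; a plain deque stands in
# for the queue under test.
from collections import deque

CLOCK_PERIOD_NS = 7          # assumed clock period
POP_INTERVAL_CYCLES = 100    # assumed; derived from the chosen line rate

def run(workload):
    """workload: list of (arrival_cycle, packet) pairs, sorted by arrival_cycle."""
    queue, cycle, i = deque(), 0, 0
    while i < len(workload) or queue:
        # Push a packet only once its arrival cycle has passed.
        if i < len(workload) and workload[i][0] <= cycle:
            queue.append(workload[i][1])
            i += 1
        # Pop at the fixed line rate; if the queue is empty, this slot idles.
        if cycle % POP_INTERVAL_CYCLES == 0 and queue:
            queue.popleft()
        cycle += 1
    return cycle * CLOCK_PERIOD_NS  # estimated time to drain the whole workload
```

Unlike the current harness, this loop can sit idle for many cycles while it waits for the next arrival, which is exactly the behavior we want to model.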
Challenges with the new setup.
Benchmarking our queues becomes trickier: there's no longer a single number (total_time) we can use to compare designs.
Instead, we might consider some subset of the following:
generate graphs similar to those produced by our simulator
keep track of how often overflow/underflow occurs
Perhaps we can qualitatively compare queues with the help of these statistics (a post-processing sketch appears below).
We can no longer use this setup to check the correctness of our hardware.
the number of cycles spent on each push and pop now influences the order in which packets are popped
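For example, if the harness were to dump per-cycle statistics to a file, a small script could count overflow/underflow events and plot occupancy over time. The `stats.csv` name and its `cycle`/`occupancy`/`event` columns below are assumptions, not a format we've settled on:

```python
# Post-processing sketch: counts overflow/underflow events and plots queue
# occupancy. Assumes a CSV with columns cycle,occupancy,event, where event
# is empty, "overflow", or "underflow"; this format is an assumption.
import csv
from collections import Counter
import matplotlib.pyplot as plt

cycles, occupancy, events = [], [], Counter()
with open("stats.csv") as f:
    for row in csv.DictReader(f):
        cycles.append(int(row["cycle"]))
        occupancy.append(int(row["occupancy"]))
        if row["event"]:
            events[row["event"]] += 1

print(f"overflows: {events['overflow']}, underflows: {events['underflow']}")

plt.plot(cycles, occupancy)
plt.xlabel("cycle")
plt.ylabel("queue occupancy")
plt.savefig("occupancy.png")
```

The occupancy plot is one candidate for the "graphs similar to those produced by our simulator" mentioned above.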
Plan
[ ] Write a script to parse PCAPs and generate a .data file (a rough sketch appears below).
The data file should include the following:
commands, values, ans_mem as usual
arrival_cycles, to keep track of the packet's arrival time for each push
mac_addrs, to keep track of the packet's source for each push; we'll use this for flow inference
[ ] Make a Calyx component, similar to queue_call.py, to repeatedly invoke our queue.
[ ] Generate graphs for our queues in the style of Formal Abstractions and our simulator.
These subtasks will likely be expanded upon or tweaked as I work through them.
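To make the first subtask concrete, here is a rough sketch of the PCAP-parsing step. It uses scapy, a made-up clock period, and an assumed command encoding (PUSH = 2); it encodes every packet as a push and uses the packet's length as a stand-in value. The actual .data layout will need to match whatever the harness expects.

```python
# Sketch of the PCAP -> .data step. scapy, the clock period, and the
# push command code are assumptions; ans_mem is left for an oracle run.
import json
from scapy.all import rdpcap, Ether

CLOCK_PERIOD_NS = 7   # assumed clock period
PUSH = 2              # assumed command encoding for "push"

def pcap_to_data(pcap_path):
    pkts = rdpcap(pcap_path)
    t0 = float(pkts[0].time)  # timestamp of the first packet, in seconds
    commands, values, arrival_cycles, mac_addrs = [], [], [], []
    for pkt in pkts:
        elapsed_ns = (float(pkt.time) - t0) * 1e9
        commands.append(PUSH)
        values.append(len(pkt))  # stand-in payload: the packet's length
        arrival_cycles.append(int(elapsed_ns // CLOCK_PERIOD_NS))
        # Source MAC address, for flow inference later on.
        mac_addrs.append(pkt[Ether].src if Ether in pkt else None)
    return {
        "commands": commands,
        "values": values,
        "arrival_cycles": arrival_cycles,
        "mac_addrs": mac_addrs,
    }

if __name__ == "__main__":
    print(json.dumps(pcap_to_data("trace.pcap"), indent=2))
```

ans_mem is omitted here on purpose: per the challenges above, this setup can no longer serve as a strict correctness check.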