codes-org / codes

The Co-Design of Exascale Storage Architectures (CODES) simulation framework builds upon the ROSS parallel discrete event simulation engine to provide high-performance simulation utilities and models for building scalable distributed systems simulations.

model-net seemingly causing non-trivial performance overheads (Imported #50)

Closed nmcglo closed 4 years ago

nmcglo commented 8 years ago

Original Issue Author: Jonathan Jenkins Original Issue ID: 50 Original Issue URL: https://xgitlab.cels.anl.gov/codes/codes/issues/50


We've heard about this a few times now (Ning with the fattree, Misbah with the dragonfly model / MPI replay program, and potentially Misbah/Caitlin with "awesim" runs) - something is causing performance regressions (lower ROSS efficiency) when using model-net versus direct ROSS in optimistic mode.

It's unclear for the time being why this is happening. model-net imposes two extra events (sched-new, sched-next) per model_net_event call, the first of which is a remote event from the client (the original "packet event" that the client used to send directly is now a self-event). I would expect some degree of overhead from this, but nothing that would significantly affect the rollback rate / ROSS efficiency...
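As a back-of-envelope illustration of the overhead described above, the per-message event counts can be tallied under each scheme. This is a hypothetical accounting sketch based only on this comment, not on CODES source; the event names (sched-new, sched-next) come from the description.

```python
# Hypothetical accounting of events generated per client message, based on
# the description above (not actual CODES code).

def events_per_message(use_model_net):
    """Return (remote_events, self_events) generated per client message."""
    if not use_model_net:
        # Direct ROSS: the client remotely sends the packet event itself.
        return (1, 0)
    # model-net: sched-new is the remote event from the client; sched-next
    # and the (now self-sent) packet event are both self-events.
    return (1, 2)

direct_total = sum(events_per_message(False))
modelnet_total = sum(events_per_message(True))
print(modelnet_total - direct_total)  # 2 extra events per model_net_event call
```

Self-events are cheaper to roll back than remote events, which is why the extra events alone would not be expected to tank optimistic efficiency.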

We should keep this in the back of our minds while we are working on other things.

nmcglo commented 8 years ago

Jonathan Jenkins:

I reduced the event overhead by one event in the case of empty queues (which happens more often than one would think...). See #81. Not sure that's enough to solve the underlying problem, though...
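The shortcut described above can be sketched as follows. This is an illustrative sketch only, assuming (as the comment suggests) that an arrival at an empty scheduler queue can be issued immediately instead of going through a separate scheduling event; it does not reproduce the actual change in #81.

```python
# Illustrative sketch (assumed, not actual CODES logic) of the empty-queue
# shortcut: when the queue is empty on arrival, the packet is issued
# directly, saving one scheduling event.

def scheduling_events(queue_len, optimized):
    """Count events the scheduler issues when one packet arrives."""
    if optimized and queue_len == 0:
        return 1  # issue the packet immediately; no follow-up event needed
    return 2      # enqueue, then a separate event to dequeue and issue

print(scheduling_events(0, optimized=True))   # 1
print(scheduling_events(3, optimized=True))   # 2: non-empty queue, no saving
```

The saving applies only on the empty-queue path, which is why it helps more than expected when traffic is bursty and queues drain frequently.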

nmcglo commented 8 years ago

Jonathan Jenkins:

The dragonfly model has been optimized considerably: fewer events are issued internally, and the event processing logic no longer builds up the event queue arbitrarily. Once the same optimizations are applied to the torus, we'll close this ticket.

nmcglo commented 4 years ago

Ticket wasn't actually closed on Jan 14, 2016. Closing now.