Open kemingy opened 1 year ago
Also https://www.usenix.org/conference/osdi22/presentation/yu
Originally posted by @VoVAllen in https://github.com/mosecorg/mosec/issues/382#issuecomment-1588622255
Although Orca couples the scheduler with the execution engine, there is still something we can learn from it.
GPT-like models can benefit from iteration-level scheduling in the following ways:
refer to:
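To make the idea of iteration-level scheduling concrete, here is a minimal toy simulation. It is only a sketch under simplifying assumptions: every request needs a known number of decode iterations (a stand-in for reaching EOS), each iteration costs one time unit, and prefill and KV-cache management are ignored. The names `Request`, `iteration_level_schedule`, and `request_level_schedule` are hypothetical and are not part of mosec or Orca.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    rid: int
    remaining: int  # decode iterations still needed (assumed >= 1)


def iteration_level_schedule(lengths, max_batch):
    """Return {request id: iteration at which it finished}."""
    waiting = deque(Request(i, n) for i, n in enumerate(lengths))
    running = []
    finished_at = {}
    step = 0
    while waiting or running:
        # Before every iteration, fill free batch slots from the queue.
        while waiting and len(running) < max_batch:
            running.append(waiting.popleft())
        step += 1
        # One model iteration: each running request generates one token.
        for req in running:
            req.remaining -= 1
        # Finished requests leave immediately, freeing their slots.
        still_running = []
        for req in running:
            if req.remaining == 0:
                finished_at[req.rid] = step
            else:
                still_running.append(req)
        running = still_running
    return finished_at


def request_level_schedule(lengths, max_batch):
    """Baseline: a batch runs until its longest request is done."""
    finished_at = {}
    step = 0
    for start in range(0, len(lengths), max_batch):
        batch = lengths[start:start + max_batch]
        step += max(batch)  # everyone waits for the slowest request
        for offset in range(len(batch)):
            finished_at[start + offset] = step
    return finished_at


if __name__ == "__main__":
    lengths = [2, 5, 1, 3]
    print(iteration_level_schedule(lengths, max_batch=2))
    print(request_level_schedule(lengths, max_batch=2))
```

With `lengths = [2, 5, 1, 3]` and `max_batch=2`, the last request finishes at iteration 6 under the iteration-level scheduler and at iteration 8 under the request-level baseline, because short requests no longer wait for the longest one in their batch.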