Open awaelchli opened 1 year ago
AFAIK, only XLA's and Habana's strategies use this.
The only downside I see is that people unfamiliar with this detail of these 2 strategies who are reading a script (lit-llama generate.py for example) could get the idea that calling this is required in general.
On the other hand, having it under an if device.type == "xla"
explicitly shows that this is just for XLA support.
If done, I would strongly suggest that it is defined in the strategy. Fabric can add a shortcut for it of course.
cc @lantiga who also suggested this
This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions - the Lightning Team!
Description & Motivation
In #265, the inference code gets sped up by inserting the XLA mark_step() calls. Should Fabric provide an accelerator/strategy-agnostic method for that?
Pitch
It could be implemented directly like this in the Fabric class:
The convenience method enables you to call
fabric.mark_step()
directly in the loop and no code changes are required once switching to another accelerator or strategy.Alternatives
The above proposes to add the logic directly to the Fabric object since it is very simple and concerns only the XLA strategy and to keep the base strategy interface lean. If this turns out to generalize and be required for other accelerators in the future, we could then decide to add the interface to the strategy base class.
Additional context
No response
cc @borda @carmocca @justusschock @awaelchli