Open tfogal opened 2 days ago
But what's the traceback? I don't see that error being raised anywhere in the thunder repo...
Yes, ditto. We think this is coming from the mixology scripts.
There's an internal thread with @wprazuch; stay tuned, we'll report back here. I can't seem to assign to @wprazuch (?) so assigning to me temporarily instead.
Thanks @tfogal for flagging this! Yes, that is functionality we introduced internally in our fork: there was a request to additionally benchmark FP8 TransformerEngine for lit-gpt, so we added it. I think this issue is not relevant to the main repository right now.
But since we are discussing this now: I could create a PR adding this functionality to the main repo, if you are interested in tracking and benchmarking FP8 as well. I wanted to do that some time ago, but de-prioritized it due to other tasks. Let me know what you think.
That would be great! Yes, I do think we should be tracking FP8 perf over time.
I will prepare the required changes then.
🚀 Feature
120 Mixology runs are failing due to:
Additional context