AI-Hypercomputer / tpu-recipes

Apache License 2.0

Add training instructions for Mixtral using PyTorch on Trillium #2

Closed: bhavya01 closed this 1 week ago

bhavya01 commented 1 week ago

Add instructions and artifacts to run Mixtral 8x7B with PyTorch on Trillium TPUs.

zpcore commented 1 week ago

Thanks, does the GCE instruction support multi slice?

bhavya01 commented 1 week ago

> Thanks, does the GCE instruction support multi slice?

I have never tried running Mixtral on multislice with GCE. It should be possible, though, using queued resources.