In the summarization notebook, where/when do we set the device? Are parallel GPUs expected? Two things that could help: (i) specify where we could set the device and call `model.to(device)`, and (ii) explain where the model might expect data in parallel, e.g. how setting `batched=True` in the preprocessing affects things, or how `DataCollatorForSeq2Seq` expects its tensors.
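For point (i), I mean something like the following minimal sketch (the `Linear` module is just a stand-in for the notebook's `AutoModelForSeq2SeqLM`; the variable names are my own, not from the notebook):

```python
import torch

# Pick the device once, near the top of the notebook.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tiny stand-in for the seq2seq model loaded from a checkpoint.
model = torch.nn.Linear(4, 2)
model.to(device)  # .to() moves an nn.Module's parameters in place

# Batches produced by the data collator must live on the same device.
batch = torch.randn(8, 4).to(device)
out = model(batch)
print(tuple(out.shape))  # (8, 2)
```

Clarifying whether the `Trainer` handles this device placement automatically, or whether we are expected to call `model.to(device)` ourselves, would resolve most of the confusion.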