dbl001 opened this issue 2 years ago
Hey @dbl001 How have you installed PyTorch and Lightning? And which version?
Note that your Python interpreter must run natively and not through Rosetta, otherwise it won't detect the M1 hardware correctly. If you are using conda for example, you can double check this by running
conda info
and your output should say something like
platform : osx-arm64
If it shows intel x86, then re-install the correct conda version with M1 support.
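A quick way to double-check from Python itself (a minimal sketch; under Rosetta 2 an x86_64 interpreter reports "x86_64" here, while a native Apple Silicon interpreter reports "arm64"):

import platform

# Native Apple Silicon interpreter -> "arm64"; interpreter running under Rosetta 2 -> "x86_64".
print(platform.machine())
print(platform.platform())  # e.g. "macOS-13.0-arm64-arm-64bit" on a native build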
cc: @justusschock :otter:
I am running on an iMac 27" (Intel) with an AMD GPU (not M1). Will "Lightning" support this configuration?
Since you don't have an M1, accelerator="mps" is not correct. If you want to use the AMD GPU, you need to install PyTorch with ROCm support. Select it here in the installation matrix (fifth row).
While I can't test it myself (don't have an AMD GPU), the expectation is that torch will detect it. The CUDA semantics in torch for AMD GPUs are the same, meaning torch.cuda.device_count() will return 1 for you.
So once you have pytorch installed with ROCm, you should be able to use
Trainer(accelerator="gpu", devices=1)
Again, can't verify but this is the expected case based on torch's documentation.
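For reference, a minimal sketch (untested here, and it assumes a Linux machine with a ROCm build of PyTorch installed) of how one would verify detection before handing the GPU to Lightning:

import torch
from pytorch_lightning import Trainer

# ROCm builds of PyTorch expose AMD GPUs through the regular torch.cuda API.
print(torch.version.hip)          # non-None only on a ROCm build
print(torch.cuda.is_available())  # expected True when the AMD GPU is visible
print(torch.cuda.device_count())  # expected 1 for a single AMD GPU

trainer = Trainer(accelerator="gpu", devices=1)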
ROCm only runs on Linux (if I'm not mistaken). I'm running macOS Ventura 13.0.1. "MPS" is currently working in the PyTorch 1.14 nightly builds as well as in TensorFlow (tensorflow-macos / tensorflow-metal).
I was interested in whether you will support the "MPS" (i.e. Metal) interface.
Thanks in advance.
@dbl001 I understand now what you mean. I couldn't find any official reference from PyTorch regarding support of MPS on AMD hardware. https://pytorch.org/docs/stable/notes/mps.html But there are some users reporting that it works.
If you @dbl001 or someone from the community has the hardware setup to test this, please feel free to send a PR with the necessary changes to Lightning to enable this. The main change probably needs to be in the availability check here: https://github.com/Lightning-AI/lightning/blob/32cf1faa07bf9b6d774cb724d4e35328bbf48b57/src/lightning_lite/accelerators/mps.py#L61-L66
https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/
https://developer.apple.com/metal/pytorch/
Here's one of the PyTorch "MPS" threads on GitHub:
General MPS op coverage tracking issue #77764
I have the hardware/software required to test this. 'PySR' uses pytorch_lightning. Here's what happened when I tried to run a model:
Does torch.backends.mps.is_available() return True on this machine?
If yes, could you try modifying the code that I posted in https://github.com/Lightning-AI/lightning/issues/15861#issuecomment-1338348087? The condition there probably needs to drop the platform.processor() in ("arm", "arm64") check. This isn't the proper fix, but at least you could then try to run the Trainer on the device (maybe).
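For concreteness, a sketch of the kind of relaxed check being discussed (illustrative only: the real is_available in lightning_lite/accelerators/mps.py also carries a torch-version guard, and mps_backend_usable is just a hypothetical name):

import torch

def mps_backend_usable() -> bool:
    # Trust torch's own MPS probe and drop the platform.processor() in ("arm", "arm64")
    # condition, so an AMD GPU on an Intel Mac is not filtered out.
    return getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available()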
After adjusting the code as per your recommendation,
torch.backends.mps.is_available()
True
GPU available: True (mps), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/Users/davidlaxer/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/trainer/setup.py:200: UserWarning: MPS available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='mps', devices=1)`.
rank_zero_warn(
The model is training on the CPU. I do not see that the GPU is 'active' in the Activity Monitor.
I can try pytorch_lightning accessing the AMD GPU via MPS on other 'lightning' examples.
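For reference, the configuration the warning points at (only a sketch of the bare call; PySR constructs the Trainer internally, so in practice this would have to be routed through its settings):

from pytorch_lightning import Trainer

# Select the MPS accelerator explicitly instead of letting the default fall back to CPU.
trainer = Trainer(accelerator="mps", devices=1)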
I tried the 'BERT' model ...
seed_everything(42)
dm = GLUEDataModule(model_name_or_path="albert-base-v2", task_name="cola")
dm.setup("fit")
model = GLUETransformer(
    model_name_or_path="albert-base-v2",
    num_labels=dm.num_labels,
    eval_splits=dm.eval_splits,
    task_name=dm.task_name,
)
trainer = Trainer(
    max_epochs=1,
    accelerator="auto",
    devices=1 if torch.backends.mps.is_available() else None,  # limiting for iPython runs
)
trainer.fit(model, datamodule=dm)
Global seed set to 42
Found cached dataset glue (/Users/davidlaxer/.cache/huggingface/datasets/glue/cola/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)
100%|██████████| 3/3 [00:00<00:00, 420.47it/s]
Loading cached processed dataset at /Users/davidlaxer/.cache/huggingface/datasets/glue/cola/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad/cache-20965da3ce0503bd.arrow
Loading cached processed dataset at /Users/davidlaxer/.cache/huggingface/datasets/glue/cola/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad/cache-7d22f08182fc38e7.arrow
0%| | 0/2 [00:00<?, ?ba/s]/Users/davidlaxer/anaconda3/envs/pysr/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:2304: FutureWarning: The `pad_to_max_length` argument is deprecated and will be removed in a future version, use `padding=True` or `padding='longest'` to pad to the longest sequence in the batch, or use `padding='max_length'` to pad to a max length. In this case, you can give a specific length with `max_length` (e.g. `max_length=45`) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
warnings.warn(
100%|██████████| 2/2 [00:00<00:00, 38.85ba/s]
Some weights of the model checkpoint at albert-base-v2 were not used when initializing AlbertForSequenceClassification: ['predictions.bias', 'predictions.dense.bias', 'predictions.LayerNorm.weight', 'predictions.decoder.weight', 'predictions.LayerNorm.bias', 'predictions.decoder.bias', 'predictions.dense.weight']
- This IS expected if you are initializing AlbertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing AlbertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of AlbertForSequenceClassification were not initialized from the model checkpoint at albert-base-v2 and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
---------------------------------------------------------------------------
MisconfigurationException Traceback (most recent call last)
Cell In [8], line 12
4 dm.setup("fit")
5 model = GLUETransformer(
6 model_name_or_path="albert-base-v2",
7 num_labels=dm.num_labels,
8 eval_splits=dm.eval_splits,
9 task_name=dm.task_name,
10 )
---> 12 trainer = Trainer(
13 max_epochs=1,
14 accelerator="auto",
15 devices=1 if torch.backends.mps.is_available() else None, # limiting got iPython runs
16 )
17 trainer.fit(model, datamodule=dm)
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/utilities/argparse.py:340, in _defaults_from_env_vars.<locals>.insert_env_defaults(self, *args, **kwargs)
337 kwargs = dict(list(env_variables.items()) + list(kwargs.items()))
339 # all args were already moved to kwargs
--> 340 return fn(self, **kwargs)
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py:408, in Trainer.__init__(self, logger, enable_checkpointing, callbacks, default_root_dir, gradient_clip_val, gradient_clip_algorithm, num_nodes, num_processes, devices, gpus, auto_select_gpus, tpu_cores, ipus, enable_progress_bar, overfit_batches, track_grad_norm, check_val_every_n_epoch, fast_dev_run, accumulate_grad_batches, max_epochs, min_epochs, max_steps, min_steps, max_time, limit_train_batches, limit_val_batches, limit_test_batches, limit_predict_batches, val_check_interval, log_every_n_steps, accelerator, strategy, sync_batchnorm, precision, enable_model_summary, num_sanity_val_steps, resume_from_checkpoint, profiler, benchmark, deterministic, reload_dataloaders_every_n_epochs, auto_lr_find, replace_sampler_ddp, detect_anomaly, auto_scale_batch_size, plugins, amp_backend, amp_level, move_metrics_to_cpu, multiple_trainloader_mode, inference_mode)
405 # init connectors
406 self._data_connector = DataConnector(self, multiple_trainloader_mode)
--> 408 self._accelerator_connector = AcceleratorConnector(
409 num_processes=num_processes,
410 devices=devices,
411 tpu_cores=tpu_cores,
412 ipus=ipus,
413 accelerator=accelerator,
414 strategy=strategy,
415 gpus=gpus,
416 num_nodes=num_nodes,
417 sync_batchnorm=sync_batchnorm,
418 benchmark=benchmark,
419 replace_sampler_ddp=replace_sampler_ddp,
420 deterministic=deterministic,
421 auto_select_gpus=auto_select_gpus,
422 precision=precision,
423 amp_type=amp_backend,
424 amp_level=amp_level,
425 plugins=plugins,
426 )
427 self._logger_connector = LoggerConnector(self)
428 self._callback_connector = CallbackConnector(self)
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:213, in AcceleratorConnector.__init__(self, devices, num_nodes, accelerator, strategy, plugins, precision, amp_type, amp_level, sync_batchnorm, benchmark, replace_sampler_ddp, deterministic, auto_select_gpus, num_processes, tpu_cores, ipus, gpus)
210 elif self._accelerator_flag == "gpu":
211 self._accelerator_flag = self._choose_gpu_accelerator_backend()
--> 213 self._set_parallel_devices_and_init_accelerator()
215 # 3. Instantiate ClusterEnvironment
216 self.cluster_environment: ClusterEnvironment = self._choose_and_init_cluster_environment()
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:547, in AcceleratorConnector._set_parallel_devices_and_init_accelerator(self)
543 self._tpu_cores = self._devices_flag if not self._tpu_cores else self._tpu_cores
545 self._set_devices_flag_if_auto_select_gpus_passed()
--> 547 self._devices_flag = accelerator_cls.parse_devices(self._devices_flag)
548 if not self._parallel_devices:
549 self._parallel_devices = accelerator_cls.get_parallel_devices(self._devices_flag)
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/accelerators/mps.py:48, in MPSAccelerator.parse_devices(devices)
45 @staticmethod
46 def parse_devices(devices: Union[int, str, List[int]]) -> Optional[List[int]]:
47 """Accelerator device parsing logic."""
---> 48 parsed_devices = _parse_gpu_ids(devices, include_mps=True)
49 return parsed_devices
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/lightning_lite/utilities/device_parser.py:104, in _parse_gpu_ids(gpus, include_cuda, include_mps)
101 # Check that GPUs are unique. Duplicate GPUs are not supported by the backend.
102 _check_unique(gpus)
--> 104 return _sanitize_gpu_ids(gpus, include_cuda=include_cuda, include_mps=include_mps)
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/lightning_lite/utilities/device_parser.py:136, in _sanitize_gpu_ids(gpus, include_cuda, include_mps)
134 for gpu in gpus:
135 if gpu not in all_available_gpus:
--> 136 raise MisconfigurationException(
137 f"You requested gpu: {gpus}\n But your machine only has: {all_available_gpus}"
138 )
139 return gpus
MisconfigurationException: You requested gpu: [0]
But your machine only has: []
This line in "lightning_lite/utilities/device_parser.py" is returning []:
mps_gpus = accelerators.mps._get_all_available_mps_gpus() if include_mps else []
It appears that accelerators.mps._get_all_available_mps_gpus() is returning an empty list:
import lightning_lite.accelerators as accelerators
accelerators.mps._get_all_available_mps_gpus()
[]
When you 'force' _get_all_available_mps_gpus to return [0], pytorch_lightning utilizes the AMD Radeon Pro 5700 XT GPU via 'MPS'. E.g.:
def _get_all_available_mps_gpus() -> List[int]:
    """
    Returns:
        A list of all available MPS GPUs
    """
    return [0]
    # return [0] if MPSAccelerator.is_available() else []
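A slightly less hard-coded variant of the same experiment (still only a sketch, not the proper fix) would key the helper off torch's own probe instead of the stricter MPSAccelerator.is_available(), whose platform.processor() condition is what rules out the Intel iMac:

from typing import List

import torch

def _get_all_available_mps_gpus() -> List[int]:
    # Assumption: there is at most one MPS device per machine, so the result is [0] or [].
    return [0] if torch.backends.mps.is_available() else []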
The 'lightning' BERT example runs until it gets an exception trying to convert to float64, which 'MPS' does not support. This would also happen on M1 and M2 hardware. E.g.:
TypeError Traceback (most recent call last)
Cell In [6], line 17
5 model = GLUETransformer(
6 model_name_or_path="albert-base-v2",
7 num_labels=dm.num_labels,
8 eval_splits=dm.eval_splits,
9 task_name=dm.task_name,
10 )
12 trainer = Trainer(
13 max_epochs=1,
14 accelerator="mps",
15 devices=1
16 )
---> 17 trainer.fit(model, datamodule=dm)
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py:582, in Trainer.fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
580 raise TypeError(f"`Trainer.fit()` requires a `LightningModule`, got: {model.__class__.__qualname__}")
581 self.strategy._lightning_module = model
--> 582 call._call_and_handle_interrupt(
583 self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
584 )
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py:38, in _call_and_handle_interrupt(trainer, trainer_fn, *args, **kwargs)
36 return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
37 else:
---> 38 return trainer_fn(*args, **kwargs)
40 except _TunerExitException:
41 trainer._call_teardown_hook()
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py:624, in Trainer._fit_impl(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
617 ckpt_path = ckpt_path or self.resume_from_checkpoint
618 self._ckpt_path = self._checkpoint_connector._set_ckpt_path(
619 self.state.fn,
620 ckpt_path, # type: ignore[arg-type]
621 model_provided=True,
622 model_connected=self.lightning_module is not None,
623 )
--> 624 self._run(model, ckpt_path=self.ckpt_path)
626 assert self.state.stopped
627 self.training = False
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py:1061, in Trainer._run(self, model, ckpt_path)
1057 self._checkpoint_connector.restore_training_state()
1059 self._checkpoint_connector.resume_end()
-> 1061 results = self._run_stage()
1063 log.detail(f"{self.__class__.__name__}: trainer tearing down")
1064 self._teardown()
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py:1140, in Trainer._run_stage(self)
1138 if self.predicting:
1139 return self._run_predict()
-> 1140 self._run_train()
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py:1163, in Trainer._run_train(self)
1160 self.fit_loop.trainer = self
1162 with torch.autograd.set_detect_anomaly(self._detect_anomaly):
-> 1163 self.fit_loop.run()
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py:199, in Loop.run(self, *args, **kwargs)
197 try:
198 self.on_advance_start(*args, **kwargs)
--> 199 self.advance(*args, **kwargs)
200 self.on_advance_end()
201 self._restarting = False
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py:267, in FitLoop.advance(self)
265 self._data_fetcher.setup(dataloader, batch_to_device=batch_to_device)
266 with self.trainer.profiler.profile("run_training_epoch"):
--> 267 self._outputs = self.epoch_loop.run(self._data_fetcher)
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py:200, in Loop.run(self, *args, **kwargs)
198 self.on_advance_start(*args, **kwargs)
199 self.advance(*args, **kwargs)
--> 200 self.on_advance_end()
201 self._restarting = False
202 except StopIteration:
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py:251, in TrainingEpochLoop.on_advance_end(self)
249 if should_check_val:
250 self.trainer.validating = True
--> 251 self._run_validation()
252 self.trainer.training = True
254 # update plateau LR scheduler after metrics are logged
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py:310, in TrainingEpochLoop._run_validation(self)
307 self.val_loop._reload_evaluation_dataloaders()
309 with torch.no_grad():
--> 310 self.val_loop.run()
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py:206, in Loop.run(self, *args, **kwargs)
203 break
204 self._restarting = False
--> 206 output = self.on_run_end()
207 return output
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py:180, in EvaluationLoop.on_run_end(self)
177 self.trainer._logger_connector.epoch_end_reached()
179 # hook
--> 180 self._evaluation_epoch_end(self._outputs)
181 self._outputs = [] # free memory
183 # hook
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py:288, in EvaluationLoop._evaluation_epoch_end(self, outputs)
286 # call the model epoch end
287 hook_name = "test_epoch_end" if self.trainer.testing else "validation_epoch_end"
--> 288 self.trainer._call_lightning_module_hook(hook_name, output_or_outputs)
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py:1305, in Trainer._call_lightning_module_hook(self, hook_name, pl_module, *args, **kwargs)
1302 pl_module._current_fx_name = hook_name
1304 with self.profiler.profile(f"[LightningModule]{pl_module.__class__.__name__}.{hook_name}"):
-> 1305 output = fn(*args, **kwargs)
1307 # restore current_fx when nested context
1308 pl_module._current_fx_name = prev_fx_name
Cell In [5], line 66, in GLUETransformer.validation_epoch_end(self, outputs)
64 loss = torch.stack([x["loss"] for x in outputs]).mean()
65 self.log("val_loss", loss, prog_bar=True)
---> 66 self.log_dict(self.metric.compute(predictions=preds, references=labels), prog_bar=True)
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/core/module.py:511, in LightningModule.log_dict(self, dictionary, prog_bar, logger, on_step, on_epoch, reduce_fx, enable_graph, sync_dist, sync_dist_group, add_dataloader_idx, batch_size, rank_zero_only)
477 """Log a dictionary of values at once.
478
479 Example::
(...)
508 would produce a deadlock as not all processes would perform this log call.
509 """
510 for k, v in dictionary.items():
--> 511 self.log(
512 name=k,
513 value=v,
514 prog_bar=prog_bar,
515 logger=logger,
516 on_step=on_step,
517 on_epoch=on_epoch,
518 reduce_fx=reduce_fx,
519 enable_graph=enable_graph,
520 sync_dist=sync_dist,
521 sync_dist_group=sync_dist_group,
522 add_dataloader_idx=add_dataloader_idx,
523 batch_size=batch_size,
524 rank_zero_only=rank_zero_only,
525 )
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/core/module.py:405, in LightningModule.log(self, name, value, prog_bar, logger, on_step, on_epoch, reduce_fx, enable_graph, sync_dist, sync_dist_group, add_dataloader_idx, batch_size, metric_attribute, rank_zero_only)
399 if "/dataloader_idx_" in name:
400 raise MisconfigurationException(
401 f"You called `self.log` with the key `{name}`"
402 " but it should not contain information about `dataloader_idx`"
403 )
--> 405 value = apply_to_collection(value, (torch.Tensor, numbers.Number), self.__to_tensor, name)
407 if self.trainer._logger_connector.should_reset_tensors(self._current_fx_name):
408 # if we started a new epoch (running its first batch) the hook name has changed
409 # reset any tensors for the new hook name
410 results.reset(metrics=False, fx=self._current_fx_name)
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/lightning_utilities/core/apply_func.py:47, in apply_to_collection(data, dtype, function, wrong_dtype, include_none, *args, **kwargs)
45 # Breaking condition
46 if isinstance(data, dtype) and (wrong_dtype is None or not isinstance(data, wrong_dtype)):
---> 47 return function(data, *args, **kwargs)
49 elem_type = type(data)
51 # Recursively apply to collection items
File ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/core/module.py:541, in LightningModule.__to_tensor(self, value, name)
537 def __to_tensor(self, value: Union[torch.Tensor, numbers.Number], name: str) -> Tensor:
538 value = (
539 value.clone().detach().to(self.device)
540 if isinstance(value, torch.Tensor)
--> 541 else torch.tensor(value, device=self.device)
542 )
543 if not torch.numel(value) == 1:
544 raise ValueError(
545 f"`self.log({name}, {value})` was called, but the tensor must have a single element."
546 f" You can try doing `self.log({name}, {value}.mean())`"
547 )
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
@dbl001 Thanks for investigating! So in summary, these are the changes we need:
1. We need to revise the availability check here:
https://github.com/Lightning-AI/lightning/blob/32cf1faa07bf9b6d774cb724d4e35328bbf48b57/src/lightning_lite/accelerators/mps.py#L61-L66
where the platform.processor() in ("arm", "arm64") condition is not general enough.
2. We need to update _get_all_available_mps_gpus to parse the (ROCm) "cuda" devices.
3. Still open for investigation is whether it would be possible to also use multiple GPUs.
Also, 'MPS' does not support 'torch.float64' tensors, so I had to change this line in 'module.py':
$ vi +541 ~/anaconda3/envs/pysr/lib/python3.9/site-packages/pytorch_lightning/core/module.py
else torch.tensor(value, device=self.device, dtype=torch.float32)
And ...
Training: 0it [00:00, ?it/s]
Validation: 0it [00:00, ?it/s]
Validation: 0it [00:00, ?it/s]
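A less invasive alternative (a sketch only, based on the tutorial's validation_epoch_end shown in the traceback above; to_float32 is a hypothetical helper, not part of Lightning) would be to cast the metric values to float32 in user code before logging, so module.py does not need to be patched:

import numbers
from typing import Dict

import torch

def to_float32(metrics: Dict[str, numbers.Number]) -> Dict[str, torch.Tensor]:
    # The GLUE metrics come back as numpy float64 scalars; cast them to float32
    # tensors so LightningModule.log_dict can place them on the MPS device.
    return {k: torch.tensor(v, dtype=torch.float32) for k, v in metrics.items()}

# Usage inside validation_epoch_end (preds and labels as in the original example):
#   self.log_dict(to_float32(self.metric.compute(predictions=preds, references=labels)), prog_bar=True)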
Previously reported? https://github.com/Lightning-AI/lightning/issues/5039
This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions - the Lightning Team!
🚀 Feature
MPS support on MacOS Ventura with an AMD Radeon Pro 5700 XT GPU
Motivation
MisconfigurationException: `MPSAccelerator` can not run on your system since the accelerator is not available. The following accelerator(s) is available and can be passed into `accelerator` argument of `Trainer`: ['cpu']. [...]
Pitch
trainer = Trainer(accelerator="mps", devices=1)
Alternatives
Additional context
cc @akihironitta @justusschock