cellarium-ai / cellarium-ml

Distributed single-cell data analysis.
BSD 3-Clause "New" or "Revised" License

RuntimeError: Expected a 'cuda' device type for generator but found 'cpu' when training logistic regression #190

Status: Closed (fedorgrab closed this issue 3 months ago)

fedorgrab commented 3 months ago

Trace log:

jupyter@vm-11232:~/src$ cellarium-ml logistic_regression fit --config config.yml 
/home/jupyter/.local/lib/python3.10/site-packages/lightning/fabric/utilities/seed.py:40: No seed found, seed set to 0
Seed set to 0
/home/jupyter/src/cellarium-ml/cellarium/ml/utilities/data.py:159: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:206.)
  collated_batch[key] = torch.cat([torch.from_numpy(data[key]) for data in batch], dim=0)
/home/jupyter/.local/lib/python3.10/site-packages/lightning/pytorch/utilities/parsing.py:199: Attribute 'model' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['model'])`.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Missing logger folder: curriculum/human_10x_only_2024_05_16_ingest/models/logistic_regression_log1p_v1_test/lightning_logs
Traceback (most recent call last):
  File "/home/jupyter/.local/bin/cellarium-ml", line 8, in <module>
    sys.exit(main())
  File "/home/jupyter/src/cellarium-ml/cellarium/ml/cli.py", line 591, in main
    model_cli(args)  # run the model
  File "/home/jupyter/src/cellarium-ml/cellarium/ml/cli.py", line 418, in logistic_regression
    cli(args=args)
  File "/home/jupyter/src/cellarium-ml/cellarium/ml/cli.py", line 271, in __init__
    super().__init__(
  File "/home/jupyter/.local/lib/python3.10/site-packages/lightning/pytorch/cli.py", line 388, in __init__
    self._run_subcommand(self.subcommand)
  File "/home/jupyter/.local/lib/python3.10/site-packages/lightning/pytorch/cli.py", line 679, in _run_subcommand
    fn(**fn_kwargs)
  File "/home/jupyter/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "/home/jupyter/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/jupyter/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/jupyter/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 951, in _run
    call._call_configure_model(self)
  File "/home/jupyter/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 109, in _call_configure_model
    _call_lightning_module_hook(trainer, "configure_model")
  File "/home/jupyter/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 157, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/home/jupyter/src/cellarium-ml/cellarium/ml/core/module.py", line 104, in configure_model
    model.reset_parameters()
  File "/home/jupyter/src/cellarium-ml/cellarium/ml/models/logistic_regression.py", line 78, in reset_parameters
    self.W_gc.data.normal_(0, self.W_init_scale, generator=rng)
  File "/home/jupyter/.local/lib/python3.10/site-packages/torch/utils/_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
  File "/home/jupyter/.local/lib/python3.10/site-packages/lightning/fabric/utilities/init.py", line 51, in __torch_function__
    return func(*args, **kwargs)
RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'
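
A note on the likely cause, as a sketch rather than the fix that was actually merged: the traceback ends in `reset_parameters`, where `normal_` is called on `W_gc` with `generator=rng`. The error message implies that `W_gc` lives on the GPU at that point while `rng` is a CPU generator. PyTorch requires the generator passed to an in-place sampling op to be on the same device as the tensor. One way to avoid the mismatch is to build the generator on the tensor's own device. The function name `reset_parameters_sketch` and its `seed` argument below are hypothetical and not taken from cellarium-ml.

```python
import torch


def reset_parameters_sketch(weight: torch.Tensor, init_scale: float, seed: int = 0) -> None:
    """Fill `weight` in place with N(0, init_scale) samples.

    Hypothetical helper for illustration; not the cellarium-ml implementation.
    """
    # Create the generator on the same device as the tensor being initialized,
    # so a CUDA tensor gets a CUDA generator and a CPU tensor gets a CPU one.
    rng = torch.Generator(device=weight.device)
    rng.manual_seed(seed)
    # no_grad lets this work on nn.Parameter objects that require gradients.
    with torch.no_grad():
        weight.normal_(0, init_scale, generator=rng)


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    w = torch.empty(3, 4, device=device)
    reset_parameters_sketch(w, init_scale=0.1, seed=0)
    print(w.device, w.shape)
```

An alternative under the same assumption is to sample on the CPU with a CPU generator and copy the result onto the parameter's device. That keeps the initial values identical across device types, at the cost of a host-to-device copy.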