To reproduce: I followed the YouTube tutorial and the training guide, and got this error:
```
raise RuntimeError("Cowardly refusing to serialize non-leaf tensor which requires_grad, "
RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries. If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).
```
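For context, here is a minimal sketch of what I believe triggers this error (my assumption about the cause, not code from piper_train): PyTorch refuses to serialize a non-leaf tensor that still requires grad, since the autograd graph cannot cross process boundaries, and `detach()` is the workaround the message suggests.

```python
# Sketch (assumption): reproduce the "Cowardly refusing to serialize" error
# and show the detach() fix the message recommends.
import torch
from torch.multiprocessing.reductions import reduce_tensor

x = torch.ones(2, requires_grad=True) * 2       # non-leaf: result of an op

try:
    reduce_tensor(x)                            # what happens when such a tensor is queued
except RuntimeError as err:
    print("refused:", err)                      # "Cowardly refusing to serialize non-leaf tensor ..."

safe = x.detach()                               # detach() gives a grad-free leaf view
assert safe.is_leaf and not safe.requires_grad  # now safe to send across processes
```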
Here is the command:

```shell
python3 -m piper_train \
  --dataset-dir /mnt/c/Users/Code/Desktop/AI\ Tools/piper/Model-Check-Point \
  --accelerator 'cpu' \
  --devices 4 \
  --batch-size 16 \
  --validation-split 0.0 \
  --num-test-examples 0 \
  --max_epochs 10000 \
  --resume_from_checkpoint /mnt/c/Users/Code/Desktop/AI\ Tools/piper/Model-Check-Point/epoch=2218-step=838782.ckpt \
  --checkpoint-epochs 1 \
  --precision 32 \
  --quality high
```
If I use `--devices 1`, I get this instead:
```
UserWarning: Be aware that when using `ckpt_path`, callbacks used to create the checkpoint need to be provided during `Trainer` instantiation. Please add the following callbacks: ["ModelCheckpoint{'monitor': None, 'mode': 'min', 'every_n_train_steps': 0, 'every_n_epochs': 1, 'train_time_interval': None}"].
  rank_zero_warn(
DEBUG:fsspec.local:open file: /mnt/c/Users/Code/Desktop/AI Tools/piper/Model-Check-Point/lightning_logs/version_9/hparams.yaml
Restored all states from the checkpoint file at /mnt/c/Users/Code/Desktop/AI Tools/piper/Model-Check-Point/epoch=2218-step=838782.ckpt
/root/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/utilities/data.py:153: UserWarning: Total length of `DataLoader` across ranks is zero. Please make sure this was your intention.
  rank_zero_warn(
/root/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:236: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument (try 4 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
Killed
```