Open jmulsy opened 11 months ago
Thanks for your interest in our work. Could you please paste a more complete log of the code run?
Thank you for your reply. Here are my questions:
First, to use Wandb offline, I added this code to train.py:
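The snippet referred to here is not preserved in the thread. As a minimal sketch only, one common way to put Wandb into offline mode from Python is to set the `WANDB_MODE` environment variable before the run is initialized (the log output further down mentions the same variable). This is an assumption about the approach, not the code that was actually added to train.py.

```python
import os

# Assumption: this runs near the top of train.py, before wandb.init() or any
# logger that creates a W&B run. The poster's real snippet is not shown.
os.environ["WANDB_MODE"] = "offline"  # runs are then stored locally under ./wandb
```

Runs recorded this way can be uploaded later with `wandb sync`, as the log output itself suggests.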
When I ran the following command to start training:
python train.py configs/shapenet/train_3k_noise.yaml
the terminal showed:
09-29 18:58:24 (train.py:72) [INFO] Intelligent GPU selection: 0
wandb: WARNING `resume` will be ignored since W&B syncing is set to `offline`. Starting a new run with run id 6zddnqd7.
wandb: Tracking run with wandb version 0.15.10
wandb: W&B syncing is set to `offline` in this directory.
wandb: Run `wandb online` or set WANDB_MODE=online to enable cloud syncing.
Global seed set to 0
/home/lisy/anaconda3/envs/nksr/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:447: LightningDeprecationWarning: Setting `Trainer(gpus=1)` is deprecated in v1.7 and will be removed in v2.0. Please use `Trainer(accelerator='gpu', devices=1)` instead.
rank_zero_deprecation(
Auto select gpus: [0]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
>>>> ======= MODEL HYPER-PARAMETERS ======= <<<<
exec: null
include: null
visualize: false
test_set_shuffle: false
...
...
...
...
...
...
random_seed: fixed
_shapenet_transforms:
- name: PointcloudNoise
  args:
    stddev: 0.005
- name: SubsamplePointcloud
  args:
    'N': 3000
>>>> ====================================== <<<<
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
09-29 18:58:38 (train.py:316) [INFO]
Wandb Run of nkfw-shapenet/6zddnqd7 (with name noise_3k/0929-big-vehicle) marked to be cleared.
wandb: Waiting for W&B process to finish... (failed -1).
wandb: You can sync this run to the cloud by running:
wandb: wandb sync /home/lisy/NKSR/wandb/offline-run-20230929_185825-6zddnqd7
wandb: Find logs at: ./wandb/offline-run-20230929_185825-6zddnqd7/logs
Then I ran 'wandb sync', and the terminal showed:
Find logs at: /home/lisy/NKSR/wandb/debug-cli.lisy.log
Syncing: https://wandb.ai/lisy0408/nkfw-shapenet/runs/6zddnqd7 ... done.
Lastly, I opened the Wandb link, and the result was as follows:
When I disable offline mode, the same failure appears:
- name: PointcloudNoise
  args:
    stddev: 0.005
- name: SubsamplePointcloud
  args:
    'N': 3000
>>>> ====================================== <<<<
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
09-29 19:12:25 (train.py:316) [INFO]
Wandb Run of lisy0408/nkfw-shapenet/mud4kkdq (with name noise_3k/0929-grave-angle) marked to be cleared.
wandb: Waiting for W&B process to finish... (failed -1). Press Control-C to abort syncing.
wandb: 🚀 View run noise_3k/0929-grave-angle at: https://wandb.ai/lisy0408/nkfw-shapenet/runs/mud4kkdq
wandb: ️⚡ View job at https://wandb.ai/lisy0408/nkfw-shapenet/jobs/QXJ0aWZhY3RDb2xsZWN0aW9uOjEwMDY5MTM5OA==/version_details/v3
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20230929_191209-mud4kkdq/logs
Hello, and thank you for your outstanding work. I have a problem and need your help. Whenever I train the model with Wandb, whether online or offline, this message always appears: "wandb: Waiting for W&B process to finish... (failed -1)". I don't know what causes it. Could you give me some suggestions?