IGNF / myria3d

Myria3D: Aerial Lidar HD Semantic Segmentation with Deep Learning
https://ignf.github.io/myria3d/
BSD 3-Clause "New" or "Revised" License

A lone point cloud with a single point in a batch = training crash #83

Closed CharlesGaydon closed 9 months ago

CharlesGaydon commented 11 months ago

I first thought this was due to the default DataLoader configuration, in which drop_last=False. But the error does not occur at the end of the epoch. It occurs earlier, at 93% of the data (and according to the logs we are indeed in training_step).

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512])
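To make the message concrete, here is a plain-Python paraphrase of the check that raises it. It is a minimal sketch modelled on the `_verify_batch_size` frame visible in the full trace, not the library code itself; the helper names `values_per_channel` and `verify_batch_size` are hypothetical, and shapes are plain tuples rather than `torch.Size`.

```python
def values_per_channel(size):
    """Count the values batch norm sees per channel.

    Assumed shape convention: (batch, channels, *spatial).
    Hypothetical helper, written for illustration only.
    """
    count = size[0]
    for dim in size[2:]:
        count *= dim
    return count


def verify_batch_size(size):
    # Paraphrase of the training-time check: with one value per channel
    # the batch variance is degenerate, so the input is refused.
    if values_per_channel(size) == 1:
        raise ValueError(
            f"Expected more than 1 value per channel when training, got input size {size}"
        )


verify_batch_size((2, 512))  # two points reach the norm layer: accepted

try:
    verify_batch_size((1, 512))  # a single point reaches the norm layer: refused
except ValueError as err:
    print(err)
```

Read this way, the shape [1, 512] means that a single point with 512 features reached the normalization layer during training.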

The size [1, 512] does, however, suggest a combination of:

Even without an explanation of why the situation arises here, it can occur on normal data that is tiled on the fly, so it is worth addressing.

~Alternatively: in this situation, the block that does not accept a single point is mlp_summit. Keeping at least two points in the decimation that directly precedes mlp_summit may be a more efficient solution. In that case, it happens in these lines:~

~We go from~

    # Decimation should not empty clouds completely.
    decimated_bincount = torch.max(
        torch.ones_like(decimated_bincount), decimated_bincount
    )

~to~

    # Decimation should not empty clouds completely.
    decimated_bincount = torch.max(
        2 * torch.ones_like(decimated_bincount), decimated_bincount
    )

~with no impact on the normal behavior of the model.~
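For readers who want to see what the struck-through proposal would change, here is a plain-Python sketch of the clamp. It rests on assumptions: `decimated_counts` is a hypothetical helper, the per-cloud count before clamping is taken to be a ceiling division by the decimation factor, and the cloud sizes are made up.

```python
def decimated_counts(points_per_cloud, decimation, floor):
    """Number of points kept per cloud after decimation, clamped to `floor`.

    Assumption: the unclamped count is ceil(n / decimation).
    """
    counts = []
    for n in points_per_cloud:
        kept = -(-n // decimation)  # ceiling division on integers
        counts.append(max(floor, kept))
    return counts


clouds = [1024, 3, 1]  # made-up cloud sizes
print(decimated_counts(clouds, 4, floor=1))  # [256, 1, 1]
print(decimated_counts(clouds, 4, floor=2))  # [256, 2, 2]
```

Large clouds are unaffected by the higher floor. For the one-point cloud, a floor of 2 asks for more points than the cloud contains, so the sampling step would have to repeat a point; this sketch does not establish whether the real code tolerates that.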

Full trace:

Epoch 43:  93%|█████████▎| 1396/1501 [21:46<01:38,  1.07it/s, loss=0.183, v_num=d139, train/iou_step=0.687, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43:  93%|█████████▎| 1396/1501 [21:46<01:38,  1.07it/s, loss=0.186, v_num=d139, train/iou_step=0.424, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43:  93%|█████████▎| 1397/1501 [21:47<01:37,  1.07it/s, loss=0.186, v_num=d139, train/iou_step=0.424, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43:  93%|█████████▎| 1397/1501 [21:47<01:37,  1.07it/s, loss=0.183, v_num=d139, train/iou_step=0.547, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43:  93%|█████████▎| 1398/1501 [21:48<01:36,  1.07it/s, loss=0.183, v_num=d139, train/iou_step=0.547, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43:  93%|█████████▎| 1398/1501 [21:48<01:36,  1.07it/s, loss=0.184, v_num=d139, train/iou_step=0.535, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43:  93%|█████████▎| 1399/1501 [21:49<01:35,  1.07it/s, loss=0.184, v_num=d139, train/iou_step=0.535, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43:  93%|█████████▎| 1399/1501 [21:49<01:35,  1.07it/s, loss=0.183, v_num=d139, train/iou_step=0.693, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43:  93%|█████████▎| 1400/1501 [21:49<01:34,  1.07it/s, loss=0.183, v_num=d139, train/iou_step=0.693, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43:  93%|█████████▎| 1400/1501 [21:49<01:34,  1.07it/s, loss=0.189, v_num=d139, train/iou_step=0.479, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43:  93%|█████████▎| 1401/1501 [21:51<01:33,  1.07it/s, loss=0.189, v_num=d139, train/iou_step=0.479, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43:  93%|█████████▎| 1401/1501 [21:51<01:33,  1.07it/s, loss=0.18, v_num=d139, train/iou_step=0.394, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Error executing job with overrides: ['task.task_name=fit', 'datamodule.hdf5_file_path=/var/data/CGaydon/myria3d_datasets/20230727_75km2_diverse.hdf5', 'dataset_description=20230601_lidarhd_pacasam_dataset', 'datamodule.tile_width=50', 'experiment=RandLaNet_base_run_FR-MultiGPU', 'logger.comet.experiment_name=20230727_75km2_diverse-2GPUS', 'trainer.gpus=[0,1]', 'trainer.min_epochs=300', 'trainer.max_epochs=300']
Traceback (most recent call last):
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
    self._dispatch()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
    self.training_type_plugin.start_training(self)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
    self._results = trainer.run_stage()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1289, in run_stage
    return self._run_train()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1319, in _run_train
    self.fit_loop.run()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 234, in advance
    self.epoch_loop.run(data_fetcher)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 193, in advance
    batch_output = self.batch_loop.run(batch, batch_idx)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
    outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 215, in advance
    result = self._run_optimization(
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 266, in _run_optimization
    self._optimizer_step(optimizer, opt_idx, batch_idx, closure)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 378, in _optimizer_step
    lightning_module.optimizer_step(
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/core/lightning.py", line 1652, in optimizer_step
    optimizer.step(closure=optimizer_closure)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/core/optimizer.py", line 164, in step
    trainer.accelerator.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 336, in optimizer_step
    self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 163, in optimizer_step
    optimizer.step(closure=closure, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/optim/optimizer.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/optim/adam.py", line 100, in step
    loss = closure()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 148, in _wrap_closure
    closure_result = closure()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 160, in __call__
    self._result = self.closure(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 142, in closure
    step_output = self._step_fn()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 435, in _training_step
    training_step_output = self.trainer.accelerator.training_step(step_kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 216, in training_step
    return self.training_type_plugin.training_step(*step_kwargs.values())
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/ddp.py", line 439, in training_step
    return self.model(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 963, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/overrides/base.py", line 81, in forward
    output = self.module.training_step(*inputs, **kwargs)
  File "/home/CGaydon/repositories/myria3d/myria3d/models/model.py", line 139, in training_step
    targets, logits = self.forward(batch)
  File "/home/CGaydon/repositories/myria3d/myria3d/models/model.py", line 93, in forward
    logits = self.model(batch.x, batch.pos, batch.batch, batch.ptr)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/repositories/myria3d/myria3d/models/modules/pyg_randla_net.py", line 73, in forward
    self.mlp_summit(b4_out_decimated[0]),
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch_geometric/nn/models/mlp.py", line 186, in forward
    x = norm(x)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch_geometric/nn/norm/batch_norm.py", line 45, in forward
    return self.module(x)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/batchnorm.py", line 168, in forward
    return F.batch_norm(
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/functional.py", line 2419, in batch_norm
    _verify_batch_size(input.size())
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/functional.py", line 2387, in _verify_batch_size
    raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512])

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/CGaydon/repositories/myria3d/run.py", line 121, in <module>
    launch_train()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/main.py", line 48, in decorated_main
    _run_hydra(
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/_internal/utils.py", line 377, in _run_hydra
    run_and_report(
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/_internal/utils.py", line 214, in run_and_report
    raise ex
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
    return func()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/_internal/utils.py", line 378, in <lambda>
    lambda: hydra.run(
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 111, in run
    _ = ret.return_value
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/core/utils.py", line 233, in return_value
    raise self._return_value
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/core/utils.py", line 160, in run_job
    ret.return_value = task_function(task_cfg)
  File "/home/CGaydon/repositories/myria3d/run.py", line 57, in launch_train
    return train(config)
  File "/home/CGaydon/repositories/myria3d/myria3d/train.py", line 143, in train
    trainer.fit(model=model, datamodule=datamodule, ckpt_path=config.model.ckpt_path)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 740, in fit
    self._call_and_handle_interrupt(
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 698, in _call_and_handle_interrupt
    self.training_type_plugin.reconciliate_processes(traceback.format_exc())
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/ddp.py", line 533, in reconciliate_processes
    raise DeadlockDetectedException(f"DeadLock detected from rank: {self.global_rank} \n {trace}")
pytorch_lightning.utilities.exceptions.DeadlockDetectedException: DeadLock detected from rank: 1 
 Traceback (most recent call last):
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
    self._dispatch()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
    self.training_type_plugin.start_training(self)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
    self._results = trainer.run_stage()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1289, in run_stage
    return self._run_train()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1319, in _run_train
    self.fit_loop.run()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 234, in advance
    self.epoch_loop.run(data_fetcher)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 193, in advance
    batch_output = self.batch_loop.run(batch, batch_idx)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
    outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 215, in advance
    result = self._run_optimization(
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 266, in _run_optimization
    self._optimizer_step(optimizer, opt_idx, batch_idx, closure)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 378, in _optimizer_step
    lightning_module.optimizer_step(
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/core/lightning.py", line 1652, in optimizer_step
    optimizer.step(closure=optimizer_closure)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/core/optimizer.py", line 164, in step
    trainer.accelerator.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 336, in optimizer_step
    self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 163, in optimizer_step
    optimizer.step(closure=closure, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/optim/optimizer.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/optim/adam.py", line 100, in step
    loss = closure()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 148, in _wrap_closure
    closure_result = closure()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 160, in __call__
    self._result = self.closure(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 142, in closure
    step_output = self._step_fn()
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 435, in _training_step
    training_step_output = self.trainer.accelerator.training_step(step_kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 216, in training_step
    return self.training_type_plugin.training_step(*step_kwargs.values())
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/ddp.py", line 439, in training_step
    return self.model(*args, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 963, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/overrides/base.py", line 81, in forward
    output = self.module.training_step(*inputs, **kwargs)
  File "/home/CGaydon/repositories/myria3d/myria3d/models/model.py", line 139, in training_step
    targets, logits = self.forward(batch)
  File "/home/CGaydon/repositories/myria3d/myria3d/models/model.py", line 93, in forward
    logits = self.model(batch.x, batch.pos, batch.batch, batch.ptr)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/repositories/myria3d/myria3d/models/modules/pyg_randla_net.py", line 73, in forward
    self.mlp_summit(b4_out_decimated[0]),
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch_geometric/nn/models/mlp.py", line 186, in forward
    x = norm(x)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch_geometric/nn/norm/batch_norm.py", line 45, in forward
    return self.module(x)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/batchnorm.py", line 168, in forward
    return F.batch_norm(
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/functional.py", line 2419, in batch_norm
    _verify_batch_size(input.size())
  File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/functional.py", line 2387, in _verify_batch_size
    raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512])

CharlesGaydon commented 11 months ago

Topic 1: why is there only one sample in this training batch? -> We add drop_last just in case. -> The training error disappears 🥳
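As a reminder of what drop_last changes, here is a small arithmetic sketch in plain Python. The helper `batch_sizes` and the numbers (65 samples, batch size 32) are hypothetical and only illustrate how a leftover batch of size 1 can appear; they are not taken from this run.

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """Sizes of the batches a sequential loader would yield over one epoch."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)  # incomplete final batch is kept
    return sizes


print(batch_sizes(65, 32))                  # [32, 32, 1]
print(batch_sizes(65, 32, drop_last=True))  # [32, 32]
```

With drop_last=True the incomplete final batch is discarded, so a batch holding a single cloud can no longer come from the remainder of the dataset.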

Topic 2: how do we keep the "predict a single cloud containing a single point" feature, a situation that can occur at inference? -> In fact, we had forgotten the MinimumNumNodes transform. It is supposed to duplicate points to avoid errors inside the model. BUT: now that we accept point clouds with num_nodes=1, we hit an edge case of the subsample_data function used by MinimumNumNodes. The condition and item.size(0) != 1: is reached, which prevents the transform from being applied! -> This condition has no reason to exist here: it was added to FixedPoints to handle a different case, following this discussion.
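The edge case can be reproduced in plain Python. This is a sketch under stated assumptions, not the project's code: clouds are dicts of lists instead of tensors, `subsample` and `minimum_num_nodes` are hypothetical stand-ins for subsample_data and MinimumNumNodes, the `skip_single` flag stands in for the and item.size(0) != 1 condition, and points are repeated cyclically where the real transform may sample differently.

```python
def subsample(cloud, choice, skip_single=True):
    """Index every per-node attribute of `cloud` with `choice`.

    `skip_single=True` mimics the guard that leaves attributes of
    length 1 untouched.
    """
    num_nodes = len(cloud["pos"])
    out = {}
    for key, item in cloud.items():
        is_per_node = len(item) == num_nodes
        if is_per_node and not (skip_single and len(item) == 1):
            out[key] = [item[i] for i in choice]
        else:
            out[key] = item
    return out


def minimum_num_nodes(cloud, num, skip_single=True):
    """Repeat points until the cloud has at least `num` nodes."""
    num_nodes = len(cloud["pos"])
    if num_nodes >= num:
        return cloud
    choice = [i % num_nodes for i in range(num)]  # cyclic repetition (assumed)
    return subsample(cloud, choice, skip_single=skip_single)


single = {"pos": [(0.0, 0.0, 0.0)], "x": [[0.5]]}
print(len(minimum_num_nodes(single, 4, skip_single=True)["pos"]))   # 1
print(len(minimum_num_nodes(single, 4, skip_single=False)["pos"]))  # 4
```

With the guard in place, the one-point cloud comes out unchanged, which matches the behavior described above. Without it, the point is duplicated up to the requested minimum.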

CharlesGaydon commented 9 months ago

Topic 2 will need to be resolved by removing the aforementioned condition and item.size(0) != 1.

CharlesGaydon commented 9 months ago

Not reproducible. Reopen if the problem occurs again.