isl-org / Open3D-ML

An extension of Open3D to address 3D Machine Learning tasks

Checkpoint import for RandLANet and KPFCNN for s3dis not working #375

Closed jokokojote closed 2 years ago

jokokojote commented 3 years ago

I have problems using the pretrained weights for the s3dis dataset (both randlanet and kpconv).

Minimal example (including the downloads as given in model_zoo.md):

import open3d.ml.torch as ml3d # just switch to open3d.ml.tf for tf usage
import os
import sys
from os.path import exists, join, isfile, dirname, abspath, split

dir = os.path.dirname(os.path.realpath(__file__))

def get_torch_ckpts():
    kpconv_url = "https://storage.googleapis.com/open3d-releases/model-zoo/kpconv_s3dis_202010091238.pth"
    randlanet_url = "https://storage.googleapis.com/open3d-releases/model-zoo/randlanet_s3dis_202010091238.pth"

    ckpt_path_r = dir + "/vis_weights_{}.pth".format('randlanet_s3dis')
    if not exists(ckpt_path_r):
        cmd = "wget {} -O {}".format(randlanet_url, ckpt_path_r)
        os.system(cmd)

    ckpt_path_k = dir + "/vis_weights_{}.pth".format('kpconv_s3dis')
    if not exists(ckpt_path_k):
        cmd = "wget {} -O {}".format(kpconv_url, ckpt_path_k)
        print(cmd)
        os.system(cmd)

    return ckpt_path_r, ckpt_path_k

def get_tf_ckpts():
    kpconv_url = "https://storage.googleapis.com/open3d-releases/model-zoo/kpconv_s3dis_202010091238.zip"
    randlanet_url = "https://storage.googleapis.com/open3d-releases/model-zoo/randlanet_s3dis_202106011448utc.zip"

    ckpt_path_dir = dir + "/vis_weights_{}".format('randlanet_s3dis')
    if not exists(ckpt_path_dir):
        ckpt_path_zip = dir + "/vis_weights_{}.zip".format('randlanet_s3dis')
        cmd = "wget {} -O {}".format(randlanet_url, ckpt_path_zip)
        os.system(cmd)
        cmd = "unzip -j -o {} -d {}".format(ckpt_path_zip, ckpt_path_dir)
        os.system(cmd)
    ckpt_path_r = dir + "/vis_weights_{}/{}".format('randlanet_s3dis', 'ckpt-92')

    ckpt_path_dir = dir + "/vis_weights_{}".format('kpconv_s3dis')
    if not exists(ckpt_path_dir):
        ckpt_path_zip = dir + "/vis_weights_{}.zip".format('kpconv_s3dis')
        cmd = "wget {} -O {}".format(kpconv_url, ckpt_path_zip)
        os.system(cmd)
        cmd = "unzip -j -o {} -d {}".format(ckpt_path_zip, ckpt_path_dir)
        os.system(cmd)
    ckpt_path_k = dir + "/vis_weights_{}/{}".format('kpconv_s3dis', 'ckpt-166')

    return ckpt_path_r, ckpt_path_k

# ------------------------------

def main():

    # load pretrained weights depending on used ml framework (torch or tf)
    if "open3d.ml.torch" in sys.modules:  # torch is used
        ckpt_path_r, ckpt_path_k = get_torch_ckpts()
    else: # tf is used
        ckpt_path_r, ckpt_path_k = get_tf_ckpts()

    model = ml3d.models.RandLANet(ckpt_path=ckpt_path_r)
    pipeline_r = ml3d.pipelines.SemanticSegmentation(model)
    pipeline_r.load_ckpt(model.cfg.ckpt_path)

    model = ml3d.models.KPFCNN(ckpt_path=ckpt_path_k)
    pipeline_k = ml3d.pipelines.SemanticSegmentation(model)
    pipeline_k.load_ckpt(model.cfg.ckpt_path)

if __name__ == "__main__":
    main()

Error outputs:

randlanet torch:

INFO - 2021-09-10 12:56:47,137 - semantic_segmentation - Loading checkpoint /home/pointclouduser/Documents/Open3D-ML/examples/vis_weights_randlanet_s3dis.pth
Traceback (most recent call last):
  File "/home/pointclouduser/Documents/Open3D-ML/examples/vis_pred_room.py", line 77, in <module>
    main()
  File "/home/pointclouduser/Documents/Open3D-ML/examples/vis_pred_room.py", line 70, in main
    pipeline_r.load_ckpt(model.cfg.ckpt_path)
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/open3d/_ml3d/torch/pipelines/semantic_segmentation.py", line 515, in load_ckpt
    self.model.load_state_dict(ckpt['model_state_dict'])
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for RandLANet:
    Unexpected key(s) in state_dict: "Encoder_layer_4mlp1.biases", "Encoder_layer_4mlp1.weights", "Encoder_layer_4mlp1.conv.weight", "Encoder_layer_4mlp1.conv.bias", "Encoder_layer_4mlp1.batch_normalization.weight", "Encoder_layer_4mlp1.batch_normalization.bias", "Encoder_layer_4mlp1.batch_normalization.running_mean", "Encoder_layer_4mlp1.batch_normalization.running_var", "Encoder_layer_4mlp1.batch_normalization.num_batches_tracked", "Encoder_layer_4LFAmlp1.biases", "Encoder_layer_4LFAmlp1.weights", "Encoder_layer_4LFAmlp1.conv.weight", "Encoder_layer_4LFAmlp1.conv.bias", "Encoder_layer_4LFAmlp1.batch_normalization.weight", "Encoder_layer_4LFAmlp1.batch_normalization.bias", "Encoder_layer_4LFAmlp1.batch_normalization.running_mean", "Encoder_layer_4LFAmlp1.batch_normalization.running_var", "Encoder_layer_4LFAmlp1.batch_normalization.num_batches_tracked", "Encoder_layer_4LFAatt_pooling_1fc.weight", "Encoder_layer_4LFAatt_pooling_1fc.bias", "Encoder_layer_4LFAatt_pooling_1mlp.biases", "Encoder_layer_4LFAatt_pooling_1mlp.weights", "Encoder_layer_4LFAatt_pooling_1mlp.conv.weight", "Encoder_layer_4LFAatt_pooling_1mlp.conv.bias", "Encoder_layer_4LFAatt_pooling_1mlp.batch_normalization.weight", "Encoder_layer_4LFAatt_pooling_1mlp.batch_normalization.bias", "Encoder_layer_4LFAatt_pooling_1mlp.batch_normalization.running_mean", "Encoder_layer_4LFAatt_pooling_1mlp.batch_normalization.running_var", "Encoder_layer_4LFAatt_pooling_1mlp.batch_normalization.num_batches_tracked", "Encoder_layer_4LFAmlp2.biases", "Encoder_layer_4LFAmlp2.weights", "Encoder_layer_4LFAmlp2.conv.weight", "Encoder_layer_4LFAmlp2.conv.bias", "Encoder_layer_4LFAmlp2.batch_normalization.weight", "Encoder_layer_4LFAmlp2.batch_normalization.bias", "Encoder_layer_4LFAmlp2.batch_normalization.running_mean", "Encoder_layer_4LFAmlp2.batch_normalization.running_var", "Encoder_layer_4LFAmlp2.batch_normalization.num_batches_tracked", "Encoder_layer_4LFAatt_pooling_2fc.weight", 
"Encoder_layer_4LFAatt_pooling_2fc.bias", "Encoder_layer_4LFAatt_pooling_2mlp.biases", "Encoder_layer_4LFAatt_pooling_2mlp.weights", "Encoder_layer_4LFAatt_pooling_2mlp.conv.weight", "Encoder_layer_4LFAatt_pooling_2mlp.conv.bias", "Encoder_layer_4LFAatt_pooling_2mlp.batch_normalization.weight", "Encoder_layer_4LFAatt_pooling_2mlp.batch_normalization.bias", "Encoder_layer_4LFAatt_pooling_2mlp.batch_normalization.running_mean", "Encoder_layer_4LFAatt_pooling_2mlp.batch_normalization.running_var", "Encoder_layer_4LFAatt_pooling_2mlp.batch_normalization.num_batches_tracked", "Encoder_layer_4mlp2.biases", "Encoder_layer_4mlp2.weights", "Encoder_layer_4mlp2.conv.weight", "Encoder_layer_4mlp2.conv.bias", "Encoder_layer_4mlp2.batch_normalization.weight", "Encoder_layer_4mlp2.batch_normalization.bias", "Encoder_layer_4mlp2.batch_normalization.running_mean", "Encoder_layer_4mlp2.batch_normalization.running_var", "Encoder_layer_4mlp2.batch_normalization.num_batches_tracked", "Encoder_layer_4shortcut.biases", "Encoder_layer_4shortcut.weights", "Encoder_layer_4shortcut.conv.weight", "Encoder_layer_4shortcut.conv.bias", "Encoder_layer_4shortcut.batch_normalization.weight", "Encoder_layer_4shortcut.batch_normalization.bias", "Encoder_layer_4shortcut.batch_normalization.running_mean", "Encoder_layer_4shortcut.batch_normalization.running_var", "Encoder_layer_4shortcut.batch_normalization.num_batches_tracked", "Decoder_layer_4.biases", "Decoder_layer_4.weights", "Decoder_layer_4.conv.weight", "Decoder_layer_4.conv.bias", "Decoder_layer_4.batch_normalization.weight", "Decoder_layer_4.batch_normalization.bias", "Decoder_layer_4.batch_normalization.running_mean", "Decoder_layer_4.batch_normalization.running_var", "Decoder_layer_4.batch_normalization.num_batches_tracked". 
    size mismatch for fc0.weight: copying a param with shape torch.Size([8, 6]) from checkpoint, the shape in current model is torch.Size([8, 3]).
    size mismatch for decoder_0.biases: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder_0.weights: copying a param with shape torch.Size([1024, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 512, 1, 1]).
    size mismatch for decoder_0.conv.weight: copying a param with shape torch.Size([1024, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 512, 1, 1]).
    size mismatch for decoder_0.conv.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder_0.batch_normalization.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder_0.batch_normalization.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder_0.batch_normalization.running_mean: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for decoder_0.batch_normalization.running_var: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for Decoder_layer_0.biases: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for Decoder_layer_0.weights: copying a param with shape torch.Size([1536, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([768, 256, 1, 1]).
    size mismatch for Decoder_layer_0.conv.weight: copying a param with shape torch.Size([1536, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([768, 256, 1, 1]).
    size mismatch for Decoder_layer_0.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for Decoder_layer_0.batch_normalization.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for Decoder_layer_0.batch_normalization.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for Decoder_layer_0.batch_normalization.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for Decoder_layer_0.batch_normalization.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for Decoder_layer_1.biases: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for Decoder_layer_1.weights: copying a param with shape torch.Size([768, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 128, 1, 1]).
    size mismatch for Decoder_layer_1.conv.weight: copying a param with shape torch.Size([768, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 128, 1, 1]).
    size mismatch for Decoder_layer_1.conv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for Decoder_layer_1.batch_normalization.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for Decoder_layer_1.batch_normalization.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for Decoder_layer_1.batch_normalization.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for Decoder_layer_1.batch_normalization.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
    size mismatch for Decoder_layer_2.biases: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
    size mismatch for Decoder_layer_2.weights: copying a param with shape torch.Size([384, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([160, 32, 1, 1]).
    size mismatch for Decoder_layer_2.conv.weight: copying a param with shape torch.Size([384, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([160, 32, 1, 1]).
    size mismatch for Decoder_layer_2.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
    size mismatch for Decoder_layer_2.batch_normalization.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
    size mismatch for Decoder_layer_2.batch_normalization.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
    size mismatch for Decoder_layer_2.batch_normalization.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
    size mismatch for Decoder_layer_2.batch_normalization.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
    size mismatch for Decoder_layer_3.weights: copying a param with shape torch.Size([160, 32, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 1, 1]).
    size mismatch for Decoder_layer_3.conv.weight: copying a param with shape torch.Size([160, 32, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 1, 1]).
    size mismatch for fc.biases: copying a param with shape torch.Size([13]) from checkpoint, the shape in current model is torch.Size([19]).
    size mismatch for fc.weights: copying a param with shape torch.Size([13, 32, 1, 1]) from checkpoint, the shape in current model is torch.Size([19, 32, 1, 1]).
    size mismatch for fc.conv.weight: copying a param with shape torch.Size([13, 32, 1, 1]) from checkpoint, the shape in current model is torch.Size([19, 32, 1, 1]).
    size mismatch for fc.conv.bias: copying a param with shape torch.Size([13]) from checkpoint, the shape in current model is torch.Size([19]).

kpconv torch:

INFO - 2021-09-10 13:00:03,446 - semantic_segmentation - Loading checkpoint /home/pointclouduser/Documents/Open3D-ML/examples/vis_weights_kpconv_s3dis.pth
Traceback (most recent call last):
  File "/home/pointclouduser/Documents/Open3D-ML/examples/vis_pred_room.py", line 78, in <module>
    main()
  File "/home/pointclouduser/Documents/Open3D-ML/examples/vis_pred_room.py", line 70, in main
    pipeline_k.load_ckpt(model.cfg.ckpt_path)
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/open3d/_ml3d/torch/pipelines/semantic_segmentation.py", line 515, in load_ckpt
    self.model.load_state_dict(ckpt['model_state_dict'])
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for KPFCNN:
    size mismatch for encoder_blocks.0.KPConv.weights: copying a param with shape torch.Size([15, 5, 64]) from checkpoint, the shape in current model is torch.Size([15, 2, 64]).
    size mismatch for head_softmax.mlp.weight: copying a param with shape torch.Size([13, 128]) from checkpoint, the shape in current model is torch.Size([19, 128]).
    size mismatch for head_softmax.batch_norm.bias: copying a param with shape torch.Size([13]) from checkpoint, the shape in current model is torch.Size([19]).

randlanet tf:

INFO - 2021-09-10 13:07:14,684 - semantic_segmentation - Restored from /home/pointclouduser/Documents/Open3D-ML/examples/vis_weights_randlanet_s3dis/ckpt-92
Traceback (most recent call last):
  File "/home/pointclouduser/Documents/Open3D-ML/examples/vis_pred_room.py", line 78, in <module>
    main()
  File "/home/pointclouduser/Documents/Open3D-ML/examples/vis_pred_room.py", line 74, in main
    pipeline_k.load_ckpt(model.cfg.ckpt_path)
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/open3d/_ml3d/tf/pipelines/semantic_segmentation.py", line 369, in load_ckpt
    self.ckpt.restore(ckpt_path).expect_partial()
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 2335, in restore
    status = self.read(save_path, options=options)
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 2220, in read
    result = self._saver.restore(save_path=save_path, options=options)
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 1382, in restore
    base.CheckpointPosition(
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 254, in restore
    restore_ops = trackable._restore_from_checkpoint_position(self)  # pylint: disable=protected-access
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 980, in _restore_from_checkpoint_position
    current_position.checkpoint.restore_saveables(
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 351, in restore_saveables
    new_restore_ops = functional_saver.MultiDeviceSaver(
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 339, in restore
    restore_ops = restore_fn()
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 323, in restore_fn
    restore_ops.update(saver.restore(file_prefix, options))
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 115, in restore
    restore_ops[saveable.name] = saveable.restore(
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 131, in restore
    return resource_variable_ops.shape_safe_assign_variable_handle(
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 309, in shape_safe_assign_variable_handle
    shape.assert_is_compatible_with(value_tensor.shape)
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/framework/tensor_shape.py", line 1161, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (15, 2, 64) and (15, 5, 64) are incompatible

kpconv tf:

Traceback (most recent call last):
  File "/home/pointclouduser/Documents/Open3D-ML/examples/vis_pred_room.py", line 78, in <module>
    main()
  File "/home/pointclouduser/Documents/Open3D-ML/examples/vis_pred_room.py", line 70, in main
    pipeline_k.load_ckpt(model.cfg.ckpt_path)
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/open3d/_ml3d/tf/pipelines/semantic_segmentation.py", line 369, in load_ckpt
    self.ckpt.restore(ckpt_path).expect_partial()
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 2335, in restore
    status = self.read(save_path, options=options)
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 2220, in read
    result = self._saver.restore(save_path=save_path, options=options)
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 1382, in restore
    base.CheckpointPosition(
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 254, in restore
    restore_ops = trackable._restore_from_checkpoint_position(self)  # pylint: disable=protected-access
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 980, in _restore_from_checkpoint_position
    current_position.checkpoint.restore_saveables(
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 351, in restore_saveables
    new_restore_ops = functional_saver.MultiDeviceSaver(
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 339, in restore
    restore_ops = restore_fn()
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 323, in restore_fn
    restore_ops.update(saver.restore(file_prefix, options))
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 115, in restore
    restore_ops[saveable.name] = saveable.restore(
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 131, in restore
    return resource_variable_ops.shape_safe_assign_variable_handle(
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 309, in shape_safe_assign_variable_handle
    shape.assert_is_compatible_with(value_tensor.shape)
  File "/home/pointclouduser/.pyenv/versions/3.8-dev/lib/python3.8/site-packages/tensorflow/python/framework/tensor_shape.py", line 1161, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (15, 2, 64) and (15, 5, 64) are incompatible

Did I mess up the files, or is there something wrong with the checkpoints?
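
One way to narrow down a question like this is to compare the parameter shapes stored in the checkpoint with those of the freshly built model before loading. The sketch below is a framework-free illustration, not part of Open3D-ML: `diff_shapes` is a hypothetical helper, and the sample shapes are copied from the RandLANet (torch) error above. With a real checkpoint, the shape dictionaries could presumably be built from `torch.load(path)['model_state_dict']` (the key shown in the traceback) and `model.state_dict()`.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_shapes(ckpt: Dict[str, Shape], model: Dict[str, Shape]) -> Dict[str, List]:
    """Compare parameter names and shapes of a checkpoint against a model.

    Hypothetical helper for illustration; both arguments map a parameter
    name to its shape as a plain tuple.
    """
    # Keys present only in the checkpoint ("Unexpected key(s)" in PyTorch).
    unexpected = sorted(k for k in ckpt if k not in model)
    # Keys present only in the model ("Missing key(s)" in PyTorch).
    missing = sorted(k for k in model if k not in ckpt)
    # Keys present in both but with different shapes ("size mismatch").
    mismatched = sorted(
        (k, ckpt[k], model[k]) for k in ckpt if k in model and ckpt[k] != model[k]
    )
    return {"unexpected": unexpected, "missing": missing, "mismatched": mismatched}


# Shapes copied from the RandLANet (torch) error output above.
ckpt_shapes = {
    "fc0.weight": (8, 6),
    "fc.biases": (13,),
    "decoder_0.biases": (1024,),
    "Decoder_layer_3.weights": (160, 32, 1, 1),
}
model_shapes = {
    "fc0.weight": (8, 3),
    "fc.biases": (19,),
    "decoder_0.biases": (512,),
    "Decoder_layer_3.weights": (64, 32, 1, 1),
}

if __name__ == "__main__":
    report = diff_shapes(ckpt_shapes, model_shapes)
    for name, in_ckpt, in_model in report["mismatched"]:
        print(f"{name}: checkpoint {in_ckpt} vs model {in_model}")
```

In this sample, the input layer differs in its feature dimension (6 vs 3) and the output layer in its class count (13 vs 19), which suggests that the model was constructed with parameters for a different dataset than the one the checkpoint was trained on, rather than that the downloaded files are corrupt.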

kukuruza commented 3 years ago

I'm not a contributor, but why do you load the checkpoint into your model twice, and from different locations?

    model = ml3d.models.KPFCNN(ckpt_path=ckpt_path_k)  # This is the one that was downloaded.
    ...
    pipeline_k.load_ckpt(model.cfg.ckpt_path)   # Maybe ckpt_path_k should go here?

Also, there is another issue about a size mismatch with the pretrained models: https://github.com/isl-org/Open3D-ML/issues/364

jokokojote commented 3 years ago

@kukuruza thanks for your suggestion.

I understand what you mean, but it is done the same way in examples/vis_pred.py, isn't it?

But you are right, #364 seems to be the same problem. My fault.

sanskar107 commented 2 years ago

@jokokojote Thank you for reporting this issue. Let me try to reproduce it and get back to you.

sanskar107 commented 2 years ago

@jokokojote I cannot reproduce this issue with the default config. You need to use the config file ml3d/configs/randlanet_s3dis.yml for the network parameters. A quicker and better way to train and test your model is to use our predefined scripts: https://github.com/isl-org/Open3D-ML#using-predefined-scripts
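
A rough sketch of what using the config file for the network parameters could look like in the PyTorch variant is shown below. It assumes the pattern from the project README (`Config.load_from_file`, then `**cfg.model` and `**cfg.pipeline`) applies to the installed Open3D-ML version. `load_randlanet_s3dis` is a hypothetical wrapper name, and the Open3D imports are done inside the function so the snippet can be imported without Open3D installed.

```python
def load_randlanet_s3dis(cfg_file: str, ckpt_path: str):
    """Build RandLANet from a config file, then load a checkpoint into it.

    Hypothetical helper; assumes the README-style Open3D-ML API.
    """
    # Lazy imports: only needed when the function is actually called.
    import open3d.ml as _ml3d
    import open3d.ml.torch as ml3d

    # Read the network parameters from the YAML config instead of relying
    # on the model's built-in defaults.
    cfg = _ml3d.utils.Config.load_from_file(cfg_file)

    # Construct the model with the parameters from the config so that its
    # layer shapes can match the S3DIS checkpoint.
    model = ml3d.models.RandLANet(**cfg.model)
    pipeline = ml3d.pipelines.SemanticSegmentation(model, **cfg.pipeline)

    # Load the downloaded weights into the matching model.
    pipeline.load_ckpt(ckpt_path=ckpt_path)
    return pipeline
```

With the names from the example in the original report, a call would look like `load_randlanet_s3dis("ml3d/configs/randlanet_s3dis.yml", ckpt_path_r)`. A corresponding KPFCNN config would be needed for the KPConv checkpoint.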

sanskar107 commented 2 years ago

Closing due to inactivity.