CCInc / 3d-ml

A versatile framework for 3D machine learning built on PyTorch Lightning and Hydra [looking for contributors!]

Rewrite Modelnet2048 using PyG InMemoryDataset #17

Open CCInc opened 2 years ago

CCInc commented 2 years ago

Currently, custom code handles the downloading and processing of modelnet2048. It should be rewritten as a PyTorch Geometric (PyG) InMemoryDataset, which provides helper functions for downloading the data and processing it from the HDF5 (h5py) input format. The custom download functions can then be removed.

See:
https://pytorch-geometric.readthedocs.io/en/latest/notes/create_dataset.html
https://github.com/pyg-team/pytorch_geometric/blob/master/torch_geometric/datasets/s3dis.py (the PyG S3DIS dataset also comes from an h5py source, very similar to modelnet2048)

CCInc commented 2 years ago

@Stakhan mind taking a look at this?

Stakhan commented 2 years ago

Sure!

Stakhan commented 2 years ago

I'm a bit confused by the name. From the dataset webpage it seems that there is only:

Isn't it in fact ModelNet40?

CCInc commented 2 years ago

This is "ModelNet40", but a presampled version with 2048 points per sample, which is the form more commonly used in point cloud tasks. The original ModelNet40, by contrast, consists of CAD models, so their surfaces need to be sampled before they can be used in a point cloud task.
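
To make "sampling the surface" concrete, here is a minimal sketch of area-weighted point sampling from a triangle mesh. It only illustrates the general idea and is not necessarily how the presampled 2048-point files were produced; `sample_surface` is a hypothetical helper name.

```python
import numpy as np


def sample_surface(vertices, faces, num_points=2048, seed=None):
    """Draw points uniformly from the surface of a triangle mesh."""
    rng = np.random.default_rng(seed)
    vertices = np.asarray(vertices, dtype=np.float64)
    faces = np.asarray(faces, dtype=np.int64)

    # Corner coordinates of every triangle, each of shape (F, 3).
    a, b, c = (vertices[faces[:, k]] for k in range(3))

    # Pick triangles with probability proportional to their area,
    # so large faces receive proportionally more points.
    areas = 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)
    idx = rng.choice(len(faces), size=num_points, p=areas / areas.sum())

    # Uniform position inside each chosen triangle: draw (u, v) in the unit
    # square and reflect the half that falls outside the triangle.
    u, v = rng.random(num_points), rng.random(num_points)
    outside = u + v > 1.0
    u[outside], v[outside] = 1.0 - u[outside], 1.0 - v[outside]

    return a[idx] + u[:, None] * (b[idx] - a[idx]) + v[:, None] * (c[idx] - a[idx])
```

PyG ships a `SamplePoints` transform that performs this kind of sampling on mesh data, so the presampled files mainly save that step and keep the point sets identical across runs.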

Stakhan commented 2 years ago

Okay. Thank you for the clarification.

CCInc commented 2 years ago

It would be good to document all of this for each dataset at some point, since different papers use many different versions, which can be quite confusing.