Closed bw4sz closed 2 years ago
Hi @bw4sz I love that! I know we have a couple of people interested in that sort of thing in the community, so I'll ping them. The starting point is to create a custom dataset together with a collate transform; once that's in place you should be able to port your model over pretty easily. Let's wait and see whether someone has already built something around that, and if not I can give more guidance. Thanks for the interest!
Hi,
We are currently working on a 2D+3D joint model based on sparse-conv. The context is a bit different from yours, since we are working with indoor/outdoor datasets in which the image projection isn't as straightforward as it is for aerial images. However, I am sure it could be adapted to this simpler case easily. In short, our new data format needs a mapping from points to pixels, which should be pretty straightforward to obtain.
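For the aerial/nadir case, that point-to-pixel mapping can be derived directly from an orthorectified image's georeference. Here is a minimal sketch; the function name, its arguments, and the assumption of square pixels with a known upper-left origin are all illustrative, not part of the tp3d API:

```python
import numpy as np

def points_to_pixels(xy, origin, pixel_size):
    """Map georeferenced point XY coordinates to (row, col) indices of a
    nadir ortho image whose upper-left corner is at `origin` and whose
    ground sampling distance is `pixel_size` (same CRS units as the points)."""
    x0, y0 = origin
    cols = np.floor((xy[:, 0] - x0) / pixel_size).astype(int)
    # Image rows grow downward while Y grows upward, hence y0 - Y.
    rows = np.floor((y0 - xy[:, 1]) / pixel_size).astype(int)
    return np.stack([rows, cols], axis=1)
```

With a mapping like this, each point inherits the annotation of the pixel it lands in, which is one way to project RGB annotations into the cloud.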
The focus of your dataset is instance segmentation of trees, correct?
yup. I was just starting to format a dataset object for this repo. @nicolas-chaulet let me know if there is a preferred input type.
I converted our .las LiDAR files to a headerless txt with the format

```
X,Y,Z,Intensity,Label
```

where the Label is an integer 1...n, one per individual tree. On first scan here, it wasn't super obvious where the labels go in the parser, but I haven't yet looked at the detection/ folder examples. I can make a separate issue if you prefer, but we can work through this as an example and then I can open a pull request to add a bit of a demo.
@loicland I remember your gated messenger networks paper with the superpoints group. The major difference with these datasets is the weaker annotation completeness (annotations are projected from RGB) and lower point density. Here is an example from my semi-supervised RGB retinanet, projected into the point cloud. No geometric learning yet.
Happy to contribute in any way; I'm glad to see the community start to coalesce. I am transitioning out of tensorflow/keras for this reason.
@nicolas-chaulet I can open other issues, but returning to this today. I'm not 100% sure I understand the API model. Let's say I have 100+ 'scans' of a plot, like the image above, in a folder, each in a .csv.
e.g. `BLAN_009.txt`:

```
"X","Y","Z","Intensity","label"
763156.94,4330837.93,3.37,0,0
763156.22,4330838.09,3.58,0,0
763155.49,4330838.25,3.72,0,0
763154.82,4330838.39,4.04,0,0
```
where label is a unique object identifier (integer) and each row is a point. Is this the correct format for object detection?
Can you confirm the desired workflow?
```
poetry run python train.py task=object_detection model_name=votenet dataset=myNewDataset
```
But then the repo is also a Python package? Maybe that's just for users who want the datasets?
```python
import glob

import pandas as pd
import torch
from torch_geometric.data import InMemoryDataset


class Crowns(InMemoryDataset):
    def __init__(self, root, transform=None, pre_transform=None):
        super(Crowns, self).__init__(root, transform, pre_transform)
        self.data, self.slices = torch.load(self.processed_paths[0])

    @property
    def raw_file_names(self):
        return glob.glob("{}/*.txt".format(self.root))

    @property
    def processed_file_names(self):
        return ["data.pt"]

    def download(self):
        # Download to `self.raw_dir`.
        pass

    def read_plot(self, plotID):
        """Read a scan from the master list."""
        df = pd.read_csv("{}/{}".format(self.root, plotID))
        return df

    def process(self):
        # Read data into huge `Data` list.
        data_list = [self.read_plot(x) for x in self.raw_file_names]

        if self.pre_filter is not None:
            data_list = [data for data in data_list if self.pre_filter(data)]

        if self.pre_transform is not None:
            data_list = [self.pre_transform(data) for data in data_list]

        data, slices = self.collate(data_list)
        torch.save((data, slices), self.processed_paths[0])
```
So, it depends on whether you already have some training scripts. If not, then forking the repo and using train.py is the best option.
The dataset you have posted looks good; you would have to wrap your dataframes into a PyTorch Geometric `Data` object though, with `pos` set to (x, y, z) and `x` to intensity. For object detection, and for votenet in particular, you need to do some more work to extract bounding boxes and other things; take a look at scannet there for an example:
https://github.com/nicolas-chaulet/torch-points3d/blob/master/torch_points3d/datasets/object_detection/scannet.py
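For an axis-aligned first pass, the box extraction can be sketched straight from the CSV format posted above with pandas. The column names follow the `BLAN_009.txt` sample; treating label 0 as unannotated background is my assumption, and votenet still needs further fields beyond the boxes themselves:

```python
import numpy as np
import pandas as pd

def instance_boxes(df, label_col="label"):
    """Compute an axis-aligned 3D box (center + size) per instance label,
    skipping label 0, which is assumed here to mean 'no tree'."""
    boxes = []
    for lbl, pts in df[df[label_col] != 0].groupby(label_col):
        xyz = pts[["X", "Y", "Z"]].to_numpy()
        mins, maxs = xyz.min(axis=0), xyz.max(axis=0)
        boxes.append({"label": lbl,
                      "center": (mins + maxs) / 2,
                      "size": maxs - mins})
    return boxes
```

The per-point tensors would then go into the `Data` object (`pos` from X/Y/Z, `x` from Intensity) as described above, with the boxes stored alongside.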
The list of required labels is declared there:
https://github.com/nicolas-chaulet/torch-points3d/blob/9966f2350e03165158ba40b9f203aae7a16d31aa/torch_points3d/models/object_detection/votenet.py#L22
Once that is done, you would just have to wrap that into a tp3d `BaseDataset` to define your train vs val split. A `BaseDataset` will handle batching for you and create the data loaders automatically. It also handles the logic for creating the data augmentation transforms that you might have defined in your `myNewDataset.yaml` file.
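As a rough illustration of that last point, such a dataset config might look something like the sketch below. The exact keys should be copied from an existing file under the repo's config directory rather than from here, since the schema shown is guesswork:

```yaml
# Hypothetical myNewDataset.yaml -- key names are illustrative only.
data:
  class: crowns.CrownsDataset   # the BaseDataset wrapper described above
  dataroot: data
  train_transforms:
    - transform: RandomNoise    # augmentations resolved by the BaseDataset
      params:
        sigma: 0.01
```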
I hope this makes sense!
I'm going to close for now, let me know of any updates!
Hi all, thanks for your great work. I'm a researcher at University of Florida studying deep learning for biological applications (this kind of stuff: https://deepforest.readthedocs.io/). I like the paradigm you've put forward. This is the kind of thing that motivated me to make the switch from tensorflow.
The paper says
What is the status of multi-modal models? What models/datasets do you hope to use? I've got a benchmark dataset for tree detection in LiDAR + Hyperspectral + RGB that I'm publishing (https://github.com/weecology/NeonTreeEvaluation, https://www.biorxiv.org/content/10.1101/2020.11.16.385088v1). I'm starting to build joint models and wanted to ask how I can help contribute and get involved here. If you set out some general guidelines and the current status, I'd love to contribute toward more reproducibility.
I'm just checking out the repo now and I'll train a couple of models to see if it generalizes to tree detection. Our point density is very limited compared to traditional benchmarks. See some samples: http://tree.westus.cloudapp.azure.com/trees/