charlesq34 / pointnet

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

prepare my own data #100

Open · iWeisskohl opened 6 years ago

iWeisskohl commented 6 years ago

@charlesq34 Hi Charles, I am trying to use PointNet with my own data, but I am confused about the data preparation step with the code you offered, and I can't convert my own .ply data to HDF5. Could you please release a demo for data preparation? Much appreciated!

IsaacGuan commented 6 years ago

Hi, I think you could refer to the issue I previously opened: https://github.com/charlesq34/pointnet/issues/94. I simply used the plyfile tool to fetch the points from .ply files as numpy arrays and then wrote these arrays into an HDF5 file. You could also refer to my code: https://github.com/IsaacGuan/PointNet-Plane-Detection/blob/master/data/write_hdf5.py
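For reference, a minimal sketch of that approach (the file names and glob pattern here are placeholders; it assumes each .ply stores its points in a standard 'vertex' element and that all clouds have the same number of points):

```python
import glob
import numpy as np
import h5py
from plyfile import PlyData

def ply_to_array(path):
    # Read the vertex element of a .ply file into an (N, 3) float32 array.
    vertex = PlyData.read(path)['vertex']
    return np.stack([vertex['x'], vertex['y'], vertex['z']], axis=-1).astype(np.float32)

# Stack all clouds into one (num_clouds, num_points, 3) array and write it out.
clouds = np.stack([ply_to_array(p) for p in sorted(glob.glob('ply_files/*.ply'))])

with h5py.File('ply_data_train0.h5', 'w') as f:
    f.create_dataset('data', data=clouds)
```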

cjw123 commented 6 years ago

Thank you for your reply! I don't understand what the 0 and 1 in the .seg file stand for.

IsaacGuan commented 6 years ago

Hi, the number simply indicates the part segmentation of an object. In my case, 0 means the point is in the non-planar part, whereas 1 means the corresponding point is on the plane.

romanskie commented 6 years ago

Hey @IsaacGuan,

I am trying to use my own data right now as well, and it looks like you have made some progress here :)

I have a pretty basic question concerning the .h5 data files used for part_seg. Does one file, e.g. "ply_data_train1.h5", represent only one single point cloud (e.g. only one plane) with 2048 points? That would mean that part_seg is trained on only six point clouds in total (because only 6 .h5 files are available). Or am I missing some critical aspect here?

Thank you so much in advance!

IsaacGuan commented 6 years ago

Hi @romanskie, an HDF5 file contains a bunch of point clouds. Taking the following screenshot as an example: each row of the dataset represents a point cloud with 2048 points, and the point coordinates are stored in the cells. (Screenshot from 2018-05-11 of the HDF5 dataset omitted.) Concretely speaking, when writing an HDF5 file, we first read the raw data from the N point clouds as a 3D array (N x 2048 x 3), then write the array to the HDF5 dataset.

romanskie commented 6 years ago

Hey @IsaacGuan ,

thank you so much for your help! It's really hard for me to understand the whole data structure!

So I have one point cloud per row in my "data" dataset of dimension 2048x2048, right?

The "label" data set is of dimension 2048 x 1, so it means there is one category per point cloud => one of the 16 shapenet categories like e.g. plane?

The "pid" is again a data set of dimension 2048 x 2048 which maps category_id and part_id to each point (for the plane example there are 4 part ids for the category plane). So here every cell is a specific point mapping?

Part ids for the plane example:
0: Color(0.65, 0.95, 0.05) --> Seg(Airplane, 1)
1: Color(0.35, 0.05, 0.35) --> Seg(Airplane, 2)
2: Color(0.65, 0.35, 0.65) --> Seg(Airplane, 3)
3: Color(0.95, 0.95, 0.65) --> Seg(Airplane, 4)

Thank you so much in advance!

IsaacGuan commented 6 years ago

Hi, @romanskie, You are right. The label dataset maps each point cloud to a certain shape category, while pid maps each point to a part id. If you are doing object shape classification, pid is unnecessary to provide.
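For part segmentation, a rough sketch of how such a file could be laid out (the dataset names 'data', 'label', and 'pid' follow the discussion above; the shapes and dtypes are illustrative assumptions, and the arrays are left empty here):

```python
import numpy as np
import h5py

num_clouds, num_points = 100, 2048

data = np.zeros((num_clouds, num_points, 3), dtype=np.float32)   # xyz per point
label = np.zeros((num_clouds, 1), dtype=np.uint8)                # shape category per cloud
pid = np.zeros((num_clouds, num_points), dtype=np.uint8)         # part id per point

# ... fill data / label / pid from your own point clouds here ...

with h5py.File('ply_data_train0.h5', 'w') as f:
    f.create_dataset('data', data=data)
    f.create_dataset('label', data=label)
    f.create_dataset('pid', data=pid)
```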

Yaara1 commented 5 years ago

I am also trying to prepare data for training. I am new to HDF5 files. How do I write the data and labels to the same file? (If it matters, I am using MATLAB to create and write the files, since I generated the data with MATLAB.) Am I supposed to define 2 different datasets in the same file? How should it be done?

pournami123 commented 5 years ago

Hi @IsaacGuan, I have the same question as Yaara1. How can I prepare my own training data?

Yaara1 commented 5 years ago

@rejivipin I already prepared my data, so I hope this helps. I used PointNet only for classification and did not use the normals of an object, only the x, y, z of points on the surface of the objects.

This PointNet code uses the ModelNet40 files for classification. It takes the ModelNet40 files, which are in .off format, generates .h5 files from them, and then feeds the .h5 files to the network directly. I found it most convenient to generate the .h5 files myself. I did it using Python, though I generated the data using MATLAB. Here are 2 functions I wrote and used to import MATLAB files and to create .h5 files:

```python
import h5py
import scipy.io as sio

# ----------------------------------
# data from .mat file to h5
# ----------------------------------

def load_matlab_data(fullfilename, pc_variable_name, label_variable_name):
    matlab_mat = sio.loadmat(fullfilename)
    pc = matlab_mat[pc_variable_name]
    label = matlab_mat[label_variable_name]
    return pc, label

def save_h5(filename, pc, label, partial_path):
    # write the point clouds and their labels into one HDF5 file
    path = partial_path + filename + '.hdf5'
    with h5py.File(path, 'w') as f:
        f.create_dataset("data", data=pc)
        f.create_dataset("label", data=label)
    print(f'file {filename}.hdf5 was created in {partial_path}')
    return
```

I recommend you take a few minutes to read about HDF5 files, just to understand them generally. As you can see in the code I wrote, the .h5 files should contain 2 datasets: one for the data itself and one for the labels. In the "data" dataset, each row (and its depth) is one point cloud, meaning the "data" dataset has dimensions NxPx3, where N is the number of point clouds in the file, P is the number of points in a single point cloud, and 3 is because each point has x, y, z coordinates. It can be helpful to have an HDF file viewer, so you can easily open .hdf5 or .h5 files and check their structure and whether you wrote the file correctly. I used HDFView (it is free).

In the train.py file, lines 62-65 specify the locations of the train and test files. Each is actually the location of a .txt file that should contain a list of the file names of the train/test files, respectively. You should have that .txt file in the location specified in those lines, or you can change the location/name of the file.

I recommend that you first try to use the network with the ModelNet40 files. In this process I think you will understand which files you need, in which locations and formats, and how to modify the PointNet code if needed. This is what I did and it helped a lot. If I remember correctly, start by trying to run the train.py file; it uses the provider.py file to download the ModelNet40 .h5 files.
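To double-check a generated file before training, a quick read-back with h5py can help (the file name is a placeholder; the dataset names follow the description above):

```python
import h5py

with h5py.File('my_data_train0.hdf5', 'r') as f:
    print(list(f.keys()))    # expect ['data', 'label']
    print(f['data'].shape)   # expect (N, P, 3)
    print(f['label'].shape)  # expect (N, 1) for classification
```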

pournami123 commented 5 years ago

Hi Yaara1, thanks for your quick response. But one thing I missed in my question is that I am using .las format LiDAR point cloud data. Sorry about that. So can you please guide me on how to prepare the training data?


majidnasr commented 5 years ago


@rejivipin I prepared h5 files from LiDAR point clouds. I converted the .las files to .txt files using the CloudCompare software, then prepared numpy files using collect_indoor3d_data.py in the semantic segmentation part (sem_seg), and then prepared h5 files using gen_indoor3d_h5.py, also in the semantic segmentation part. I hope this helps you.
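If it helps, the intermediate .npy files in that sem_seg pipeline are essentially one point per row with a class index appended. A rough sketch of building one directly from CloudCompare text exports (this assumes the exports have x y z r g b columns and that each exported segment belongs to a single class; file names and the class mapping are placeholders):

```python
import numpy as np

CLASSES = {'building': 0, 'vegetation': 1, 'soil': 2}  # example class mapping

def txt_segment_to_array(txt_path, class_name):
    # Each row of the CloudCompare export: x y z r g b
    points = np.loadtxt(txt_path)
    labels = np.full((points.shape[0], 1), CLASSES[class_name], dtype=points.dtype)
    return np.concatenate([points, labels], axis=1)  # x y z r g b label

segments = [txt_segment_to_array('building_part1.txt', 'building'),
            txt_segment_to_array('soil_part1.txt', 'soil')]
np.save('area1_part1.npy', np.concatenate(segments, axis=0))
```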

pournami123 commented 5 years ago

@majidnasr, thanks for your reply. I understand what you are saying. One more doubt I have about the data: while preparing the training data, say I have soil, building, and vegetation. In that case, for each class (say, for example, soil), how many training sample files did you prepare? How was the sampling done? Thanks in advance... I am eagerly waiting for your reply.

majidnasr commented 5 years ago

Hi @rejivipin. In my case, the whole point cloud is split into 7 major parts (building 1, 2, 3, ...), and each building is split into roughly 20 parts of about 3m x 4m in surface area (I studied the facades of buildings and bridges). I annotated each part, following the annotation method of the S3DIS dataset. The number/size of annotated classes in each part does not affect the process. While preparing the h5 files using the gen_indoor3d_h5.py code (in the sem_seg folder of the repository), the point cloud is wrapped into 1m x 1m blocks on the XY surface of the point cloud (line 76 of gen_indoor3d_h5.py). So, in your case, if you select all the soil points as one segment or several segments (in each part), the code will do its own job of wrapping them into the defined blocks and putting all the soil/building/vegetation points together in each block. The overfitting/underfitting issue is another story (and is about the number of points of each class), but the number of annotated segments in each part is not important for PointNet. I hope this helps you.

pournami123 commented 5 years ago

@majidnasr, thank you. I will try it that way.

Anubhople commented 5 years ago

I have used this code (https://github.com/IsaacGuan/PointNet-Plane-Detection/blob/master/data/write_hdf5.py) to convert my .ply files to .h5. The code runs successfully, but I want my .h5 files to contain both data and labels (a label for each shape). How can I solve this problem?

SMohammadi89 commented 5 years ago

Hi everyone, I want to use my own data with the PointNet network to reconstruct and obtain a segmented point cloud. My question is: do I need to train the network again for the new samples, or do I only need to test on the new data? And if I need to train the network, how many samples do I need for the new part? I would be really happy if someone could answer me. Thank you in advance.

thirdlastletter commented 5 years ago


@Yaara1 Thank you for your information. I am not 100% sure I understood correctly. So my .h5 file has 2 datasets: one for the data and one for the labels. So if my labels are either 0 or 1 (1-dimensional), the data dataset would be NxPx3 and my label dataset would be NxPx1, am I correct? In the label dataset I would just write the correct label at the same position as the data in the data dataset?

And all point clouds need to have the same number of points, am I correct? (So I would need to sample my data.)

Thank you very much!

Yaara1 commented 5 years ago


@thirdlastletter If the classification for each sample is a 1x1 matrix (a scalar) as you described, then you have 1 component for each sample; hence the "label" dataset should have dimensions Nx1 (N rows, 1 component in each row, which corresponds to the point cloud in the same row of the "data" dataset).

Yaara1 commented 5 years ago

All point clouds should have a sufficient number of points (in the code the number of sampled points is either 1024 or 2048, I can't remember, but you can modify it yourself). I think the code as given is set to sample that number of points (so if you have more points it's OK, the required number for the input of the network will be randomly sampled, but I am not sure; I generated point clouds with the exact number of points I wanted).
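A simple way to bring every cloud to the same point count, as a rough sketch (sampling with replacement only when a cloud has too few points):

```python
import numpy as np

def resample_cloud(points, num_points=2048):
    # points: (M, 3) array; returns a (num_points, 3) array.
    replace = points.shape[0] < num_points
    idx = np.random.choice(points.shape[0], num_points, replace=replace)
    return points[idx]
```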

thirdlastletter commented 5 years ago

@Yaara1 thank you!

csitaula commented 5 years ago

@Yaara1 Hi, I am working on feature aggregation using PointNet. I have created .csv files containing the data. I am having trouble converting those .csv files into .h5 files using the two functions you used. Thanks

euzer commented 5 years ago

@rejivipin I prepared h5 files from LiDAR point clouds. I converted the .las files to .txt files using the CloudCompare software, then prepared numpy files using collect_indoor3d_data.py in the semantic segmentation part (sem_seg), and then prepared h5 files using gen_indoor3d_h5.py, also in the semantic segmentation part. I hope this helps you.

@majidnasr I converted a point cloud into a .txt file, but I cannot use collect_indoor3d_data.py in the semantic segmentation directory. Can you explain, please?

pournami123 commented 5 years ago

@euzer What error is it showing? Basically, collect_indoor3d_data.py converts the .txt files to .npy format... Maybe there is some issue with the file path or something similar...

euzer commented 5 years ago

@euzer What error is it showing? Basically, collect_indoor3d_data.py converts the .txt files to .npy format... Maybe there is some issue with the file path or something similar...

I don't get any error. In collect_indoor3d_data.py there is a line where you specify the annotation paths, like anno_paths = [line.rstrip() for line in open(os.path.join(BASE_DIR, 'meta/anno_paths.txt'))]. In the Stanford3dDataset_v1.2_Aligned_Version there are a lot of directories containing annotation txt files, so I am trying to organize my data in the same way. I ran collect_indoor3d_data.py and got my npy files. To check them, I opened an npy file with Open3D and it showed a different scan (from the Stanford semantic dataset), not mine.

So how must I organize my data?

euzer commented 5 years ago

@pournami123 I don't get any error when running collect_indoor3d_data.py. To check whether the .npy corresponds exactly to my original scan, I open the npy with Open3D, and it shows me another scan from the Stanford dataset. What am I missing in the process?

majidnasr commented 5 years ago

@euzer If 'collect_indoor3d_data.py' creates the npy files, but from wrong annotations, there is an issue with the paths.

Did you change the annotation paths in 'meta/anno_paths.txt' based on your dataset? Also, modify line 13 of 'indoor3d_util.py' (the name of your dataset directory inside the 'data' folder).

And check whether 'collect_indoor3d_data.py' puts your output npy files in the same directory as your annotations!

euzer commented 5 years ago

Thank you for your answer @majidnasr !

But I have taken the Stanford data, which looks like this: Stanford3dDataset (screenshot omitted).

In my case I only have an equivalent of ConferenceRoom1.txt, which corresponds to my room scan that I called myScan.txt.

But I don't have any txt files for beam, board, ceiling, chair, etc. Should I create those myself, or can I just use the Stanford ones?

kiranintellify commented 5 years ago

Hello. In collect_indoor3d_data.py the annotations have paths to the sub-objects of the whole scene file. I am confused about where it uses the whole conferenceRoom.txt or office.txt scene files. As I am preparing my own dataset, I am not getting where it uses the whole office.txt or conferenceRoom.txt files. Can anyone please help with my issue?

euzer commented 5 years ago


I ask myself the same question! Each room is composed of a bunch of objects, for example chair, table, beam, ceiling, etc. Each of those objects has a .txt file inside an Annotations directory of the Stanford semantic data. A whole conference_room.txt contains the values of all the object .txt files inside of it, like a concatenation.
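Going the other way (rebuilding a room file from its per-object annotation files) really is just a concatenation; a rough sketch, assuming the usual S3DIS layout with x y z r g b rows and an Annotations folder (paths are placeholders):

```python
import glob
import numpy as np

# Concatenate every per-object annotation file back into one room file.
annotation_files = sorted(glob.glob('conference_room_1/Annotations/*.txt'))
room = np.concatenate([np.loadtxt(f) for f in annotation_files], axis=0)
np.savetxt('conference_room_1.txt', room, fmt='%g')
```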

sivaprasadraju commented 5 years ago

Hello all, I want to train PointNet for classification only. I have point cloud data (x, y, z, i) in .bin format and corresponding labels in .txt format.

Can anyone please tell me how to train PointNet with this data?

euzer commented 5 years ago


Hello. To train PointNet on your own data for semantic segmentation or classification, you need to convert your data into the final .h5 format. Depending on what you need (semantic segmentation or classification), refer to the corresponding README file; there is always a train.py script.
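For the .bin part of the question above, a rough sketch, assuming the files are flat float32 records of x, y, z, intensity (as in KITTI-style LiDAR dumps); adapt the dtype and field count to your own format:

```python
import numpy as np
import h5py

def load_bin(path):
    # Flat float32 buffer -> (M, 4) array of x, y, z, intensity.
    return np.fromfile(path, dtype=np.float32).reshape(-1, 4)

points = load_bin('scan_0001.bin')[:, :3]         # keep xyz only
np.savetxt('scan_0001.txt', points, fmt='%.6f')   # optional .txt export

with h5py.File('scan_data.h5', 'w') as f:
    f.create_dataset('data', data=points[np.newaxis, ...])  # shape (1, M, 3)
```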

JMParker-KR commented 5 years ago

Hello, I'm trying to convert my own .ply files to the .h5 file format by using "write_hdf5.py". According to what I understand, this code expects a .seg file in my folder to convert properly. So my questions are:
1. Is it necessary to have a .seg file?
2. If it is necessary, how can I get a .seg file? (When I converted without a .seg file, my converted data was only 1 KB, which looks abnormal. I got the data from ModelNet40, which is in OFF format, and converted it to .ply format with FreeCAD.)

Thank you.

Maheshiitmandi commented 4 years ago

Hello, I am trying to use this network with a varying number of point cloud samples for LiDAR data of traffic, i.e. m x 3 where m varies between instances. I am training the network, but my prediction gets stuck on a particular class. I am using a batch size of 1 and training one instance at a time. Should I normalise the data before training? If yes, then how?

Grungeby52 commented 4 years ago

Hi, I want to create an HDF5 dataset using the STL data of only one object, like the ModelNet40 dataset... Any idea?

JMParker-KR commented 4 years ago


I did it by changing the file format to .ply and then converting to HDF5 using h5py and plyfile.

Grungeby52 commented 4 years ago

@JMParker-KR,

I'm not sure I understand... There are 3 different things I need to create from the STL file for the HDF5:

  1. Data (that's okay: X, Y, Z)
  2. Label?
  3. Pid? Because your output must be compatible with ModelNet40.
JMParker-KR commented 4 years ago


As I understand it, you have to give a label number; for example, if you are trying to convert an airplane the label should be 0 (zero), and if it is a bottle it should be 5. As far as I know, the pid is for segmentation.

Since I used the dataset only for classification, I did not use the pid part.

Grungeby52 commented 4 years ago

Thanks, I have now added the STL points to the HDF5. What remains are the pid values of these points. How can I do that?

Dawn-LLL commented 4 years ago

Hello, I want to learn how to use my own data in PointNet. If you would like to help me, my QQ is 1922797937. Thanks very much.

prajaktasurvase6 commented 4 years ago

Hello all, I want to train PointNet for classification only. I have point cloud data (x, y, z, i) in .bin format and corresponding labels in .txt format. Can anyone please tell me how to train PointNet with this data?

Hello. To train PointNet on your own data for semantic segmentation or classification, you need to convert your data into the final .h5 format. Depending on what you need (semantic segmentation or classification), refer to the corresponding README file; there is always a train.py script.

How can I convert my .bin data into .txt and .h5 format for further classification and segmentation? Thank you.

Nhunguts commented 3 years ago

Hi, I think you could refer to the issue I previously opened: #94 I simply used the plyfile tool to fetch the points from .ply files as numpy arrays and then wrote these arrays into an HDF5 file. You could also refer to my code: https://github.com/IsaacGuan/PointNet-Plane-Detection/blob/master/data/write_hdf5.py

Hi @IsaacGuan, in your write_hdf5.py file it seems that you only collect "data" and "pid" for the h5 file, so the h5 file doesn't contain "label", does it?

ema2161 commented 2 years ago

Hi, I have some ASCII or .ply format files that just contain 3D point clouds, so each file has 3 columns of coordinates without any labels. I want to complete these point clouds with a deep learning network; how can I convert them into .h5 files? Much appreciated!

sid-fitmatch commented 2 years ago

Hi! I am also having this issue. I have my data in h5 format, but I am not sure how to connect labels to it. Did anyone solve this?

sid-fitmatch commented 2 years ago

What is the command to train for classification on an hdf5 file with a corresponding labels txt?

pournami123 commented 2 years ago

The code is there in the repository to connect the labels with h5 data


mmazhi commented 2 years ago


@Yaara1 Hello! I would like to ask you a question about building a classification dataset. What are the format requirements for a classification dataset? If possible, could you describe the structure of your datasets?