EB Classification Model

aitsam12 commented 1 year ago

I am trying to train a classification model on my own dataset, which I recorded with a Silky Camera. When I record from the event camera, I only get a .raw file. I convert those .raw files to .h5 format with generate_hdf5.py, then run train_classification.py with the output path and the dataset path. I get the following error:

  File "/home/aitsam/.local/lib/python3.8/site-packages/pytorch_lightning/utilities/data.py", line 114, in has_len_all_ranks
    total_length = training_type.reduce(torch.tensor(len(dataloader)).to(model.device), reduce_op="sum")
  File "/usr/lib/python3/dist-packages/metavision_ml/data/sequential_dataset.py", line 469, in __len__
    return len(self.dataset) // self.dataset.batch_size
ZeroDivisionError: integer division or modulo by zero

Both len(self.dataset) and self.dataset.batch_size appear to be zero, even though batch_size is not zero in the default arguments. Can you tell me how to solve this issue?

How can I record from the Silky Camera in .dat format?
In your EB classification example, it is mentioned that we also need a corresponding .npy file for each .h5 file. How do I get this file, since it is not generated when recording from the camera?
Can you provide the dataset you used to train the rock, paper, scissors example, or at least a guideline on how that dataset was recorded?

lbristiel-psee commented 1 year ago

Hello @aitsam12

Sorry for the delay in answering.

When I record from the event camera, I only get a .raw file.

yes, this is normal. All the data produced by the sensor is in the RAW file. See: https://docs.prophesee.ai/stable/data/file_formats/raw.html

Then I am running the train_classification.py file and providing the output path and dataset path. I am getting following error:

Did you check the doc of the Train Classification sample? There you will see what should be provided as input to the sample:

path to the output folder
path to the training dataset:
- a folder containing 3 sub folders, named train, val, test.
- each subfolder should contain one or multiple h5 files and their corresponding
  _bbox.npy labels. The label is by default set to EventBbox format, except that only the column “ts” and “class_id” are actually used, the rest can be set to a constant dummy value.
- a dictionary file named label_map_dictionary.json, which contains all the classification categories.
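
Before launching the training, the layout listed above can be checked with a short script. This is only a minimal sketch, not part of the SDK: the function name check_classification_dataset is hypothetical, and it assumes that the labels for name.h5 are stored next to it as name_bbox.npy, which should be verified against the sample documentation for your SDK version.

```python
import json
from pathlib import Path


def check_classification_dataset(root):
    """Return a list of human-readable problems found under `root`."""
    root = Path(root)
    problems = []

    # The label dictionary is expected at the top level of the dataset folder.
    label_map = root / "label_map_dictionary.json"
    if not label_map.is_file():
        problems.append(f"missing {label_map.name}")
    else:
        try:
            json.loads(label_map.read_text())
        except json.JSONDecodeError as err:
            problems.append(f"{label_map.name} is not valid JSON: {err}")

    for split in ("train", "val", "test"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing sub folder: {split}")
            continue
        h5_files = sorted(split_dir.glob("*.h5"))
        if not h5_files:
            problems.append(f"no .h5 files in {split}")
        for h5_file in h5_files:
            # Assumption: labels for `name.h5` live in `name_bbox.npy`.
            label_file = h5_file.with_name(h5_file.stem + "_bbox.npy")
            if not label_file.is_file():
                problems.append(f"{split}/{h5_file.name} has no {label_file.name}")
    return problems
```

An empty list means the folder matches the layout described above. A dataset path that points directly at a folder of .h5 files, without the train, val and test sub folders, would be reported here and is one plausible reason for a dataset of length zero.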

How can I record from Silky Camera in .dat format?

You cannot record in .dat. You can record in .raw and then convert to .dat with metavision_file_to_dat. See https://docs.prophesee.ai/stable/metavision_sdk/modules/driver/samples/file_to_dat.html

In your EB classification example, it is mentioned that we also need corresponding .npy file for each .h5 file. How to get this file as it does not generate when we record from camera. Can you provide the dataset which you used to train the rock, paper, scissors example. Maybe the guideline on how that dataset was recorded?

Yes, you have to provide the .npy files that contain the ground truth for the machine learning. We currently don't provide the dataset for rock, paper, scissors, but you can check our Knowledge Center article about our labelling tool, which can give you some insight into how our .npy files are created: https://support.prophesee.ai/portal/en/kb/articles/test-machine-learning-labeling-tool
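
For recordings that contain a single class from start to finish, a label file could also be written directly with NumPy. The following is a rough sketch under stated assumptions: the field names and types in ASSUMED_BBOX_DTYPE are a guess modelled on the description above ("ts" and "class_id" used, the rest constant), the helper make_constant_labels is hypothetical, and the spacing between labels is an arbitrary choice. The authoritative EventBbox dtype should be taken from your SDK version.

```python
import numpy as np

# Hypothetical layout modelled on the description in this thread; the
# authoritative EventBbox dtype should be taken from the installed SDK.
ASSUMED_BBOX_DTYPE = np.dtype([
    ("ts", "<u8"), ("x", "<f4"), ("y", "<f4"), ("w", "<f4"), ("h", "<f4"),
    ("class_id", "<u4"), ("confidence", "<f4"), ("track_id", "<u4"),
])


def make_constant_labels(duration_us, step_us, class_id):
    """Build one label every `step_us` microseconds, all with the same class."""
    timestamps = np.arange(step_us, duration_us + 1, step_us)
    labels = np.zeros(len(timestamps), dtype=ASSUMED_BBOX_DTYPE)
    labels["ts"] = timestamps
    labels["class_id"] = class_id
    # The remaining columns keep a constant dummy value (zero here).
    return labels
```

The resulting array can be stored with np.save next to the matching .h5 file. Whether the training script accepts it depends on the dtype matching what the SDK expects, so comparing it with a file produced by the labelling tool is a sensible first check.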

Hope this helps, Laurent for Prophesee Support.

aitsam12 commented 1 year ago

Thank you for your reply.

I have more questions which are related to dataset preparation for classification model.

csv to raw encoding:

Q1) I have CSV files of events with columns (x, y, p, t). I encoded them to EVT 2.0 raw files. When I visualize such a raw file, the events appear only in the left corner of the screen and the rest of the screen is empty. I think there is an issue with the resolution, because the events in the CSV files were recorded with a DVS 128x128 camera. Is there a solution for this?

Machine learning labeling tool:

I have 30 raw files for each class. I converted all of them to .avi videos. Now I want to label them for ground truth.

Q1) Is there any way to do it automatically for all the files of the same class?

Q2) I started doing the bbox labeling manually, but the object id changes for each file of the same class. E.g. for the first file the object id was [0,0], and for the second file the object id was [1,0] when I drew the box. I tried changing it but couldn't. Can you tell me how to label the files of the same class, since they all refer to the same object?

Q3) After conversion from raw to avi, I got 2 files (txt and npy). I want to know what each column represents in the txt file.

Q4) I am working with a hand gesture dataset, and only one type of event happens in each avi video, say 'hand wave', so every frame shows only a hand wave. What I currently do is draw a bbox where the hand wave happens in the video, give it an object id and a class id, and then keep pressing 'R' until it has gone through all the frames. Is this the correct approach, or is there a better way to do this?

aitsam12 commented 1 year ago

Hi, I am still stuck with these questions.

lbristiel-psee commented 1 year ago

Hello @aitsam12 ,

I think there is an issue with the resolution, because the events in the CSV files were recorded with a DVS 128x128 camera. Is there a solution for this?

In your case, our SDK cannot detect that the resolution is 128x128, so it falls back to a Prophesee device resolution. Check the header of the RAW file you created to see whether you can set the resolution there. See https://docs.prophesee.ai/stable/data/file_formats/raw.html. Note that the header format also depends on the version of the SDK you are using. You also have to check that the code you use to read the RAW file actually takes the resolution/geometry of this RAW file into account.
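
To confirm that the symptom comes from the resolution and not from the encoding, the coordinate range of the CSV events can be compared with the display size. This is a small sketch with hypothetical helper names. It assumes the column order x, y, p, t given in the question, and the 640x480 size in the usage note is only an assumed example of a larger sensor resolution, not something stated in this thread.

```python
import csv
import io


def event_extent(csv_text):
    """Return (max_x + 1, max_y + 1) for rows of x, y, p, t."""
    max_x = max_y = -1
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0].strip().lstrip("-").isdigit():
            continue  # skip blank lines and a possible header row
        max_x = max(max_x, int(row[0]))
        max_y = max(max_y, int(row[1]))
    return max_x + 1, max_y + 1


def covered_fraction(extent, sensor_size):
    """Fraction of the display area that the events can ever touch."""
    return (extent[0] * extent[1]) / (sensor_size[0] * sensor_size[1])
```

If event_extent reports (128, 128) and the viewer assumes 640x480, covered_fraction((128, 128), (640, 480)) is about 5%, which matches events showing up only in one corner of an otherwise empty screen.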

Q1) Is there any way to do it automatically for all the files of the same class?

no

Q2) I started doing the bbox labeling manually, but the object id changes for each file of the same class. E.g. for the first file the object id was [0,0], and for the second file the object id was [1,0] when I drew the box. I tried changing it but couldn't. Can you tell me how to label the files of the same class, since they all refer to the same object?

when you are doing the labeling, for every bbox you can change the object ID or the class ID. This is explained in the help that is shown when you launch the tracking tool:

When a bbox is selected, pressing a number key puts you in the change id mode.
As soon as a number is pressed while the bbox is selected, you may change the object or class id of a bbox.
A small window opens up with the first number you type
As long as you keep pressing numbers, it will write them on this window.
It represents the id that will replace the current one of the bbox
Pressing the C key changes the id you will update: when you enter the mode, it is set to change the object_id.
Pressing the C key will switch between changing the class_id and the object_id
Changing the class id of a bbox will affect every bbox that has the same class id, in the future or in the past
Changing the object id of a bbox will affect every time contiguous bbox that has the same object id, in the future or in the past
You can't give a bbox the same object id as a bbox existing in the current frame. The script will prevent it.
Pressing the Esc key cancel the id modification
Pressing the Backspace key erase the last number entered
Pressing the Enter key validates the id modification

Q3) After conversion from raw to avi, I got 2 files (txt and npy). I want to know what each column represents in the txt file.

The columns of the files are explained on this page: https://docs.prophesee.ai/stable/metavision_sdk/modules/ml/samples/bbox_txt2npy.html

Q4) I am working with a hand gesture dataset, and only one type of event happens in each avi video, say 'hand wave', so every frame shows only a hand wave. What I currently do is draw a bbox where the hand wave happens in the video, give it an object id and a class id, and then keep pressing 'R' until it has gone through all the frames. Is this the correct approach, or is there a better way to do this?

yes, this is the right way.

Hope this helps, Laurent

aitsam12 commented 1 year ago

I think I am making a mistake in labelling the data, but I am not sure (attachment: 84_percent_torchjit_model.zip). I am working on an EB classification model for hand gestures. I collected the data with a CenturyArk SilkyCam (model: EvC3a) and converted the raw files into avi videos for labelling. With the labelling tool, I drew a box over the whole area covered by the hand during its movement, then kept pressing 'R' so that it passed through all the frames.

Then I trained the model on this dataset. So far I got 84% accuracy in one of the checkpoints (attached). However, when I run the classification inference, the camera does not classify anything: it does not even detect the background, and it does not create a detection box if I make a gesture in front of it. (I am using the same distance that I used to create the dataset.)

If I provide an .h5 file from the test data instead, it still does not classify anything.

Q1: Is there a problem with my dataset labelling? If yes, how should I label frames in which the object is continuously moving? Do I have to draw a box exactly where the movement is happening and then drag the box along with the movement?

Q2: For my training data (.h5 files), delta_t was 50000 and label_delta_t was 5000. Could this be the cause of the model's behaviour?

Q3: Do I have to collect more data? Currently I have 20 samples for each class.

aitsam12 commented 1 year ago

I accidentally closed the issue. I still have all the questions I mentioned above. Thanks

prophesee-ai / openeb

EB Classification Model #78
