Open rajveerb opened 1 year ago
@kmpark70
Let me know if there's anything that is not included based on the meeting discussion today and if there's anything that you don't understand.
Most importantly, try to answer the questions below so that I can assign the right task to Harshith before our next meeting.
Question:
The type of object detection or tracking being performed seems to influence aspects such as video preprocessing and the choice of model. For instance, for anomaly detection or surveillance detection, the VGG-16 model was predominantly used, and the data was often drawn from YouTube-BB and ImageNet VID. For autonomous driving, however, the approach may differ: YOLO is dominant in this area, and many different datasets are used, such as nuScenes and KITTI. So my question is: what kind of object detection or tracking are we looking for?
Let's focus on object detection in an autonomous driving setting because it is an important problem. Also, focusing on detection tasks using YOLO is great because of its popularity.
So, the model class is YOLO, and the type of object detection to focus on is autonomous driving. Find datasets for that.
Waymo dataset: https://waymo.com/open/
These are among the papers I've read so far that use video data for autonomous driving.
Title: Video Preprocessing using neural networks
Processing steps: 1) Data Preprocessing: Remove noise and unwanted artifacts using techniques such as spatial and temporal filtering, normalization, and resizing. 2) Feature Extraction: Extract features from the video frames using CNNs and RNNs.
Data Used: 1) COCO dataset 2) Kinetics dataset (for video classification)
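The data preprocessing step above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the paper's implementation: the function names (`temporal_smooth`, `resize_nearest`, `normalize`, `preprocess_clip`), the 3-frame window, and the nearest-neighbor resize are all assumptions chosen for clarity. A real pipeline would more likely use NumPy or OpenCV.

```python
from typing import List, Tuple

# A grayscale frame: rows of pixel intensities in [0, 255].
Frame = List[List[float]]


def temporal_smooth(frames: List[Frame], window: int = 3) -> List[Frame]:
    """Temporal filtering: average each pixel over neighbouring frames."""
    half = window // 2
    smoothed = []
    for t in range(len(frames)):
        # Clamp the window at the start and end of the clip.
        group = frames[max(0, t - half):min(len(frames), t + half + 1)]
        height, width = len(frames[t]), len(frames[t][0])
        smoothed.append([
            [sum(f[r][c] for f in group) / len(group) for c in range(width)]
            for r in range(height)
        ])
    return smoothed


def resize_nearest(frame: Frame, new_h: int, new_w: int) -> Frame:
    """Resize with nearest-neighbor sampling (hypothetical choice)."""
    h, w = len(frame), len(frame[0])
    return [
        [frame[r * h // new_h][c * w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def normalize(frame: Frame) -> Frame:
    """Scale pixel intensities from [0, 255] to [0, 1]."""
    return [[p / 255.0 for p in row] for row in frame]


def preprocess_clip(frames: List[Frame], size: Tuple[int, int] = (2, 2)) -> List[Frame]:
    """Apply temporal smoothing, then resize and normalize every frame."""
    return [normalize(resize_nearest(f, *size)) for f in temporal_smooth(frames)]


if __name__ == "__main__":
    # Three constant 4x4 frames: dark, bright, dark.
    clip = [[[v] * 4 for _ in range(4)] for v in (0.0, 255.0, 0.0)]
    for frame in preprocess_clip(clip):
        print(frame)
```

The output frames would then be fed to the feature extractor (the CNN/RNN stage), which is out of scope for this sketch.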
Title: Preprocessing Methods of Lane Detection and Tracking for Autonomous Driving
Preprocessing steps: 1) Extract images from the video 2) Image smoothing: remove noise and other unwanted components of the image 3) Region of interest (ROI) selection, and conversion of the color image to grayscale or a different color format 4) Inverse perspective mapping (IPM): remap each pixel to a different position to obtain a bird's-eye view 5) Segmentation: prepare images for the detection stage
Data Used: The paper discusses the preprocessing steps for lane detection and tracking systems in detail but does not explicitly specify the data used for testing.
Some papers I read, with their sources:
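As a rough illustration of the lane-detection preprocessing steps listed above, here is a minimal pure-Python sketch covering grayscale conversion, smoothing, ROI selection, and a simple threshold segmentation. It is not the paper's code: the 3x3 box blur, the "keep the lower half" ROI, the threshold of 128, and all function names are assumptions. IPM is left out because it needs a camera-specific perspective transform.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[float, float, float]]]
GrayImage = List[List[float]]


def to_grayscale(rgb: RGBImage) -> GrayImage:
    """Convert RGB to grayscale using the ITU-R BT.601 luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def box_blur(gray: GrayImage) -> GrayImage:
    """Image smoothing: 3x3 mean filter, shrinking the window at the borders."""
    h, w = len(gray), len(gray[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            vals = [
                gray[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def crop_roi(gray: GrayImage, top_fraction: float = 0.5) -> GrayImage:
    """ROI selection: drop the top part of the image (assumed to be sky)."""
    return gray[int(len(gray) * top_fraction):]


def threshold(gray: GrayImage, level: float = 128.0) -> List[List[int]]:
    """Segmentation: mark bright pixels (candidate lane markings) as 1."""
    return [[1 if p >= level else 0 for p in row] for row in gray]


def lane_preprocess(rgb: RGBImage) -> List[List[int]]:
    """Chain the steps: grayscale, smooth, select ROI, segment."""
    return threshold(crop_roi(box_blur(to_grayscale(rgb))))


if __name__ == "__main__":
    white = [[(255.0, 255.0, 255.0)] * 4 for _ in range(4)]
    print(lane_preprocess(white))
```

With OpenCV, the same steps would typically map onto `cv2.cvtColor`, `cv2.GaussianBlur`, array slicing, and `cv2.threshold`, with `cv2.warpPerspective` for the IPM step.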
@kmpark70
Can you link the papers that you talked about in detail?
Also, is the above info enough for you to implement a pipeline end to end?
Preprocessing Methods of Lane Detection and Tracking for Autonomous Driving: https://arxiv.org/pdf/2104.04755.pdf
Video Preprocessing neural networks: http://nauchniyimpuls.ru/index.php/ni/article/view/8194/5210
Joint Multiclass Object Detection and Semantic Segmentation for autonomous driving: https://ieeexplore.ieee.org/abstract/document/10098794
I need to start video preprocessing for autonomous driving this week. I believe datasets like COCO, KITTI, or ImageNet VID should be sufficient for testing. I plan to consult reference materials and pick up the necessary knowledge as needed while coding. What do you think?
@kmpark70
For the meeting today, you need to talk concretely about a pipeline.
Can you just describe a pipeline in this issue below?
What is the task? What is the dataset? What are the preprocessing operations? What is the model being trained?
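One way to keep the answers to these four questions in one place is a small spec object like the sketch below. This is only a suggested template: the `PipelineSpec` class and its `missing_fields` helper are hypothetical, and the example values are taken from earlier in this thread (YOLO, autonomous driving, Waymo), with preprocessing deliberately left empty because it has not been decided yet.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PipelineSpec:
    """Answers to: task, dataset, preprocessing operations, model."""
    task: str = ""
    dataset: str = ""
    preprocessing: List[str] = field(default_factory=list)
    model: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of the questions that are still unanswered."""
        return [name for name, value in vars(self).items() if not value]


if __name__ == "__main__":
    spec = PipelineSpec(
        task="object detection for autonomous driving",
        dataset="Waymo Open Dataset",
        model="YOLO",
    )
    # Preprocessing operations have not been chosen yet.
    print(spec.missing_fields())
```

Once every field is filled in, `missing_fields()` returns an empty list, which is a quick check that the pipeline description is complete before the meeting.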
I summarized a paper in Word. Can you open and read it? CS4699 10:18.docx
https://arxiv.org/pdf/2209.13508.pdf : this is the link for accessing the paper
Next Task: Analyze the preprocessing steps in detail and look at other papers (still focusing on the Waymo dataset).
I summarized a paper in Word. 11:1 Task for CS4699.docx
Paper Link: PointAugmenting- Cross-Modal Augmentation for 3D Object Detection.pdf
Github Link: https://github.com/VISION-SJTU/PointAugmenting
The goal is to figure out:
What is not our goal?
Lastly, change the existing pipeline code to create an object detection/tracking ML training pipeline.