martinhofigueiredo / VC

Visão Computacional (Computer Vision)

Choose from proposals #1

Closed martinhofigueiredo closed 1 year ago

martinhofigueiredo commented 1 year ago

Project 1 – Estimation of the apparent motion

Visual motion perception from a moving observer is the case most often encountered in real-life situations. It is a complex and challenging problem, but solving it can enable new applications.

1) Implement two traditional optical flow techniques, each with multichannel and multiresolution (with refinement) approaches, namely the local Lucas-Kanade method and the global Horn-Schunck method.

2) Consider the following metrics to assess the quality of your implementation:

3) Discuss the results, taking into consideration the following paper:

Andry et al. (2013), Revisiting Lucas-Kanade and Horn-Schunck, Journal of Computer Engineering and Informatics, Apr. 2013, Vol. 1 Iss. 2, PP. 23-29.

Results must be provided as a table.

4) Consider the image sequences for this project and estimate the optical flow using these two techniques. Produce two videos per image sequence showing the magnitude and orientation of the flow using the color scheme presented in the lectures. Discuss the results obtained.
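
For item 4, one common way to show magnitude and orientation together is an HSV color wheel. The sketch below assumes that convention (hue for orientation, saturation for magnitude), which may differ from the color scheme presented in the lectures; `flow_to_rgb` and `max_magnitude` are hypothetical names chosen for this example.

```python
import colorsys
import math


def flow_to_rgb(u, v, max_magnitude):
    """Map one flow vector (u, v) to an 8-bit RGB triple.

    Assumed convention (not necessarily the one from the lectures):
    hue encodes orientation, saturation encodes magnitude, value is 1.
    """
    magnitude = math.hypot(u, v)
    # atan2 returns an angle in [-pi, pi]; wrap it into [0, 1] for the hue.
    angle = math.atan2(v, u)
    hue = (angle % (2.0 * math.pi)) / (2.0 * math.pi)
    # Clamp so vectors longer than max_magnitude are fully saturated.
    if max_magnitude > 0:
        saturation = min(magnitude / max_magnitude, 1.0)
    else:
        saturation = 0.0
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))


# Zero motion maps to white; motion along +x maps to pure red.
print(flow_to_rgb(0.0, 0.0, 1.0))
print(flow_to_rgb(1.0, 0.0, 1.0))
```

Applying this per pixel to each flow field and writing the frames out gives one video per technique; choosing a single `max_magnitude` for the whole sequence keeps the colors comparable between frames.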

Image sequences for this project:
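
For item 1 above, the core of the local Lucas-Kanade method is a small least-squares solve per window. The sketch below is a minimal single-channel, single-resolution illustration, not the multichannel, multiresolution version the project asks for. The function name `lucas_kanade_window` and the `min_det` threshold are hypothetical choices made for this example.

```python
def lucas_kanade_window(ix, iy, it, min_det=1e-9):
    """Estimate one flow vector (u, v) from the derivatives inside a window.

    ix, iy, it: flat sequences of spatial (x, y) and temporal image
    derivatives, one entry per pixel of the window. Returns None when the
    2x2 system is (near) singular, e.g. in a textureless region.
    """
    # Entries of the 2x2 normal-equation matrix and of the right-hand side.
    sxx = sum(x * x for x in ix)
    sxy = sum(x * y for x, y in zip(ix, iy))
    syy = sum(y * y for y in iy)
    sxt = sum(x * t for x, t in zip(ix, it))
    syt = sum(y * t for y, t in zip(iy, it))

    det = sxx * syy - sxy * sxy
    if abs(det) < min_det:
        return None
    # Solve [sxx sxy; sxy syy] [u v]^T = [-sxt -syt]^T with Cramer's rule.
    u = (-sxt * syy + sxy * syt) / det
    v = (-sxx * syt + sxy * sxt) / det
    return u, v


# Synthetic derivatives that satisfy ix*u + iy*v + it = 0 for a known flow.
true_u, true_v = 0.5, -0.25
ix = [1.0, 0.0, 2.0, 1.0]
iy = [0.0, 1.0, 1.0, 3.0]
it = [-(x * true_u + y * true_v) for x, y in zip(ix, iy)]
print(lucas_kanade_window(ix, iy, it))
```

Horn-Schunck differs in that it adds a global smoothness term and solves for all pixels jointly by iteration, so it has no equivalent per-window closed form.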

Project 2 – Denoising a video sequence

The presence of noise in videos affects subsequent image processing phases, such as three-dimensional reconstruction, registration, classification of objects, motion segmentation and analysis, tracking, and the identification and recognition of humans. Denoising is therefore an extremely important pre-processing phase that improves the perceptual appearance of images; however, a trade-off between noise reduction and data preservation is needed to retain the image characteristics that are relevant for high-level algorithms.

1) Implement the robust bilateral and temporal filter (RBLT) for denoising a video sequence. Spatial and temporal components are incorporated into the filter formulation, which increases the filter's ability to remove strong noise components. Consider the Geman-McClure or the Charbonnier function as the error norm for the M-estimators.

2) Consider the following evaluation metrics to assess the quality of your implementation:

- SSIM
- PSNR

The original image (distortion-free, or reference) must be compared to the distorted image using these two evaluation metrics. The distorted image is obtained by corrupting the original image with a distinct noise configuration (salt-and-pepper and Gaussian noise); the image sample is then filtered by each filter individually. The level of noise to add to each original image is a standard deviation of 20 to 40 for Gaussian noise and 10 to 30% for salt-and-pepper noise.

Results must be provided graphically.

3) Discuss the results, taking into consideration the median and Gaussian filters. You can also consider the following paper:

Andry et al. (2013), Enhancing dynamic videos for surveillance and robotic applications: The robust bilateral and temporal filter, Signal Processing: Image Communication, Elsevier, 2014.

4) Consider the image sequences for this project and estimate the optical flow using these two techniques. Produce two videos per image sequence showing the magnitude and orientation of the flow using the color scheme presented in the lectures. Discuss the results obtained.

Image sequences for this project:

- mlky_6
- 210329_06A_Bali_4k_004
- Saint_Barthelemy_2

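
The evaluation protocol in item 2 (corrupt the reference, then score against it) can be sketched as follows. This is a minimal pure-Python illustration, assuming 8-bit grayscale images stored as nested lists and a Gaussian standard deviation expressed on the 0 to 255 scale; the helper names and the flat grey test image are hypothetical. It covers PSNR only and leaves out the filtering step.

```python
import math
import random


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clip to the 8-bit range [0, 255]."""
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in image]


def add_salt_pepper_noise(image, fraction, rng):
    """Set roughly `fraction` of the pixels to 0 (pepper) or 255 (salt)."""
    return [[rng.choice((0, 255)) if rng.random() < fraction else p
             for p in row]
            for row in image]


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_errors = [(r - t) ** 2
                      for ref_row, test_row in zip(reference, test)
                      for r, t in zip(ref_row, test_row)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


rng = random.Random(0)  # fixed seed so the corruption is reproducible
clean = [[128] * 32 for _ in range(32)]  # hypothetical flat grey test image
for sigma in (20, 30, 40):
    noisy = add_gaussian_noise(clean, sigma, rng)
    print(f"Gaussian sigma={sigma}: PSNR = {psnr(clean, noisy):.2f} dB")
for fraction in (0.10, 0.20, 0.30):
    noisy = add_salt_pepper_noise(clean, fraction, rng)
    print(f"Salt-and-pepper {fraction:.0%}: PSNR = {psnr(clean, noisy):.2f} dB")
```

In the full experiment, each noisy frame would first be filtered (RBLT, median, Gaussian) and the PSNR of the filtered output plotted against the noise level, one curve per filter.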

Project 3 – CAPTCHA decoding

A CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a commonly used feature in web applications to block non-human access. Its purpose is to prevent spam on websites, such as promotion spam, registration spam, and data scraping; bots are less likely to abuse websites that use CAPTCHAs. Many websites use CAPTCHAs to prevent bot raiding, and they work effectively. CAPTCHAs are designed so that humans can complete them while most bots cannot.

1) This project aims to develop a CNN able to decode CAPTCHA images considering 4 and 5 encoders. The CNN model must be designed, implemented, and trained (no fine-tuning approaches should be applied).

2) Consider the following metrics:

a. Train and test accuracy;
b. Confusion matrix;
c. Other evaluation methodologies (e.g., confusion matrix, histograms).

3) Discuss the results of your approach, in particular its limitations.

4) Consider the CAPTCHA dataset provided, which has 4 to 5 digits.

a. The soft dataset is formed by simpler CAPTCHAs. Students must start the project with this dataset.
b. The hard dataset is formed by CAPTCHAs with strange elements added to make identification more difficult.
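
The metrics in item 2 can be computed from the decoded strings alone, independently of how the CNN is built. The sketch below is one possible way to do it, assuming each prediction has the same length as its ground truth; `captcha_metrics` and the sample predictions are hypothetical and made up only to exercise the function.

```python
from collections import Counter


def captcha_metrics(true_labels, predicted_labels):
    """Compare decoded CAPTCHA strings with their ground truth.

    Returns (sequence_accuracy, character_accuracy, confusion), where
    confusion[(true_char, predicted_char)] counts character-level outcomes.
    """
    confusion = Counter()
    exact_matches = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if len(truth) != len(guess):
            raise ValueError(f"length mismatch: {truth!r} vs {guess!r}")
        if truth == guess:
            exact_matches += 1
        # Pair characters position by position.
        confusion.update(zip(truth, guess))
    total_chars = sum(confusion.values())
    correct_chars = sum(n for (t, p), n in confusion.items() if t == p)
    return (exact_matches / len(true_labels),
            correct_chars / total_chars,
            confusion)


# Hypothetical predictions for three CAPTCHAs of 4 and 5 digits.
truth = ["1234", "56789", "0420"]
guess = ["1234", "56189", "0428"]
seq_acc, char_acc, confusion = captcha_metrics(truth, guess)
print(f"whole-CAPTCHA accuracy: {seq_acc:.3f}")
print(f"per-character accuracy: {char_acc:.3f}")
print("confusions:",
      [(pair, n) for pair, n in confusion.most_common() if pair[0] != pair[1]])
```

Reporting both accuracies is useful because a single wrong character fails the whole CAPTCHA, so whole-CAPTCHA accuracy is usually noticeably lower than per-character accuracy.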

Project 4 – Open Project

Students can develop a project in CV that is related to their MSc thesis. Teams should therefore send a project proposal by the 14th of April, 2023, covering the following topics:

- Motivation
- Objectives
- Problem statement (e.g., classification, regression, etc.)
- Dataset

martinhofigueiredo commented 1 year ago

@josepedrocruz @nmcnascimento I need you to confirm here what we are going to do.

martinhofigueiredo commented 1 year ago

2 - Choose project 1

Estimation of the apparent motion