PiLab-CAU / ImageProcessing-2402

Image processing repo
MIT License

[Lecture2-2][0912] Is a CNN the same as a sparsely connected NN? And how about using a stochastic filter? #2

Closed suwanly closed 2 weeks ago

suwanly commented 2 weeks ago

Dear Prof. Yoo, I am reaching out to you as I have two questions I would like to ask.

Q1: I understand that a CNN is just a neural network with sparsely connected layers. Is that correct?

Q2: CNNs seem to focus too much on local relationships. To address this, how about stochastically selecting pixels for the convolution operation? If nearby pixels are selected with high probability and distant pixels with low probability (e.g., using a Gaussian distribution), relationships between distant pixels could also be captured. I am also curious whether any similar research has been conducted. Thank you.
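A minimal sketch of the sampling step in this idea, assuming a 1-D signal for simplicity: offsets around the current position are drawn from a Gaussian, so nearby positions are picked with high probability and distant ones only occasionally. All names here (`stochastic_taps`, `sigma`, `n_taps`) are hypothetical, not from any existing library.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_taps(x, center, n_taps, sigma):
    """Sample n_taps offsets around `center` from a Gaussian, so nearby
    positions are chosen often and distant positions rarely; return the
    pixel values a filter would then combine."""
    offsets = np.round(rng.normal(0.0, sigma, size=n_taps)).astype(int)
    idx = np.clip(center + offsets, 0, len(x) - 1)  # stay inside the signal
    return x[idx]

x = np.arange(20, dtype=float)
taps = stochastic_taps(x, center=10, n_taps=3, sigma=2.0)
print(taps)  # three values clustered near x[10]
```

A larger `sigma` would let the filter occasionally reach far-away pixels, which is exactly the long-range effect asked about above.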

yjyoo3312 commented 2 weeks ago

@suwanly

Thank you for the nice questions, and for being the first to ask!

A1: Yes. Given a CNN layer F, input x, and output y, we can represent the convolutional operation as y = Wx, where W is a sparse matrix whose nonzero entries are shared copies of the elements of the CNN layer F (for more details, refer to the notations in the paper). Additionally, it’s crucial to note that the CNN applies the same filter across the entire image.
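The equivalence above can be checked numerically. This is a minimal 1-D sketch (the 2-D case works the same way with a larger sparse matrix): every row of W contains the same filter elements, just shifted, which is exactly the weight sharing and sparse connectivity mentioned above. The helper name `conv_as_matrix` is made up for illustration.

```python
import numpy as np

def conv_as_matrix(f, n):
    """Build the sparse banded matrix W so that W @ x equals the 'valid'
    1-D cross-correlation of x with filter f. Every row reuses the same
    filter elements (weight sharing), shifted by one position per row."""
    k = len(f)
    W = np.zeros((n - k + 1, n))
    for i in range(n - k + 1):
        W[i, i:i + k] = f  # same shared weights on every row
    return W

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
f = np.array([1.0, 0.0, -1.0])

W = conv_as_matrix(f, len(x))
y_matrix = W @ x
y_direct = np.correlate(x, f, mode="valid")
print(np.allclose(y_matrix, y_direct))  # True
```

So a CNN layer really is a fully connected layer with most weights forced to zero and the remaining ones tied together.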

A2: Yes, as you suggested, stochastically selecting pixels for the convolution operation would allow the filter to cover more image regions. However, the main issue is that during inference, the results will vary due to the random selection. Therefore, stochastic selection should be limited to the training phase (without incorporating uncertainty aspects). Additionally, optimizing the convolution operation with stochastic selection can be challenging. Instead, randomly varying the dilation rate during training might achieve a similar effect with simpler implementation. This idea seems novel, and it’s worth exploring!
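The random-dilation idea above can be sketched as follows, again in 1-D for brevity. Dilation spaces out the filter taps, enlarging the receptive field without adding parameters; drawing a fresh dilation per training step lets the same small filter see both near and far context. The function name and the training loop are hypothetical, not from an existing framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def dilated_conv1d(x, f, dilation):
    """'Valid' 1-D cross-correlation with a dilated filter: tap j of the
    filter reads x at offset j * dilation, so the receptive field grows
    with the dilation rate while the parameter count stays fixed."""
    k = len(f)
    span = (k - 1) * dilation + 1  # receptive field of the dilated filter
    return np.array([
        sum(f[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ])

x = rng.standard_normal(32)
f = np.array([0.2, 0.5, 0.3])

# Hypothetical training loop: draw a fresh dilation each step; at
# inference one would fix the dilation so outputs are deterministic.
for step in range(3):
    d = int(rng.integers(1, 4))  # dilation sampled from {1, 2, 3}
    y = dilated_conv1d(x, f, d)
    print(step, d, y.shape)
```

With dilation 1 this reduces to an ordinary convolution, which is why inference can safely fall back to a fixed rate.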

Note: The nice thing about CNNs is that they focus on local relationships with shared weight parameters, which gives us a kind of inductive bias useful for interpreting images. I think we will have a chance to discuss this later in the class.

suwanly commented 2 weeks ago

Dear Prof. Yoo, @yjyoo3312

Thank you for the detailed answer. I also see now that my idea in Q2 is far from the original ideas behind CNNs.

In my opinion, "randomly varying the dilation rate during training" would have a similar effect to my initial idea of "stochastically selecting pixels for the convolution operation". (A really good idea! How about using a large filter together with random dilation and weight decay? I think that would come even closer to the idea in Q2.)

Also, I now understand that a CNN gives us a kind of inductive bias, and that this is an important advantage of using CNNs.

Once again, thank you for your sincere answer. Have a nice weekend!

Sincerely, Yoon Suwan

yjyoo3312 commented 2 weeks ago

@suwanly

It's great that you're considering architectural modifications! Using larger-sized filters can increase the receptive field but also presents several challenges. We'll cover this topic in more detail in the next class.

Have a wonderful Sunday!

suwanly commented 2 weeks ago

Dear Prof. Yoo @yjyoo3312 Thanks for the reply! I appreciate you taking the time to go into detail about this.

I'll close this issue! Thank you!