johko / computer-vision-course

This repo is the home base of a community-driven course on Computer Vision with Neural Networks. Feel free to join us on the Hugging Face Discord: hf.co/join/discord
MIT License

General Architecture: Draft Outline #49

Closed: ash-01xor closed this issue 2 months ago

ash-01xor commented 8 months ago

Hey everyone,

This issue is for discussing the General Architecture chapter. Our team plans to keep it as simple as possible while still making it informative.

General Architecture of CNNs:

Open to all suggestions for improving the draft. Team members: @jucamohedano @youssefadr @alperenunlu bcc: @johko @merveenoyan @lunarflu

johko commented 8 months ago

Hey, thanks for the outline, sounds pretty reasonable to me 🙂

For the first part, you might want to get in touch with the people from the "Feature Extraction" section to check what they cover on convolutional filters, so the two chapters don't overlap too much.

You could also think about adding a part on skip connections. Even though they are not exclusive to CNNs, they are still good to know about.
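For readers new to the idea, a skip connection can be sketched in a few lines of plain Python. This is only a minimal illustration under simplifying assumptions: `residual_block`, `zero_layer`, and `scale_layer` are hypothetical names, and a real model would use tensors and learned layers rather than lists and fixed functions.

```python
def residual_block(x, layer):
    """Apply `layer` to x and add the input back (the skip connection)."""
    out = layer(x)
    if len(out) != len(x):
        # The element-wise sum only makes sense when shapes match.
        raise ValueError("skip connection needs matching shapes")
    return [o + xi for o, xi in zip(out, x)]


def zero_layer(x):
    # Hypothetical layer that has learned nothing: it outputs zeros.
    return [0.0 for _ in x]


def scale_layer(x):
    # Hypothetical layer that halves its input.
    return [0.5 * v for v in x]


x = [1.0, -2.0, 3.0]
identity_out = residual_block(x, zero_layer)   # input passes through unchanged
scaled_out = residual_block(x, scale_layer)    # layer output plus the input
```

With `zero_layer` the block reduces to the identity, which is one way to see why skip connections make it easy for a network to "do nothing" in a given block.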

Do you also plan to talk about specific models as examples, or do you want to keep this chapter rather abstract? Both work for me, but it might be good to know for the people working on the next chapter (whose outline is here: https://github.com/johko/computer-vision-course/issues/39)

alperenunlu commented 8 months ago

> You could also think about adding a part on skip connections. Even though they are not exclusive to CNNs, they are still good to know about.

In the "Various types of Convolutions" part, we can explain them alongside other topics such as depthwise and depthwise-separable convolutions.
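As a rough illustration of why depthwise-separable convolutions are worth covering, the weight counts can be compared directly. This is a sketch in plain Python with hypothetical function names; it ignores bias terms and assumes a square `k x k` kernel.

```python
def standard_conv_params(c_in, c_out, k):
    # Every output channel has its own k x k filter over all input channels.
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    # Depthwise step: one k x k filter per input channel.
    depthwise = k * k * c_in
    # Pointwise step: a 1 x 1 convolution that mixes channels.
    pointwise = c_in * c_out
    return depthwise + pointwise


# Example sizes chosen only for illustration.
c_in, c_out, k = 64, 128, 3
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
print(standard, separable)  # 73728 8768
```

For these sizes the separable version uses roughly 8.4 times fewer weights.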



> Do you also plan to talk about specific models as examples, or do you want to keep this chapter rather abstract? Both work for me, but it might be good to know for the people working on the next chapter (whose outline is here: https://github.com/johko/computer-vision-course/issues/39)

It is inevitable to talk about models if we want to explain best practices for CNNs. Instead of analyzing and then implementing the models in depth, we can reference them like:

sezan92 commented 8 months ago

> Implementation of convolution in PyTorch (also JAX)

Is there any reason to use two frameworks? I think it is better to use a single framework across all the chapters, and since PyTorch has more support, I would prefer it. (No problem with JAX.)

lunarflu commented 8 months ago

> Is there any reason to use two frameworks? I think it is better to use a single framework across all the chapters, and since PyTorch has more support, I would prefer it. (No problem with JAX.)

cc @merveenoyan for her thoughts on this 🤗

arkajyotimitra commented 8 months ago

> Is there any reason to use two frameworks? I think it is better to use a single framework across all the chapters, and since PyTorch has more support, I would prefer it. (No problem with JAX.)

Since this is community-driven, people may happily contribute an implementation in a different framework, which could help someone who needs it. It could be added as an extra note, for example in the notebook, or as a toggle (as mentioned on the main page). It is more like a sub-branch of a branch in a big tree.

To start with, PyTorch or TensorFlow seems reasonable; JAX is a bonus. A short note saying the course is open to implementations in other frameworks would let readers know that they can contribute as well.

Quality checks on the implementations can be done by expert reviewers (who know that framework) before merging them. Hope this helps; I would love to hear from others. 🤗

ash-01xor commented 8 months ago

> Implementation of convolution in PyTorch (also JAX)

> Is there any reason to use two frameworks? I think it is better to use a single framework across all the chapters, and since PyTorch has more support, I would prefer it. (No problem with JAX.)

Well, we felt that the research community finds JAX interesting and that it can also provide a considerable speedup. Providing examples of how JAX speeds up mathematical operations might help the reader understand how important these operations are, and why the rate at which they run plays a pivotal role when building big architectures or networks.

sezan92 commented 8 months ago

> Implementation of convolution in PyTorch (also JAX)

> Is there any reason to use two frameworks? I think it is better to use a single framework across all the chapters, and since PyTorch has more support, I would prefer it. (No problem with JAX.)

> Well, we felt that the research community finds JAX interesting and that it can also provide a considerable speedup. Providing examples of how JAX speeds up mathematical operations might help the reader understand how important these operations are, and why the rate at which they run plays a pivotal role when building big architectures or networks.

Since other chapters are being developed and their teams are also deciding on frameworks, I think it is best to discuss with everyone whether to use two frameworks in one chapter.

merveenoyan commented 8 months ago

Hello 👋 It would be nice to make this section very beginner-friendly and intuitive at first. I would talk a bit about why we do convolution/cross-correlation, with filters (like the Sobel filter) as a subsection under it, and later go through why we need ConvNets, given that it's very inefficient to try random filters all the time. Put the implementation later on, after the intuition.
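To make the filter intuition concrete, here is a minimal sketch of 2D cross-correlation with a horizontal Sobel filter, written in plain Python so it runs without any framework. The function names and the toy image are assumptions made for illustration; there is no padding or stride, and this is not meant as the chapter's implementation.

```python
def cross_correlate2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out


def convolve2d(image, kernel):
    """True convolution: cross-correlation with the kernel flipped in both axes."""
    flipped = [list(reversed(r)) for r in reversed(kernel)]
    return cross_correlate2d(image, flipped)


# Sobel filter that responds to left-to-right intensity changes.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# Toy 4 x 6 image: dark left half, bright right half (a vertical edge).
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

edges = cross_correlate2d(image, SOBEL_X)
flipped_edges = convolve2d(image, SOBEL_X)
print(edges)          # [[0, 4, 4, 0], [0, 4, 4, 0]]
print(flipped_edges)  # [[0, -4, -4, 0], [0, -4, -4, 0]]
```

The response is non-zero only where the window straddles the edge. Flipping the kernel just changes the sign here, which is part of why deep learning layers can compute cross-correlation and still be called convolutions: the filter values are learned either way.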

Given that PyTorch is very Pythonic and easier to read, I'd rather have this only in PyTorch and not in JAX (it's very functional and hard to read/debug IMHO, but good for efficiency, which we don't need here). Moreover, to ship fast, we will write everything in PyTorch to standardize the course (and after v1 we'll likely add TF or Keras Core based PT first and then JAX).

As for intuition, I've previously written an illustrative blog post on the intuition behind the convolution operation, filters, and what goes on inside a ConvNet, so something along those lines would be nice for the theoretical part: https://merveenoyan.medium.com/complete-guide-on-deep-learning-architectures-chapter-1-on-convnets-1d3e8086978d?source=friends_link&sk=cd4d3f0139d539c668a631d062d68357