Closed by sezan92, 3 months ago
Hello! If the different layers of CNNs are covered in an earlier chapter, I think a short refresher on Convolutional, Pooling, and Fully Connected layers would be a great start to this chapter before the deep dive into the various CNN architectures.
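Such a refresher could even end in a tiny runnable example. Here is a toy NumPy forward pass through the three layer types (the shapes and helper names are my own illustration, not anything from the chapter):

```python
import numpy as np

def conv2d(x, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one kernel."""
    h, w = x.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(x, size=2):
    """Non-overlapping max pooling with a square window."""
    h, w = x.shape
    return x[:h - h % size, :w - w % size] \
        .reshape(h // size, size, w // size, size).max(axis=(1, 3))

def fully_connected(x, weights, bias):
    """Flatten the feature map and apply one dense layer."""
    return weights @ x.ravel() + bias

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))                    # toy 8x8 "image"
features = conv2d(image, rng.standard_normal((3, 3)))  # valid conv -> 6x6
pooled = max_pool2d(features)                          # 2x2 pooling -> 3x3
logits = fully_connected(pooled, rng.standard_normal((10, 9)), np.zeros(10))
print(features.shape, pooled.shape, logits.shape)      # (6, 6) (3, 3) (10,)
```

A snippet like this keeps the refresher short while still letting readers trace how the spatial dimensions shrink layer by layer.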
Hey @ATaylorAerospace ,
> If the different layers of CNN's are covered in an earlier chapter, I think a short refresher review of Convolutional, Pooling and Fully connected layers might be a great start to this chapter before the deep dive into the various CNN architectures.
My team is handling the general architecture chapter and will take care of those details. We have not finalized the content yet; once we do, we will open the issue for comments.
This chapter specifically, by @sezan92 and team, deals only with common pre-trained models.
Thanks
Thank you for the outline @sezan92 :slightly_smiling_face:
> Architectures to be added
>
> - Lenet (first CNN)
> - Alexnet (first one to use GPU)
> - Vgg16/19 (first deep CNN)
> - Resnet (residual net helped)
> - Google (inception model)
> - Mobilenet (first optimized for mobile devices)
I like the historical approach, starting from LeNet and then going further. But for my taste it still feels a bit "too old". I like reflecting on where we come from, but I think people might also want to know what "modern" CNNs are out there, because it is not all about ViTs these days. For this I think you should also add ConvNeXt (https://github.com/facebookresearch/ConvNeXt), as it is a good representative of current SOTA CNNs.
I think the chapter should be diagram-heavy and if possible some implementations of the architectures.
Makes sense for the architectures. Even nicer would be animations, but I don't know how well those work within .mdx files.
@johko thank you for your reply, I am adding ConvNeXt then. I thought SOTA architectures would be added in the last chapter (modern architectures).
I have a question: how do you think we should proceed with the hands-on approach?
+1 for the diagrams. I would suggest making these lessons more practical and keeping only maybe the top 5 most popular CNNs by their usage on HF, with small demos for each, explaining their common use cases. Then they could segue into the transfer learning/fine-tuning lessons.
> I thought SOTA architectures would be added in the last chapter (modern architectures)
The last chapter is more about experimental and really new architectures. Everything currently SOTA can be covered in the other chapters.
And I agree with @socd06 that it makes sense to focus on maybe 5 very popular CNNs.
I mentioned that I liked the historical approach, but just now realized that it might overlap a bit with the general CNN architecture chapter. Which can be good, as long as it is not too much. I think they don't have an outline yet, but maybe @alperenunlu or someone else from the team can already give some info.
@johko I recently added myself to the first chapter's contributors. I will look into this.
> Architectures to be added
>
> - Lenet (first CNN)
> - Alexnet (first one to use GPU)
> - Vgg16/19 (first deep CNN)
> - Resnet (residual net helped)
> - Google (inception model)
> - Mobilenet (first optimized for mobile devices)
By the way, some notes and mistakes about this list:
LeNet-5 would be more suitable (of course with different last layers).
Alexnet is not the first to use a GPU. Its importance is more complex, but it showed that CNNs can be powerful by winning ImageNet.
The name of the architecture is not Google, it's GoogLeNet.
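As an aside on the ResNet bullet: the "residual" milestone is the identity shortcut, which the chapter could show with a diagram or a few-line snippet. Here is a toy NumPy sketch (plain dense layers for brevity, not the actual convolutional ResNet blocks with batch norm):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = ReLU(f(x) + x): the identity shortcut lets signal bypass f."""
    out = relu(w1 @ x)   # first transform
    out = w2 @ out       # second transform (same width as x)
    return relu(out + x) # skip connection added before the final activation

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
w1 = rng.standard_normal((4, 4))
w2 = np.zeros((4, 4))  # with f(x) == 0 the block reduces to ReLU(x)
print(np.allclose(residual_block(x, w1, w2), relu(x)))  # True
```

The zero-weight case makes the key property visible: if the transform contributes nothing, the block still passes the input through, which is what makes very deep stacks trainable.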
> Alexnet is not the first to use GPU. Its importance is more complex, but it showed that CNNs can be powerful by winning ImageNet.
@alperenunlu thank you, I will correct it. But I thought AlexNet was the first to use a GPU: https://sebastianraschka.com/faq/docs/first-cnn-gpu.html
Anyway, that is not the main point.
Hello 👋 I agree with @johko on not including old ones. Except for that it sounds good.
Hello, I just joined the team today. Could we also show code implementations of these models to demonstrate practical use cases?
I have a question for @johko @merveenoyan: what about the implementations? As it is my first time, do we need to implement from scratch (using some framework), or do we show some use cases from model zoos?
I prefer to implement from scratch, but I'd like to hear others' opinions.
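To make the trade-off concrete: a from-scratch section could walk readers through the layer-by-layer shape arithmetic of, say, LeNet-5, while a model-zoo section would mostly be a one-line pretrained load. Here is a minimal sketch of such a shape walk-through (the layer sizes follow the classic LeNet-5 design; the helper function and layer labels are my own illustration, not finalized chapter code):

```python
def conv_out(size, kernel, stride=1):
    """Output side length of a valid convolution."""
    return (size - kernel) // stride + 1

# Trace feature-map shapes through a LeNet-5-style stack on a 32x32 input.
size, channels = 32, 1
layers = []
size = conv_out(size, 5); channels = 6   # C1: 6 filters of 5x5 -> 28x28
layers.append(("C1 conv 5x5", (channels, size, size)))
size //= 2                               # S2: 2x2 subsampling -> 14x14
layers.append(("S2 pool 2x2", (channels, size, size)))
size = conv_out(size, 5); channels = 16  # C3: 16 filters of 5x5 -> 10x10
layers.append(("C3 conv 5x5", (channels, size, size)))
size //= 2                               # S4: 2x2 subsampling -> 5x5
layers.append(("S4 pool 2x2", (channels, size, size)))

for name, shape in layers:
    print(name, shape)
```

A walk-through like this teaches the architecture itself, whereas a model-zoo demo teaches the practical workflow; the chapter could arguably do a little of both.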
I have some confusion regarding the implementation.
> Hello I just joined the team today, we could also show code implementations of these models to show practical use cases.
@ShamieCC I have pinged you in the Discord server, please check.
I think the best would be:
Each of these architectures is a kind of new milestone for CNNs.
What do you think? @johko @merveenoyan @alperenunlu
Sorry for the late reply @sezan92 , I kinda missed it. I think the proposal is good to start with.
Hey 👋, just wanted to add a follow-up question: is anyone working on explaining MobileNet? A key component of that paper, the MobileNetV2 blocks, is used for an architecture in the common vision transformers section that I am writing.
@Mkrolick, interesting point. We can add MobileNet, but I think after this sprint.
@sezan92 Sorry for not responding back! That sounds great. I can also add it in after the sprint if you'd like.
Hello. This issue is for discussion about the chapter Common Pre-Trained Models
My thoughts
In the following paragraphs, I am adding my thoughts. This will be finalized after discussion.
What do you think should be added?
The chapter assumes the reader already knows the CNN algorithms fairly well. Our job now is to address the major architectures that use CNNs.
Architectures to be added
How would you like to explain them?
Please let me know your thoughts