geri-brs opened this issue 5 years ago
Machine learning researchers have used trial and error and experimentation to see what works and what doesn't. Other researchers then built on top of those findings to improve existing neural network architectures.
I highly recommend checking out this article (the Neural Network Zoo) to familiarize yourself with the most common neural networks used today and the corresponding problems they're solving.
There is also a paper by Hrushikesh Mhaskar and Tomaso Poggio, "Deep vs. shallow networks: An approximation theory perspective", which argues why deep networks (i.e., many layers between the input and output layers) are better than having shallow layers only.
Deep neural networks are popular because, in theory, they can learn at different levels of abstraction. For example, layers near the input could learn simple features like lines and curves of different orientations. The middle/hidden layers can use these as inputs to distinguish more complex shapes like face parts (eyes, nose, etc.), and the layers near the output could use that information to recognize specific faces based on facial structure. It is very hard to do transfer learning if you only have a small number of hidden layers (i.e., shallow networks).
You can also reuse existing pretrained layers of your deep network for other purposes (layers that detect different kinds of edges, for example). This is called transfer learning. You can learn more about transfer learning here (Machine Learning Mastery) and here (Dipanjan (DJ) Sarkar).
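To make the idea concrete, here is a minimal sketch of transfer learning using plain Python functions as stand-in layers. The layer math here is made up purely for illustration; in a real setting the frozen layers would be trained convolutional layers from an existing network:

```python
# Sketch: reuse "pretrained" early layers unchanged and fit only a new final
# layer. All numbers below are hypothetical, for illustration only.

def pretrained_edge_layer(image):
    # Frozen layer: weights were learned on the original task and are
    # NOT updated when training for the new task.
    return [px * 0.5 for px in image]

def pretrained_shape_layer(features):
    # Another frozen layer reused from the original network.
    return [f + 0.1 for f in features]

def new_task_head(features, weights):
    # Only this layer's weights would be trained for the new task.
    return sum(f * w for f, w in zip(features, weights))

image = [0.2, 0.4, 0.6]                      # toy "image" as three pixels
features = pretrained_shape_layer(pretrained_edge_layer(image))
score = new_task_head(features, weights=[1.0, -1.0, 0.5])
```

The design point is simply that the early layers' outputs (edges, shapes) are generic enough to feed a different final layer trained on a new problem.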
We use activation functions to introduce non-linearity into our neural network.
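A quick sketch of why the non-linearity matters: without an activation function, stacking linear layers collapses into a single linear layer, so the network can only ever model linear functions. Scalar weights are used here for readability; real layers multiply by weight matrices:

```python
def linear(x, w, b):
    # A "layer" with one scalar weight and one bias, for illustration.
    return w * x + b

def relu(x):
    # ReLU activation: zero for negative inputs, identity otherwise.
    return max(0.0, x)

# Two stacked linear layers with no activation collapse into one:
# w2*(w1*x + b1) + b2 == (w2*w1)*x + (w2*b1 + b2)
x = 3.0
two_linear = linear(linear(x, 2.0, 1.0), 4.0, -5.0)
one_linear = linear(x, 8.0, -1.0)   # combined weights: 4*2=8, 4*1-5=-1
assert two_linear == one_linear

# Inserting ReLU between the layers breaks the collapse for negative
# pre-activations, which is what lets the network model non-linear functions.
with_relu = linear(relu(linear(-1.0, 2.0, 1.0)), 4.0, -5.0)
collapsed = linear(-1.0, 8.0, -1.0)
assert with_relu != collapsed
```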
It depends on what you are trying to achieve, and also on trial and experimentation. All of them have pros and cons. Here's a brief summary of some types of activation functions (Sagar Sharma).
Here is another article that explains which activation functions are "hot right now" (Anish Singh Walia).
Some other articles that discuss activation functions:
One of my favorite activation functions is ELU, which has been experimentally shown to produce better results than ReLU but is more computationally expensive.
SELU is ELU with a little twist. Learn more about it here:
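For reference, both functions are simple to write down. A minimal sketch (the SELU constants are the fixed values from the original SELU paper):

```python
import math

def elu(x, alpha=1.0):
    # ELU: identity for x > 0, alpha*(exp(x)-1) for x <= 0.
    # Smooth everywhere and can output negative values, unlike ReLU.
    return x if x > 0 else alpha * (math.exp(x) - 1)

# SELU is ELU with fixed scale/alpha constants chosen so activations
# self-normalize toward zero mean and unit variance across layers.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def selu(x):
    return SELU_SCALE * elu(x, SELU_ALPHA)
```

The `exp` call on the negative branch is where the extra cost relative to ReLU comes from.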
Supervised learning is when you know, and tell the model, the answer to the question it is trying to learn (e.g., "Is this a hotdog or is it not a hotdog?"). It works as if your machine learning model is a child and you're teaching it to know the difference between a picture of a hotdog and a picture that isn't. You give it pictures of hotdogs (and pictures of non-hotdogs) where each picture is labeled, and the model tries to learn what patterns make something a hotdog.
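A tiny illustration of the supervised setup, using a made-up one-number "feature" per picture instead of real images; the labels are the supervision signal:

```python
# Toy labeled dataset: (feature, label). The feature is a hypothetical
# "hotdog-ness" score per picture; 1 = hotdog, 0 = not hotdog.
data = [(0.9, 1), (0.8, 1), (0.7, 1), (0.2, 0), (0.3, 0), (0.1, 0)]

def fit_threshold(examples):
    # "Learning" here is just the midpoint between the two class means,
    # the simplest possible supervised model. Only possible because the
    # labels tell us which examples belong to which class.
    pos = [x for x, y in examples if y == 1]
    neg = [x for x, y in examples if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

threshold = fit_threshold(data)

def predict(x):
    # Classify a new, unlabeled picture by its feature value.
    return 1 if x > threshold else 0
```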
Unsupervised learning, on the other hand, is when you give the model a bunch of data with no labels and it tries to find underlying patterns in the data. For example, suppose you work at Amazon and the items you ship come in many different shapes and dimensions. Since the items vary in size, it's impractical to make a box with unique dimensions tailored to each item, so you make three types of boxes: small, medium, and large. Given the dimensions of the items being shipped, what are the best dimensions for each box type so as not to be wasteful? (We don't want most of our shipments to sit in a box that is mostly air, or worse, to have items too large to fit in the largest box.) We can use machine learning to find patterns in item dimensions and choose which box sizes are best to produce.
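The box-size example is a classic clustering problem. Here is a minimal 1-D k-means sketch over hypothetical item volumes (real data would come from shipping records); the cluster means become the three box sizes:

```python
# Hypothetical item volumes in litres; values invented for illustration.
volumes = [1.2, 0.8, 1.5, 9.0, 10.5, 8.7, 30.0, 28.5, 33.0, 1.1, 9.8, 31.5]

def kmeans_1d(data, k=3, iters=20):
    # Initialize centroids at evenly spaced positions in the sorted data.
    s = sorted(data)
    centroids = [s[(len(s) * (2 * i + 1)) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        # Assign each item to its nearest centroid...
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        # ...then move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

small, medium, large = kmeans_1d(volumes)
```

Note that no labels were needed: the three groupings emerge from the data itself, which is exactly the unsupervised setting described above.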
See also:
I have watched videos about recurrent neural networks many times. The whole concept is a little hard to understand, but I keep trying :)) Is there any platform, website, or animation where I can see the process at work? For example, seeing step by step how the neural network gets the input, how we multiply by a weight matrix, how we arrive at the hidden state, and so on. I think it would be easier to understand if I could see this process.
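While looking for an animation, one way to see the process is to print each step of a tiny recurrent network yourself. A minimal sketch with scalar weights (a real RNN multiplies by weight matrices, but the update rule has the same shape):

```python
import math

# Scalar stand-ins for the RNN weight matrices, chosen arbitrarily:
w_xh = 0.5   # input -> hidden weight
w_hh = 0.8   # hidden -> hidden (recurrent) weight
b = 0.0      # bias

def rnn_step(x, h):
    # One recurrent update: new hidden state mixes the current input
    # with the previous hidden state, then squashes through tanh.
    return math.tanh(w_xh * x + w_hh * h + b)

h = 0.0  # initial hidden state
for t, x in enumerate([1.0, 0.5, -1.0]):
    h = rnn_step(x, h)
    print(f"t={t}: input={x:+.1f} -> hidden state h={h:+.4f}")
```

Running this makes the key idea visible: the hidden state at each step carries information from all earlier inputs, because it is fed back into the next update.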