Following the [Deep Learning Papers Reading Roadmap], here is a summary of the first paper!
LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444.
Deep Learning
- multiple processing layers
- learn representations of data with multiple levels of abstraction
--> SotA in speech recognition, visual object recognition, object detection, drug discovery, genomics, etc.
- methods: backpropagation algorithm, deep convolutional nets (CNNs; images/video/speech/audio), recurrent nets (RNNs; text/speech)
Machine Learning
- limit: hard to process raw data directly
--> Representation Learning
: automatically discover the representations needed for detection/classification
* layers of features <-- learned by a general-purpose learning procedure
Supervised Learning
- most common form of ML
- CLASSIFICATION -- "label"
- [input -> output]; weight vector
<- gradient vector of an objective function
-> in practice: *SGD (Stochastic Gradient Descent): AVERAGE the gradients over small sets of examples, repeat until the average objective stops decreasing
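A minimal NumPy sketch of that SGD loop for a linear model is below; the synthetic data, learning rate, and batch size are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (assumed for illustration): y = X @ w_true + noise
X = rng.normal(size=(1000, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=1000)

w = np.zeros(5)      # weight vector to be learned
lr = 0.1             # learning rate
batch_size = 32

for step in range(500):
    # Show a small set of examples and AVERAGE their gradients (the SGD step).
    idx = rng.integers(0, len(X), size=batch_size)
    Xb, yb = X[idx], y[idx]
    err = Xb @ w - yb                  # outputs minus targets on the batch
    grad = Xb.T @ err / batch_size     # average gradient of 0.5 * err**2
    w -= lr * grad                     # adjust weights against the gradient

print("learned w:", np.round(w, 2))
print("true w   :", np.round(w_true, 2))
```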
*Classifier
- ML: linear classifier on raw pixels --- "Selectivity-Invariance Dilemma"
- DL: NON-linear features
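A toy 1-D illustration of that dilemma (all values assumed): a linear classifier whose weights match a pixel template loses the object when it shifts, and then cannot tell it apart from a different object.

```python
import numpy as np

template  = np.zeros(10); template[3:6]  = 1.0   # object the linear unit was tuned to
shifted   = np.zeros(10); shifted[6:9]   = 1.0   # the SAME object, shifted
different = np.zeros(10); different[0:3] = 1.0   # a DIFFERENT object

w = template                                     # linear classifier weights = template
print("score at trained position :", w @ template)    # 3.0 -> detected
print("score, same object shifted:", w @ shifted)     # 0.0 -> not invariant to position
print("score, different object   :", w @ different)   # 0.0 -> no longer selective
```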
Backpropagation
<- *Chain Rule of derivatives
- "poor local MINIMA": not a big deal in practice
**CNN (Convolutional Neural Network) / ConvNet
- 2 key layers: Convolution Layer (feature maps), Pooling Layer
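A rough NumPy sketch of those two layer types, a convolution producing one feature map followed by max pooling; the 8x8 input and the edge-detector kernel are assumptions for illustration.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as in most conv nets)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling."""
    H, W = fmap.shape
    H, W = H - H % size, W - W % size
    return fmap[:H, :W].reshape(H // size, size, W // size, size).max(axis=(1, 3))

image = np.random.default_rng(0).normal(size=(8, 8))
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])                    # vertical-edge detector (assumed)

feature_map = np.maximum(conv2d(image, kernel), 0.0)  # Convolution Layer + ReLU
pooled = max_pool(feature_map)                        # Pooling Layer
print(feature_map.shape, "->", pooled.shape)          # (6, 6) -> (3, 3)
```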
Distributed Representations and Language Processing
- 2 advantages:
1. generalization to new combinations of learned features
2. another exponential advantage from composing layers of representation (depth)
RNN (Recurrent Neural Networks)
- sequential inputs (speech, language)
* "hidden state" keeps information about the past elements of the sequence
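A minimal sketch of the vanilla RNN hidden-state recurrence; the dimensions, weights, and random input sequence are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 5

W_xh = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b_h  = np.zeros(hidden_dim)

sequence = rng.normal(size=(4, input_dim))   # 4 time steps of sequential input
h = np.zeros(hidden_dim)                     # "hidden state": memory of the past

for t, x_t in enumerate(sequence):
    # The same weights are reused at every step; h summarizes all inputs so far.
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
    print(f"step {t}: h = {np.round(h, 2)}")
```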
Future of DL
- Unsupervised Learning!!!!
- Natural Language Understanding <- *RNN