Deep Learning Practice

Neural-network-based implementations and the corresponding notes.

More "general" machine learning notes are kept in my Machine Learning repository.

If you want to clone this repository, please use the following command:

GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/daviddwlee84/DeepLearningPractice.git

The notes in this repository haven't been updated for a long time. I will update them once I have organized my local notes.

Environment

Using Python 3

Dependencies

tensorflow
- github
- Brief Notes - Placeholder, Graph, Session
- TensorFlow 2.0 Notes
- Model Save and Restore Notes - ckpt, transfer learning
- Data Manipulating Notes - TFRecord, Iterator
- Multi-thread Notes
- High-level API Notes - tf.keras, tf.layer
- simple demos with maybe jupyter notebook?!
keras
- github
- Brief Notes
pytorch
- github
- Brief Notes
- torch friends
- tensorboardX - tensorboard for pytorch (and chainer, mxnet, numpy, ...)
- pytorch-lightning - The lightweight PyTorch wrapper for ML researchers. Scale your models. Write less boilerplate
- tnt - is torchnet for pytorch, supplying you with different metrics (such as accuracy) and abstraction of the train loop
- inferno and torchsample - attempt to model things very similar to Keras and provide some tools for validation
- skorch - is a scikit-learn wrapper for pytorch that lets you use all the tools and metrics from sklearn

Project

PKU Courses and Some side projects

Mostly based on TensorFlow 1.x and Keras.
Starts with the most basic models, then moves on to CV and NLP.

Subject	Technique	Framework	Complexity	Remark
Perceptron Practice	SLP, MLP	Numpy	○○●●●	Truth Table (AND, OR, XOR) and Iris Dataset (simulate Keras API)
Softmax Derivation	FCNN	Numpy	○○○●●	Backpropagation of Softmax with Cross Entropy Loss
MNIST Handwriting Digit	FCNN	Tensorflow (and tf.keras)	○○●●●	Implemented in several different ways
Semeion Handwritten Digit	FCNN	Tensorflow	○○○●●	Built a TensorFlow-like Dataset class
CIFAR-10	FCNN, CNN	Tensorflow	○○●●●	Comparison of FCNN and CNN
Chinese Named Entity Recognizer	RNN, LSTM	Tensorflow	○●●●●	TODO: LSTM testing
Flowers	CNN	Tensorflow	○○●●●	Transfer Learning
Fruits	CNN	Tensorflow (and tf.layer)	○○●●●	Multi-thread training and TFRecord TODO: Try more complex model
Trigonometric Function Prediction	RNN	Tensorflow	○○○○●	Predict sine, cosine using LSTM
Penn TreeBank	RNN, LSTM	Tensorflow	○○●●●	Language corpus preprocessing and training
Chinese Neural Machine Translation	RNN, Attention	Tensorflow	○●●●●	A practice of Seq2Seq and Attention TODO: Multi-graph, Try transformer
Dogs!	CNN	Keras	○○●●●	Using images from ImageNet, Keras Transfer learning and Data augmentation
2048	FCNN with Policy Gradient	Tensorflow	●●●●●	Reinforcement Learning
Text Relation Classification	Multiple Models	Multiple Libraries	●●●●●	SemEval2018 Task 7 Semantic Relation Extraction and Classification in Scientific Papers
Medical Corpus	Human Labor	Naked Eyes	●●●●●	From Chinese word segmentation to POS tagging to NER
Word Sense Induction	Multiple Models	Multiple Libraries	●●●●●	SemEval2013 Task 13 Word Sense Induction for Graded and Non-Graded Senses
Chinese WS/POS/(NER)	RNN, CRF	TensorFlow	●●●●●	The "from scratch" version of the previous project ("Medical Corpus") (paper)
Toxicity Classification	BiLSTM	Keras	●●●●●	Jigsaw Unintended Bias in Toxicity Classification - Detect toxicity across a diverse range of conversations
CWS/NER	RNN, CRF	TensorFlow	●●●●●	The sequence labeling model on the classic Chinese NLP task
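
For the Softmax Derivation entry in the table above, the key result is that the gradient of the cross-entropy loss with respect to the logits reduces to softmax(z) - one_hot(y). The sketch below is not the repository's code. It is a plain-Python illustration (all function names are assumptions) that checks this identity against finite differences.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy loss for one example; `target` is the true class index."""
    return -math.log(softmax(logits)[target])


def grad_logits(logits, target):
    """Analytic gradient w.r.t. the logits: softmax(z) - one_hot(target)."""
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]


def numeric_grad(logits, target, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        down = list(logits)
        up[i] += eps
        down[i] -= eps
        diff = cross_entropy(up, target) - cross_entropy(down, target)
        grads.append(diff / (2 * eps))
    return grads


logits = [2.0, 1.0, -1.0]
analytic = grad_logits(logits, target=0)
numeric = numeric_grad(logits, target=0)
```

The two gradients should agree to within the finite-difference error, and the analytic gradient sums to zero because the softmax probabilities sum to one.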

NLP PyTorch

Mostly based on PyTorch; most of the content is NLP.

Subject	Technique	Framework	Complexity	Remark
Machine Translation	RNN, Transformer	PyTorch	●●●●●	Machine translation model from Chinese to English based on the WMT17 corpus (uses results from CS224n)
Sentence Similarity	RNN	PyTorch	●●●●●	Enhanced-RCNN and other baseline models on several sentence similarity datasets

Other Projects

NCTU DL Course

Subject	Technique	Framework	Complexity	Remark

Deep Learning Categories

TODO: Tasks, Subtasks, Structure, General Architecture, Elements, State-of-the-art model

General Architecture (DNN, CNN, RNNs, Attention, Transformer)

Categorized by Learning (supervised, ...)

Categorized by Tasks (NMT, NER, RE, ...)

Categorized by Structure (Seq2seq, Siamese)

Categorized by Learning Framework (GAN ?!)

State-of-the-art models and papers (BERT, ...)

Technique / Network Structure

Feedforward Neural Network
- Multilayer Perceptron (MLP)
Fully Connected Neural Network (FCNN) - With an overview of the neural network training process, including forward and backward propagation
- Dense Neural Network (DNN)

Image Learning

Convolutional Neural Network (CNN)

Sequence Learning

Basic Block for Sequence Model!

Recurrent Neural Network (RNN) - Basis of Sequence model
Long Short Term Memory (LSTM) - Improvement of "memory" (also briefly introduces other common RNN blocks)
Gated Recurrent Units (GRUs)

Reinforcement Learning (RL)

Q Learning
Policy Gradient Methods (PG)

Uncategorized

Generative Adversarial Network (GAN)
Variational Autoencoder (VAE)
Self-Organizing Map (SOM)

Learning Framework / Model

Object Detection

You Only Look Once (YOLO)

Text and Sequence

Sequence-to-Sequence (seq-to-seq) (Encoder-Decoder) Architecture - Overview of sequence models
- Bidirectional RNN (BRNN) - RNN-Based seq-to-seq
- Convolution-based seq-to-seq
- Attention Model - Transformer-based seq-to-seq
- Transformer - Attention Is All You Need - Transformer-based multi-headed self-attention
Word Piece Model (WPM) aka. SentencePiece

Transfer Learning in NLP

"Pre-training in NLP" ≈ "Embedding"

ELMo
BERT
GPT
- openai/gpt-2
- OpenAI GPT-2: An Almost Too Good Text Generator - YouTube
XLNet

Others

Neural Architecture Search

Ingredient of magic

Layer

BatchNorm
Convolution
Pooling
Fully Connected (Dense)
Dropout
Linear
LSTM
RNN

Generally speaking

Input
Hidden
Output

Activation Function

Sigmoid
Hyperbolic Tangent
Rectified Linear Unit (ReLU)
Leaky ReLU
Softmax
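
As a quick reference for the activation functions listed above, here is a minimal sketch in plain Python. It is not taken from the repository's notes; the function names and the default `slope=0.01` for Leaky ReLU are assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, written so that large |x| does not overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    """Hyperbolic tangent; output lies in (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Rectified Linear Unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Like ReLU, but keeps a small slope for negative inputs."""
    return x if x > 0 else slope * x


def softmax(xs):
    """Turns a list of scores into probabilities that sum to 1."""
    shift = max(xs)  # subtracting the max keeps exp() in a safe range
    exps = [math.exp(v - shift) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
```

Sigmoid, tanh, ReLU and Leaky ReLU act on each element independently, while softmax normalizes over a whole vector, which is why it takes a list here.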

Loss Function

Cross-Entropy
Hinge
Huber
Kullback-Leibler
MAE (L1)
MSE (L2)

Optimizer / Optimization Algorithm

Exponential Moving Average (Exponentially Weighted Moving Average)
Adadelta
Adagrad
Adam
Conjugate Gradients
BFGS
Momentum
Nesterov Momentum
Newton’s Method
RMSProp
Stochastic Gradient Descent (SGD)

Parameter

Learning Rate: Scales how much each weight is corrected on every update.
Epochs: The number of passes over the training data made while updating the weights.
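
To make the two parameters above concrete, the following sketch fits a single weight `w` so that `w * x` approximates `y`, using batch gradient descent on the mean squared error. The data, the function name `train`, and the default values are all assumptions picked for illustration.

```python
def train(xs, ys, learning_rate=0.05, epochs=100):
    """Fit y ~ w * x with batch gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):  # one epoch = one full pass over the data
        # d/dw of (1/n) * sum((w*x - y)^2)
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # the learning rate scales how far w moves along the gradient
        w -= learning_rate * grad
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated from y = 2 * x

w_after_one_epoch = train(xs, ys, epochs=1)
w_converged = train(xs, ys, epochs=100)
```

With this data, one epoch moves `w` from 0 to 1.5, and 100 epochs bring it very close to the true value 2. A much larger learning rate (for example 0.2 here) overshoots on every step and diverges.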

Regularization

Data Augmentation
Dropout
Early Stopping
Ensembling
Injecting Noise
L1 Regularization
L2 Regularization

Common Concept

Big Picture: Machine Learning vs. Deep Learning

Terminology / Tricks

one-hot encoding
ground truth
Data Parallelism
Vanilla - means standard, usual, or unmodified version of something.
- Vanilla gradient descent (aka. Batch gradient descent) - means the basic gradient descent algorithm without any bells or whistles.
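
As an illustration of the one-hot encoding term listed above, here is a minimal sketch in plain Python. The helper name `one_hot` and its signature are assumptions, not an API from any of the frameworks used in this repository.

```python
def one_hot(labels, num_classes):
    """Return one row per label, with a single 1 at the label's index."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
        row = [0] * num_classes
        row[label] = 1
        rows.append(row)
    return rows


encoded = one_hot([0, 2, 1], num_classes=3)
# encoded == [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```

Each class index becomes a vector with exactly one non-zero entry, which is the form a softmax output layer is usually compared against.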

Tricks for language model - a sort of overview

Word Representation
- Embedding
- Train Embedding
CNN for NLP
RNN for NLP
Capsule net with GRU
- Kaggle kernel - Capsule net with GRU
- Kaggle kernel - Capsule net with GRU on Preprocessed data

Applications

CV

Image Classification

NLP

Basis
- Text segmentation
- Part-of-speech tagging (POS tagging)
Speech Recognition
- End-to-End Models:
  - (Traditional --> HMM)
  - CTC
  - RNN Transducer
  - Attention-based Model
- Improved attention
  - Single head attention
  - Multi-headed attention
- Word Pieces
- Sequence-Training
  - Beam-Search Decoding Based EMBR
Named Entity Recognition (NER)
Neural Machine Translation (NMT)
- Encoder LSTM + Decoder LSTM
- Google NMT (GNMT)
Speech Synthesis
- WaveNet: A Generative Model for Raw Audio
- Tacotron: An end-to-end speech synthesis system
Personalized Recommendation
Machine Translation
Sentiment classification
Chatbot
- Sequential Matching Network (SMN)

Book Recommendations

Deep Learning - MIT
Dive into Deep Learning (D2L Book) (d2l.ai) / 動手學深度學習
Speech and Language Processing 2ed.
Deep Learning with Python

Tools

Visualize Drawing Tools

NN-SVG - FCNN, LeNet, AlexNet style
- github
draw.io
Netscope
Graphviz - Graph Visualization Software
- Keras model visualization
- pydot

Latex

HarisIqbal88/PlotNeuralNet

Toy

martisak/dotnets

Resources

Dataset/Corpus

Corpus/NLP Dataset

SemCor
SENSEVAL
SemEval
- wiki
WMT17
- News
- Chinese-English

Anime Dataset

nico-opendata
Danbooru2018 - A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset
MyAnimeList Dataset
- MyAnimeList

Github Repository

Example

hadikazemi/Machine-Learning

Summary

brightmart/text_classification - all kinds of text classification models and more with deep learning

Application

Leon
- leon-ai/leon

Mature Tools

NLP

Chinese
- jieba
English
- spaCy - Industrial-Strength Natural Language Processing in Python
- explosion/spaCy
- gensim
- nltk
- fairseq - Facebook AI Research Sequence-to-Sequence Toolkit

Tutorial

Course

Interactive Learning

MOOC

Stanford - CS231n: Convolutional Neural Networks for Visual Recognition
Stanford - CS224n: Natural Language Processing with Deep Learning
- Winter 2019 - first time using PyTorch
- Videos
- Winter 2017 - using TensorFlow
- Videos
- Hank's blog (github)
- CS224n Chinese camp
MIT Deep Learning
- Github - Tutorials, assignments, and competitions for MIT Deep Learning related courses
PKU - 人工智慧實踐：Tensorflow筆記

Document

DeepNotes
- deepnet - Implementations of CNNs, RNNs and cool new techniques in deep learning from scratch
UFLDL Tutorial
- Starter Code - github
Machine Learning Cheatsheet
深度學習500問
Machine Learning Notebook

Github

Slides

Supervised Deep Learning

Conclusion

NLP

crownpku/Awesome-Chinese-NLP: A curated list of resources for Chinese NLP

Summaries

NLP

Awesome Computer Vision - A curated list of awesome computer vision resources

Article

NLP

初入NLP領域的一些小建議

Lexical Database

Other

Managing Large Files (>100MB) on GitHub

.gitattributes

Git large file storage
Bitbucket tutorial - Git LFS
Configuring Git Large File Storage
Moving a file in your repository to Git Large File Storage
- BFG Repo-Cleaner - brew install bfg
  - github
- Removing sensitive data from a repository - git filter-branch

Time measure

Python decorator to measure the execution time of methods
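
A minimal sketch of such a decorator, using only the standard library. The names `timeit` and `slow_sum` are illustrative and are not taken from the linked article.

```python
import functools
import time


def timeit(func):
    """Print how long each call to `func` takes, then return its result."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # runs even if func raises, so failed calls are timed too
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.6f} s")

    return wrapper


@timeit
def slow_sum(n):
    """Sum the integers from 0 to n - 1."""
    return sum(range(n))


result = slow_sum(100_000)
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations. The same decorator can be applied to methods, since `*args` also carries `self`.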

Export Markdown

Pandoc
- User Manual
toDOCX
- pandoc -o output.docx -f markdown -t docx filename.md
- PDFtoDOCX
toPPT
- Smallpdf PDF to PPT Converter

Machine Learning/Deep Learning Platform

Deprecated notes

h5py - HDF5 for Python: To store model in HDF5 binary data format
pyyaml - PyYAML: YAML framework

Programming Framework

Framework	Organization	Support Language	Remark
TensorFlow	Google	Python, C++, Go, JavaScript, ...
Keras	fchollet	Python	on top of TensorFlow, CNTK, or Theano
PyTorch	Facebook	Python
CNTK	Microsoft	C++
OpenNN	C++
Caffe	BVLC	C++, Python
MXNet	DMLC	Python, C++, R, ...
Torch7	Facebook	Lua
Theano	U. Montreal	Python
Deeplearning4J	DeepLearning4J	Java, Scala
Leaf	AutumnAI	Rust
Lasagne	Lasagne	Python
Neon	NervanaSystems	Python

Pending Project

Subject	Technique	Framework	Complexity	Remark
Online ImageNet Classifier	CNN	Keras	○○●●●	(TODO) Using Keras Applications combine with RESTful API
First TF.js	(TODO) Using TensorFlow.js to load pre-trained model and make prediction on the browser
YOLO	CNN	Tensorflow	(TODO) Real-time Object Detection
Word Similarity	(TODO) Word Similarity Based on Dictionary and Based on Corpus
