FLUID models roadmap - Githubissues

mrysztow commented 6 years ago

Do you have a roadmap of introducing models to FLUID? We would like to align our plan of providing MKL-DNN OP kernels with model roadmap.

luotao1 commented 6 years ago

In #7561, we discussed PaddlePaddle's 10 aspects for 2018. For FLUID models, there is NLP, Speech, and Image support:

NLP support: Improve the CPU performance of some NLP ops (mainly RNN/LSTM/GRU) for specific workloads.
Speech support: Improve the CPU performance of some Speech ops (mainly RNN/CNN) for specific workloads.
Image support: Improve the CPU performance of some Image ops (mainly CNN and Detection) for specific workloads.

lcy-seso commented 6 years ago

Here is some brief information on NLP support. We plan to focus first on some state-of-the-art models for the neural machine translation (NMT) task in the NLP field.

Generally, for most NLP tasks, RNN modules, including but not limited to LSTM (LSTM operator, LSTM unit operator), GRU (GRU operator, GRU unit operator), and Simple RNN, are among the most important computation units. These operators are expected to be highly optimized.

For NMT task, three models are expected to be highly optimized:
- RNN search (RNN encoder-decoder with attention): Fluid's implementation
- ConvS2S: Fluid's implementation is WIP.
- Transformer: Fluid's implementation
About the time schedule:
- The first milestone for the NMT task is the end of March. By then, the implementations of the three models will be finished, their learning performance verified, and a basic profiling report provided. You can check this project: https://github.com/PaddlePaddle/Paddle/projects/37
Important operators expected to be highly optimized:
- matmul operator
- layer normalization
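
For readers unfamiliar with the two operators named above, here is a minimal reference sketch in plain Python. It is only an illustration of what the operators compute, not Fluid's implementation or an optimized kernel; the function names `layer_norm` and `matmul`, the `eps` default, and the list-based data layout are all assumptions made for this example.

```python
import math


def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize one feature vector to zero mean and (near) unit variance,
    then apply an optional per-element scale (gamma) and shift (beta)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    # Hypothetical defaults: identity scale and zero shift.
    gamma = gamma if gamma is not None else [1.0] * n
    beta = beta if beta is not None else [0.0] * n
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gamma, beta)]


def matmul(a, b):
    """Naive matrix product of two row-major nested lists."""
    cols_b = list(zip(*b))  # columns of b
    return [[sum(p * q for p, q in zip(row, col)) for col in cols_b] for row in a]
```

An optimized kernel would fuse the mean and variance passes and use blocked matrix multiplication; the sketch only pins down the arithmetic that such a kernel has to reproduce.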

mrysztow commented 6 years ago

Thank you for pointing out the particular NMT models. Which topologies are the most important for image recognition and speech? Are they ResNet-50 and DS2?

luotao1 commented 6 years ago

@qingqing01 @kuke Can you help to answer it? Thanks very much!

qingqing01 commented 6 years ago

For computer vision, we are currently working on the following:

Image Classification: currently mainly the SE-ResNeXt model. The main related operators are:
- Conv2D, Pool2D, GlobalPooling, BatchNorm, Fc, ElementwiseMul (with broadcast support), SoftMax, CrossEntropy, Relu
Object Detection: currently mainly the MobileNet-SSD model. The main related operators are:
- Conv2D, DepthwiseConv, Pool2D, BatchNorm, PriorBox, BipartiteMatch, TargetAssign, IouSimilarity, BoxCoder, DetectionMap, MineHardExamples
OCR Recognition: currently mainly the CRNN CTC model. The main related operators are:
- Conv2D, Pool2D, BatchNorm, GRU, Fc, WarpCTC, CTC greedy decoder, EditDistance
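
One way to read the three operator lists when planning kernel work is to count how many models need each operator. The sketch below does that in plain Python. The operator names are copied from the lists above; the `MODEL_OPS` dictionary and the helper names `rank_operators` and `shared_operators` are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Operator lists as given in this comment (names copied verbatim).
MODEL_OPS = {
    "SE-ResNeXt": [
        "Conv2D", "Pool2D", "GlobalPooling", "BatchNorm", "Fc",
        "ElementwiseMul", "SoftMax", "CrossEntropy", "Relu",
    ],
    "MobileNet-SSD": [
        "Conv2D", "DepthwiseConv", "Pool2D", "BatchNorm", "PriorBox",
        "BipartiteMatch", "TargetAssign", "IouSimilarity", "BoxCoder",
        "DetectionMap", "MineHardExamples",
    ],
    "CRNN CTC": [
        "Conv2D", "Pool2D", "BatchNorm", "GRU", "Fc", "WarpCTC",
        "CTC greedy decoder", "EditDistance",
    ],
}


def rank_operators(model_ops):
    """Return (operator, model_count) pairs, most widely used first."""
    counts = Counter(op for ops in model_ops.values() for op in set(ops))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def shared_operators(model_ops):
    """Return the operators that every listed model uses, sorted by name."""
    op_sets = [set(ops) for ops in model_ops.values()]
    return sorted(set.intersection(*op_sets))


if __name__ == "__main__":
    print("Used by all models:", shared_operators(MODEL_OPS))
    for op, count in rank_operators(MODEL_OPS):
        print(f"{op}: {count}")
```

Counting by model is only a rough proxy for priority. It ignores how much runtime each operator accounts for, which a profiling report would show.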

mrysztow commented 6 years ago

@qingqing01 thank you for the list. Is SE-ResNeXt going to replace the classic ResNet50 already implemented for Fluid (https://github.com/dzhwinter/benchmark/blob/master/fluid/resnet50.py)?

qingqing01 commented 6 years ago

@mrysztow Both networks are classic, and both are needed. They share the same basic operators.

kuke commented 6 years ago

For speech applications, we are now developing a recognition system, DeepASR. The two important operators it uses are Conv1D and LSTMP.

In Q2, we plan to implement a wake-up system whose main structure is also CNNs+RNNs.

mrysztow commented 6 years ago

@luotao1 is https://github.com/PaddlePaddle/Paddle/blob/develop/benchmark/fluid/machine_translation.py an implementation of Conv seq2seq, mentioned earlier in this thread (https://github.com/PaddlePaddle/Paddle/issues/8450#issuecomment-368200533)? Or is it another seq2seq model?

luotao1 commented 6 years ago

@mrysztow https://github.com/PaddlePaddle/Paddle/blob/develop/benchmark/fluid/machine_translation.py is a seq2seq model. The conv seq2seq model is in https://github.com/PaddlePaddle/models/pull/686.

mrysztow commented 6 years ago

Thank you

luotao1 commented 6 years ago

| Tier 1 | Tier 2 | Tier 3 |
| --- | --- | --- |
| ResNet50 | MobileNet-SSD | Conv seq2seq(PR) |
| Transformer | RNN Search(PR) | CRNN CTC |
| SE-ResNeXt | | |
| DeepASR | | |

luotao1 commented 6 years ago

Feeds support:

The main related operators are:

FC, GRU, LSTM, Self-Attention

mrysztow commented 6 years ago

@luotao1 I would like to confirm that, considering current CPU deployments and the feeds model, the current priority list is the following:

| Tier 1 | Tier 2 | Tier 3 | Tier 4 |
| --- | --- | --- | --- |
| ResNet50 | text_classification | MobileNet-SSD | Conv seq2seq(PR) |
| CRNN-CTC | transformer | SE-ResNeXt | |
| | language_model | DeepASR | |
| | chinese_ner | RNN Search | |

luotao1 commented 6 years ago

@mrysztow The priority list is OK now.
