Samsung / ONE

On-device Neural Engine

[circle+] Implement circle+(training) parameter #11692

Open zetwhite opened 11 months ago

zetwhite commented 11 months ago

Let's implement a training param importer that gets hyper-parameters from the circle+ file.

(notes) For now, onert_train gets hyper-parameters from command line options.

zetwhite commented 10 months ago

(Progress Updates)

Now, the draft (https://github.com/Samsung/ONE/pull/11740) works well with this file.

➜ ~/Workspace/ONE/Product/out/bin/onert_train --modelfile mnist_with_meta.circle --load_input:raw pool_mnist_model/mnist_x_train.1000.bin --load_expected:raw pool_mnist_model/mnist_y_train.1000.bin --data_length 1000 -v 100                                

Model Expected Filename pool_mnist_model/mnist_y_train.1000.bin
Model Input Filename pool_mnist_model/mnist_x_train.1000.bin
Model Filename mnist_with_meta.circle
Epoch 1/2 - 50.340ms/step - loss: [0] 0.2309
Epoch 2/2 - 50.130ms/step - loss: [0] 0.2185
===================================
MODEL_LOAD   takes 1.486 ms
PREPARE      takes 8.617 ms
EXECUTE      takes 1508.700 ms
- MEAN     :  1508.700 ms
- MAX      :  1508.700 ms
- MIN      :  1508.700 ms
- GEOMEAN  :  1508.700 ms

For next step,

zetwhite commented 10 months ago

(Additionally) Offline, I asked @jyoungyun for advice on the current implementation details (in draft https://github.com/Samsung/ONE/pull/11740). Since she suggested a much better idea, I plan to rework the current implementation.

zetwhite commented 9 months ago

(Progress Updated)

I cleaned up the code from the previous draft and uploaded it as version 2 (https://github.com/Samsung/ONE/pull/12045). You can check that it works well with this file: mnist.zip

➜ ~/Workspace/ONE/Product/out/bin/onert_train --modelfile mnist.circle+ --load_input:raw pool_mnist_model/mnist_x_train.1000.bin --load_expected:raw pool_mnist_model/mnist_y_train.1000.bin --data_length 1000 -v 100
Model Expected Filename pool_mnist_model/mnist_y_train.1000.bin
Model Input Filename pool_mnist_model/mnist_x_train.1000.bin
Model Filename mnist.circle+
== Training Paramter (from model file) ============
batch size    : 64
learning rate : 0.001
loss func     : 0(mean_squared_error)
optimizer     : 0(sgd)
================================================
Epoch 1/5 - time: 38.330ms/step - loss: [0] 0.2106
Epoch 2/5 - time: 38.244ms/step - loss: [0] 0.1995
Epoch 3/5 - time: 38.249ms/step - loss: [0] 0.1900
Epoch 4/5 - time: 38.191ms/step - loss: [0] 0.1818
Epoch 5/5 - time: 38.311ms/step - loss: [0] 0.1747
===================================
MODEL_LOAD   takes 1.4870 ms
PREPARE      takes 10.8400 ms
EXECUTE      takes 2873.9660 ms
- Epoch 1      takes 574.9520 ms
- Epoch 2      takes 573.6640 ms
- Epoch 3      takes 573.7380 ms
- Epoch 4      takes 572.8630 ms
- Epoch 5      takes 574.6670 ms
===================================
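
The per-step and per-epoch timings in the log above can be cross-checked with a little arithmetic. This is only a sketch: it assumes each epoch runs `data_length // batch_size` steps (that is, a trailing partial batch is dropped). That rule is an assumption inferred from the numbers, not something the log states.

```python
# Values copied from the onert_train output above.
data_length = 1000
batch_size = 64
ms_per_step = [38.330, 38.244, 38.249, 38.191, 38.311]
epoch_ms = [574.952, 573.664, 573.738, 572.863, 574.667]
execute_ms = 2873.966

# Assumption (not stated in the log): a trailing partial batch is dropped.
steps_per_epoch = data_length // batch_size

# Gap between each reported epoch time and steps_per_epoch * ms/step.
residuals_ms = [
    epoch - step * steps_per_epoch
    for step, epoch in zip(ms_per_step, epoch_ms)
]

# Part of EXECUTE that is not covered by the five epoch timings.
overhead_ms = execute_ms - sum(epoch_ms)

print(f"steps per epoch : {steps_per_epoch}")
print(f"max residual    : {max(abs(r) for r in residuals_ms):.3f} ms")
print(f"overhead        : {overhead_ms:.3f} ms")
```

Under that assumption each epoch is 15 steps, the reported epoch times agree with 15 × ms/step to within rounding, and roughly 4 ms of EXECUTE falls outside the five epochs.
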
zetwhite commented 8 months ago

(Progress Updated)

Draft v3 is ready: https://github.com/Samsung/ONE/pull/12152. I wrote up an API usage example here: https://github.com/Samsung/ONE/pull/12152#issuecomment-1833214304.

For now, I'm quite satisfied with the current implementation. Because it touches many parts, it is a bit hard to get a detailed review. I'd better split it into small PRs and get a detailed review on each one.

PR plan

zetwhite commented 8 months ago

Discussion Point

This comment is for logging :smile:

Background

Model parameters (which we usually call TrainInfo) can be given in two ways.

Have to decide

If a model parameter is given in neither way, how should we handle it?

Conclusion

I asked the others (@jyoungyun, @chunseoklee, @hseok-oh, @ragmani) for their opinions offline. In that discussion, we preferred "[1] throw an error" over "[2] provide a default parameter".
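
To make the conclusion concrete, here is a minimal sketch of option [1] in Python rather than the runtime's C++. Everything in it is hypothetical and for illustration only: the `TrainInfo` fields simply mirror the four values printed in the log earlier in this thread, the two sources are modeled as "loaded from the model file" and "supplied by the caller", and letting the caller's value win when both are present is an assumption, not something decided in this issue.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class TrainInfo:
    # Hypothetical container; None means "not given by this source".
    batch_size: Optional[int] = None
    learning_rate: Optional[float] = None
    loss: Optional[str] = None
    optimizer: Optional[str] = None


class MissingTrainParamError(ValueError):
    """Raised when a parameter is given by neither source."""


def resolve_train_info(from_file: TrainInfo, from_caller: TrainInfo) -> TrainInfo:
    resolved = TrainInfo()
    missing = []
    for field in fields(TrainInfo):
        # Assumption: a value supplied by the caller overrides the model file.
        value = getattr(from_caller, field.name)
        if value is None:
            value = getattr(from_file, field.name)
        if value is None:
            missing.append(field.name)
        setattr(resolved, field.name, value)
    if missing:
        # Option [1]: fail loudly instead of silently picking a default.
        raise MissingTrainParamError(
            "training parameter(s) not given: " + ", ".join(missing)
        )
    return resolved


file_info = TrainInfo(batch_size=64, learning_rate=0.001,
                      loss="mean_squared_error", optimizer="sgd")
print(resolve_train_info(file_info, TrainInfo(learning_rate=0.01)))
```

Raising instead of defaulting keeps a forgotten hyper-parameter from quietly changing training results, which is the trade-off behind preferring [1] over [2].
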

zetwhite commented 7 months ago

Additional things to change

A list of previously unknown issues that I learned about through code review. Let's handle them one by one :)