Samsung / ONE

On-device Neural Engine

[onert][exporter] "Trainable tensor index is out of range" is thrown if ConstantInsertionPass is applied #13221

Closed mbencer closed 1 month ago

mbencer commented 1 month ago

Steps to reproduce:

Model: mobilenet v2
Train data: mobilenet data

Command:

onert_train customized_mobilenetv2.circle --epoch 1 --loss 1 --loss_reduction_type 1 --learning_rate 0.00001 --batch_size 32 --load_expected:raw cats_and_dogs.output.bin --load_input:raw cats_and_dogs.input.bin --optimizer 1 --trainable_ops_idx 68-69 --export_path customized_mobilenetv2_trained.circle

Current state: The exception "Trainable tensor index is out of range" is thrown because ConstantInsertionPass, which runs during trainable graph lowering, changes the operand values. In my mobilenet v2 case, ConstantInsertionPass is applied to the Pad operator, and a new operand with value 182 is created in the trainable graph. This operand does NOT exist in the original model loaded by CircleExporter.
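The mismatch described above can be illustrated with a toy model. This is only a sketch, not onert code: `OriginModel`, `update_weight`, and `insert_constant` are hypothetical names, and it assumes that the value 182 in the report is the index of the newly created operand and that the original model therefore holds tensors 0 to 181.

```python
class OriginModel:
    """Toy stand-in for the tensor table of the circle model loaded by the exporter."""

    def __init__(self, num_tensors):
        self.tensors = [b""] * num_tensors

    def update_weight(self, index, data):
        # The exporter can only write back to tensors that the original file knows about.
        if index >= len(self.tensors):
            raise IndexError("Trainable tensor index is out of range")
        self.tensors[index] = data


def insert_constant(operands, source_index):
    """Mimic a lowering pass that duplicates a constant operand.

    Appends a copy of the operand and returns the index of the copy.
    """
    operands.append(operands[source_index])
    return len(operands) - 1
```

With 182 operands in both the original model and the trainable graph, one call to `insert_constant` yields index 182 in the graph only, so a later `update_weight(182, ...)` on the original model raises the error.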

Ideas for fix:

  1. Run the same passes on the original model as well, before updateWeight?
  2. Somehow store the changed operands?
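The second idea might look roughly like the sketch below, in which the pass records which original operand each inserted operand was copied from, so that the exporter can translate indices before writing weights back. All names here (`insert_constant_tracked`, `resolve_origin_index`, `origin_of`) are hypothetical and do not correspond to onert's API.

```python
def insert_constant_tracked(operands, source_index, origin_of):
    """Duplicate a constant operand and remember where the copy came from.

    origin_of maps an inserted operand index to the index of the operand
    in the original model that it was (transitively) copied from.
    """
    operands.append(operands[source_index])
    new_index = len(operands) - 1
    # Follow an existing mapping so that a copy of a copy still points at the original.
    origin_of[new_index] = origin_of.get(source_index, source_index)
    return new_index


def resolve_origin_index(index, origin_of, num_origin_tensors):
    """Translate a trainable-graph operand index into an original-model tensor index."""
    origin = origin_of.get(index, index)
    if origin >= num_origin_tensors:
        raise IndexError("Trainable tensor index is out of range")
    return origin
```

This only helps when the inserted operand is a pure copy. If the copy can end up with different data from its source, the exported file needs a new tensor instead.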

//cc @jyoungyun

jyoungyun commented 1 month ago

I fixed this issue like this.

There is a known issue: the generated circle file contains two buffers with the same data because this PR just adds a new buffer without removing the existing one. However, that is an ONERT problem: ONERT should not add a new operand if the data has not changed. Apart from that, when a tensor is added to a trainable graph, it should also be created in the newly generated circle file.
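The behaviour described here, and the duplicate-buffer side effect, can be sketched with a toy buffer table. This is not the code from the PR: `export_tensor`, `tensor_to_buffer`, and the `reuse_identical` flag are hypothetical, and the sketch assumes new tensors are appended in index order.

```python
def export_tensor(buffers, tensor_to_buffer, tensor_index, data, reuse_identical=False):
    """Write tensor data into a toy buffer table and return the buffer index used.

    buffers          -- list of bytes objects, one per buffer in the file
    tensor_to_buffer -- list mapping tensor index to buffer index
    """
    if tensor_index < len(tensor_to_buffer):
        # Known tensor: overwrite its existing buffer in place.
        # (A buffer shared by several tensors would need to be split first;
        # this toy version does not handle that.)
        buffers[tensor_to_buffer[tensor_index]] = data
        return tensor_to_buffer[tensor_index]
    if tensor_index != len(tensor_to_buffer):
        raise IndexError("new tensors must be appended in index order")
    if reuse_identical and data in buffers:
        # Optional deduplication: point the new tensor at an identical buffer.
        buffer_index = buffers.index(data)
    else:
        # Default: always add a new buffer, even if the data is a duplicate.
        buffers.append(data)
        buffer_index = len(buffers) - 1
    tensor_to_buffer.append(buffer_index)
    return buffer_index
```

With the default setting, exporting an inserted copy of an unchanged constant leaves two buffers with identical contents, which matches the known issue. With `reuse_identical=True`, the new tensor shares the existing buffer instead.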