Samsung / ONE

On-device Neural Engine

[onert][gen_golden] Gather operation asserts for cpu backend #4750

Open dr-venkman opened 4 years ago

dr-venkman commented 4 years ago

The Gather op hits an assertion on the cpu backend when running the natural language answering model located here. The failing assertion is shown below:

Package Filename ../***/lite-model_mobilebert_1_metadata_1/
nnpackage_run: /home/***/ONE/compute/cker/include/cker/operation/Gather.h:66: void nnfw::cker::Gather(const nnfw::cker::GatherParams&, const nnfw::cker::Shape&, const T*, const nnfw::cker::Shape&, const CoordsT*, const nnfw::cker::Shape&, T*) [with T = float; CoordsT = int]: Assertion `coords_data[i] < axis_size' failed.

On debugging, I found the following information:

coords_shape = (1, 384)
input_shape = (2, 512)
axis = 0
axis_size = 2
coords_data[0] = 59
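
The failing condition can be reproduced outside the runtime with a few lines of NumPy. This is only a sketch: it uses the shapes and the single index value reported above, and fills the rest of the index tensor with zeros purely for illustration (the actual remaining values are not known here).

```python
import numpy as np

# Shapes and axis as reported by the debugger above.
input_shape = (2, 512)
coords_shape = (1, 384)
axis = 0

# Only coords_data[0] = 59 is known; the other entries are placeholder zeros.
coords = np.zeros(coords_shape, dtype=np.int32)
coords[0, 0] = 59

# Same condition the cker assertion enforces: every index must be < axis_size.
axis_size = input_shape[axis]
out_of_range = np.argwhere((coords < 0) | (coords >= axis_size))

print("axis_size:", axis_size)
print("out-of-range positions:", out_of_range.tolist())
```

With these values, the only out-of-range entry is the first one, since 59 is not less than the axis size of 2.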

Any ideas on how to troubleshoot this would be appreciated. Thank you in advance.

dr-venkman commented 4 years ago

Could this be caused by the randomly generated int32 values from gen_golden.py? Looking at the script, it generates inputs with np.random.randint(0, 99, this_shape).astype(np.int32), i.e. values in [0, 99). For the particular case above, the indices need to be strictly within the range [0, 2). The surprising part, though, is that the model ran successfully on the acl_cl backend. Does that mean the acl_cl implementation needs safety checks in place? Please correct me if I am wrong. Thanks in advance.

For now, I have temporarily overridden gen_golden.py (not to be committed) to provide values within the proper range. More generally, is it feasible to check value constraints within gen_golden.py? Please let me know.
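
As a rough illustration of what such a constraint could look like, here is a minimal sketch that draws indices bounded by the size of the gathered axis. The helper name `random_gather_indices` and its parameters are hypothetical and not part of gen_golden.py; it also assumes the script can tell which model input feeds the indices of a Gather and what the shape of the gathered tensor is, which would probably require inspecting the model graph.

```python
import numpy as np


def random_gather_indices(indices_shape, input_shape, axis, rng=None):
    """Hypothetical helper: draw int32 indices valid for a Gather along `axis`."""
    rng = np.random.default_rng() if rng is None else rng
    axis_size = input_shape[axis]
    # rng.integers excludes the upper bound, so values fall in [0, axis_size).
    return rng.integers(0, axis_size, size=indices_shape).astype(np.int32)


# Shapes taken from the failing case in this issue.
indices = random_gather_indices((1, 384), (2, 512), axis=0)

# np.take raises IndexError on out-of-range indices by default,
# so this doubles as a quick sanity check.
data = np.zeros((2, 512), dtype=np.float32)
gathered = np.take(data, indices, axis=0)
print(indices.min(), indices.max(), gathered.shape)
```

Inputs that are not used as indices could keep the existing np.random.randint(0, 99, ...) generation unchanged.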