Open leandron opened 1 year ago
hi--if no one has taken this yet, can I give this a shot?
Absolutely @p3achyjr. Each operator can be submitted individually, so we have self contained, easier to review PRs.
Let us know if you need any help. Thanks!
@leandron I see that the code has changed a lot since the example in #9165. All the tests are now combined into one function, so do I just need to remove the conditional block and test it? Also, since I'm totally new to this, I find it a bit tough to write a test; can you suggest a simple approach?
@tlopex if you're working on this too, would you like to divide up the operators so we don't do redundant work?
@p3achyjr I'm glad to do that. Which operators are you working on now?
I won't be able to work on this for the next couple weeks, so feel free to take anything :)
Maybe https://github.com/apache/tvm/pull/14667 can serve as an updated example of how to support new operators? Have a look at this one and let me know.
@p3achyjr So am I. I don't think I'll have time to work on this until the first few days of September. If you start working on it, just let me know. Thanks.
@leandron I have seen #14667, which is a good example for us to follow. However, it is not based on the latest version of the code.
I'm back now. I can take a look at FLOOR_DIV.
@p3achyjr Okay, got it. I am now trying to work on SQUARE.
Finally getting around to this--a draft PR is at https://github.com/apache/tvm/pull/15724
Got it. Honestly, I don't yet know how to write a test for a unary elementwise operation... I will try again in the coming days.
@tlopex maybe you can try something like this first? https://github.com/apache/tvm/blob/e3055c19702ff58819fa064565a6e1aa1443818e/tests/python/frontend/tflite/test_forward.py#L2249-L2251
I'm happy to keep working on the elemwise ones :)
@p3achyjr Thanks. You set me a really great example. I'll try it again.
It looks like this may be causing some flaky CI failures. I'm noticing failures in tests/python/frontend/tflite/test_forward.py (link) on unrelated PRs. These occur in the _test_floor_mod function, added as part of https://github.com/apache/tvm/pull/15733.
Confirmed: when running test_all_elemwise with only the _test_forward_elemwise_quantized(_test_floor_mod) line enabled, the test fails about 50% of the time (44 failures out of 100 iterations).
I'm not able to create a revert--has one already been created?
@Lunderberg what error messages do you see when the tests fail? I wonder if the range should be (0,150) since mod can't be negative, and whether that's causing the failures.
The error message is below (full test output here). Unfortunately, this doesn't indicate a clear location where it occurs, only that the final result was mismatched.
AssertionError:
Not equal to tolerance rtol=1e-05, atol=1
Mismatched elements: 1 / 18 (5.56%)
Max absolute difference: 129
Max relative difference: 0.50588235
x: array([[181, 175, 227, 191, 159, 182],
[135, 188, 128, 206, 131, 148],
[234, 173, 165, 200, 165, 233]], dtype=uint8)
y: array([[181, 175, 227, 191, 159, 182],
[135, 188, 255, 206, 131, 148],
[234, 173, 165, 200, 165, 233]], dtype=uint8)
Between the mismatches having suspicious 255 expected values, and this comment, I agree that something is incorrect with the range of inputs/outputs. Testing different permutations (e.g. excluding negatives, excluding zero) in _test_elemwise_qnn_out_range didn't produce any combination that resulted in a valid test, so I suspect it won't be a simple numerical change.
My current guess is that there's something going on with the data generation. Currently, _test_forward_elemwise_quantized generates random uint8 data for the test case, but that can produce zero for the RHS of floormod. That would be undefined behavior, with different results depending on which optimizations are applied.
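A quick NumPy check illustrates the point (a sketch; NumPy happens to define integer mod-by-zero as 0 with a warning, but other kernels are free to return anything, which is exactly the kind of backend divergence seen in the mismatched element above):

```python
import warnings

import numpy as np

lhs = np.array([5, 7, 9], dtype=np.uint8)
rhs = np.array([2, 0, 4], dtype=np.uint8)  # note the zero in the denominator

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence NumPy's divide-by-zero warning
    out = np.remainder(lhs, rhs)

# NumPy returns 0 for the x % 0 lane; TFLite or TVM kernels may not.
print(out)
```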
I see what you're saying--maybe we can add min/max overrides for _test_forward_elemwise_quantized. I'm surprised that div and floor_div aren't failing in this case though, since the rhs can generate 0s :/.
May I ask how you're running these tests multiple times? This seems to only describe how to run them once.
> I see what you're saying--maybe we can add min/max overrides for _test_forward_elemwise_quantized.
That's what I'm thinking as well. It looks like it currently uses the same range for both quantization and for data generation. I think it will need to override the data generation range to exclude zero from the denominator, but to keep zero in the quantization range as zero may occur in the output.
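As a sketch of that idea (the helper name and seeding here are mine, not existing test utilities): keep the quantization range at the full span, but nudge forbidden values out of the generated denominator data.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def gen_uint8_excluding(shape, forbidden=0):
    """Generate uint8 test data, bumping any occurrence of `forbidden`
    up by one so it never appears in the denominator. (Avoid
    forbidden=255, which would wrap around to 0 in uint8 arithmetic.)"""
    data = rng.integers(0, 256, size=shape, dtype=np.uint8)
    data[data == forbidden] += 1
    return data

denominator = gen_uint8_excluding((3, 6))
assert not (denominator == 0).any()
```

The quantization parameters would still be derived from the full (0, 255) range, so zero can legitimately appear in the output; only the generated input is constrained.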
> I'm surprised that div and floor_div aren't failing in this case though, since the rhs can generate 0s :/.
Agreed, as I would expect the same problem to affect any operator with a restricted domain. My guess is that there's some optimization that assumes the inputs are valid (a legal assumption, since the output is typically undefined when the denominator is zero), and that this optimization affects floormod differently from floordiv. It would probably be good to track that optimization down at some point, if it occurs at the TVM level, but I don't think that should delay re-enabling the unit test.
> May I ask how you're running these tests multiple times?
It's a bit of a hacky way to do so. I commented out everything in test_all_elemwise except for the _test_forward_elemwise_quantized(_test_floor_mod) line, then added a parametrized pytest fixture to the file. When running pytest as usual (python3 -m pytest -sv tests/python/frontend/tflite/test_forward.py::test_all_elemwise), it then repeats every test the number of times specified.
import pytest

@pytest.fixture(params=list(range(100)), autouse=True)
def repeat_all_tests(request):
    return request.param
I suppose I could have just made a for loop, but I was lazy and this let me use pytest's pass/fail counter instead of making my own :P.
I ran some tests--it looks like div, floor_div, and floor_mod all fail. Here is an example of an offending input:
Fail.
Input Node: ['inq_1']
Input Data: [array([[ 12, 215, 181, 106, 92, 19],
[ 58, 160, 194, 62, 231, 143],
[ 24, 128, 91, 219, 69, 68]], dtype=uint8)]
TFLite: [[127 129 130 128 125 128]
[127 131 129 127 129 135]
[127 0 125 128 127 126]]
TVM: [[127 129 130 128 125 128]
[127 131 129 127 129 135]
[127 255 125 128 127 126]]
Raw Data: [array([[152, 88, 238, 4, 202, 53],
[ 98, 228, 85, 95, 173, 253],
[247, 65, 180, 60, 78, 172]], dtype=uint8), array([[ 12, 215, 181, 106, 92, 19],
[ 58, 160, 194, 62, 231, 143],
[ 24, 128, 91, 219, 69, 68]], dtype=uint8)]
So at the index where the elements diverge (-5), the input data is 128. Presumably this is the zero point for quantization, so min/max overrides won't work. My question is whether 128 is always the zero point, or whether we determine it some other way. If it is, we can just prevent 128 in the RHS. @Lunderberg @leandron, do either of you know?
@p3achyjr
# with respect to its fp32 input range, defined in fake_quant.
# s = 255/(fmax-fmin); m = -fmin*s (the zero point)
for i in input_arrays:
    try:
        quant_scale = 255 / (input_range[i][1] - input_range[i][0])
    except ZeroDivisionError:
        print("Min and max of the input range for tensor " + i + " can't be equal")
    mean = -input_range[i][0] * quant_scale
    input_stats[i] = (mean, quant_scale)
Here, if the range is symmetrical, the zero point will always be 128.
Thanks @tlopex!
I'm working on a draft PR to regenerate the input values if any of them are at the zero point. Hopefully I can have it up soon.
@p3achyjr Rather than regenerating, would it be easier to alter the generated values? That would avoid potentially regenerating data over and over.
# Assuming we know that the operator is mod/div/floormod/floordiv,
# and the generated denominator is in a np.ndarray named "denominator".
denominator[denominator==0] += 1
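Applied to the quantized inputs discussed above, where real 0 maps to the quantized value 128, the same masking trick might look like this (a sketch; the zero point of 128 assumes the symmetric-range case):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
denominator = rng.integers(0, 256, size=(3, 6), dtype=np.uint8)

zero_point = 128  # assumed: symmetric quantization range, per the code above
denominator[denominator == zero_point] += 1  # 128 -> 129; no wraparound risk

assert not (denominator == zero_point).any()
```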
Thanks for that! Sorry, I'm not a numpy guy.
@p3achyjr May I ask how you ran the tests? When I use a command like python3 -m pytest -sv tests/python/frontend/tflite/test_forward.py::test_all_elemwise, it always passes even when I use the offending input you gave. This has been bothering me for a while.
@tlopex @Lunderberg had a comment above^^
import pytest

@pytest.fixture(params=list(range(100)), autouse=True)
def repeat_all_tests(request):
    return request.param
Put that at the bottom of the test file and then run the docker script as usual.
@p3achyjr Thanks.
I found a couple of PRs by @tlopex, #15992 and #15915, that were suffering from CI flakiness in the past, so I'm retrying them now.
In https://github.com/apache/tvm/issues/9187 we implemented quantised versions of operators in the TFLite frontend. Recently, I noticed a few more operators (of varying priority) that can be taken on as beginner-friendly tasks for someone who's starting out in TVM.
I'm creating this as a tracking issue to mark those operators that need to have quantization support implemented and tested:
- FLOOR_DIV (#15724)
- FLOOR_MOD (#15733)
- DIV (#15768)
- POW (#15798)
- REVERSE_SEQUENCE (#15915)
- SQUARE (#15914)
- ELU (#15821)
- LOCAL_RESPONSE_NORMALIZATION (LRN)
- L2_NORMALIZATION
- MIRROR_PAD (#16243)
- ARG_MIN (#15922)
- NOT_EQUAL (#15769)
- LESS (#15746)
- LESS_EQUAL (#15790)
- GREATER_EQUAL (#15775)

For each operator above, there are actions to be taken:
if self.is_quantized(op): ...
https://github.com/apache/tvm/pull/9165 can be used as an example.
In case you're interested, a good starting point is the TFLite frontend: https://github.com/apache/tvm/blob/bee073b0c8e8625216184a2dbb0204c0a376fc26/python/tvm/relay/frontend/tflite.py#L78-L185
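To illustrate the "remove the conditional block" step, here is a toy sketch of the before/after shape of such a change. None of the class or exception names below are the real TVM API; they only mimic the guard pattern shown above.

```python
class OpNotImplemented(Exception):
    """Stand-in for the error a converter raises on unsupported ops."""

class ToyTFLiteConverter:
    """Toy converter illustrating the guard-removal pattern; not TVM code."""

    def __init__(self, op_is_quantized):
        self._op_is_quantized = op_is_quantized

    def is_quantized(self, op):
        return self._op_is_quantized

    def convert_floor_div(self, op):
        # Before adding quantization support, converters rejected
        # quantized inputs outright:
        #     if self.is_quantized(op):
        #         raise OpNotImplemented("quantized FLOOR_DIV")
        # After: handle the quantized path instead of raising.
        if self.is_quantized(op):
            return "dequantize -> floor_div -> requantize"
        return "floor_div"

print(ToyTFLiteConverter(True).convert_floor_div(op=None))
```

The real converters additionally read the operator's quantization parameters (scale and zero point) from the TFLite model, as the linked PRs show.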