Open jcasas00 opened 3 years ago
This is a hard limitation in HCL right now. As a workaround, you can use the set bit/set slice APIs. Examples can be found here. For the usage, please check this PR: #291.
Tried to break up the constant into 32-bit chunks and use the set bit/set slice APIs as follows:
import heterocl as hcl

def testme(A):
    def doit(x):
        x = 0xFA_FF00_FFFF
        v = hcl.scalar(0, "v", dtype=hcl.UInt(64))
        v[0][31:0] = (x >> 0) & 0xFFFF_FFFF  # break up into 32-bit chunks ...
        v[0][63:32] = (x >> 32) & 0xFFFF_FFFF
        return v.v
    return hcl.compute(A.shape, lambda x: doit(x), "doit", dtype=hcl.UInt(64))

A = hcl.placeholder((2,), "A", dtype=hcl.UInt(16))
s = hcl.create_schedule([A], testme)
print(hcl.lower(s))
m = hcl.build(s)
hcl_A = hcl.asarray([0xA0A0, 0xA0], dtype=A.dtype)
hcl_R = hcl.asarray([99, 99], dtype=hcl.UInt(64))
m(hcl_A, hcl_R)
print(f"hcl_R = {[hex(i) for i in hcl_R.asnumpy()]}")
The schedule looks okay:
produce v {
  // attr [0] extern_scope = 0
  for "stage_name"="v" (x, 0, 1) {
    v[x] = (uint64)0
  }
}
v[0] = v[0][31:0].set(-16711681) <-- looks like this still sign-extends the assignment (to all 64-bits despite the slice spec).
v[0] = v[0][63:32].set(250) <-- expected this to clear bit-63 but doesn't
doit[x] = v[0]
But the result has the sign-bit set for some reason:
hcl_R = ['0x800000faff00ffff', '0x800000faff00ffff']    <-- bit 63 is set??
That's because our current implementation takes in integer constants as int32, so the value gets sign-extended. As for the slice API, I know it might be a bit counterintuitive for Python users, but you need to write:
v[0][32:0] = (x >> 0) & 0xFFFF_FFFF
v[0][64:32] = (x >> 32) & 0xFFFF_FFFF
Note the different upper bounds (32 and 64 rather than 31 and 63).
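For anyone else tripping over the bounds, here is a plain-Python model of the convention in the reply above. It is only a sketch and not HeteroCL code: `set_slice` and `MASK64` are hypothetical names, and it assumes the upper bound of a slice is exclusive, which is what the corrected snippet implies.

```python
MASK64 = (1 << 64) - 1  # hypothetical helper constant: all 64 bits set


def set_slice(word, upper, lower, chunk):
    """Write `chunk` into bits [lower, upper) of a 64-bit word.

    The upper bound is exclusive, mirroring v[0][32:0] and v[0][64:32].
    """
    width = upper - lower
    field_mask = ((1 << width) - 1) << lower
    return ((word & ~field_mask) | ((chunk << lower) & field_mask)) & MASK64


x = 0xFA_FF00_FFFF
v = 0
v = set_slice(v, 32, 0, (x >> 0) & 0xFFFF_FFFF)    # models v[0][32:0] = ...
v = set_slice(v, 64, 32, (x >> 32) & 0xFFFF_FFFF)  # models v[0][64:32] = ...
print(hex(v))  # 0xfaff00ffff
```

Under that assumption the two 32-bit chunks tile the whole 64-bit word with no gap at bit 31 or bit 63.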
Hmm -- yes, the use of 1 more upper bit is odd. Is this documented somewhere? Will give this a try.
A perhaps related question -- is this the same issue as with this code:
hcl.asarray([[1, 1085102592571150095], [13, 14106333703424951235]], dtype=hcl.UInt(64))
<tvm.NDArray shape=(2, 2), cpu(0)>
array([[4607182418800017408, 4876868561191968286],
       [4623507967449235456, 4893293453950613624]], dtype=uint64)
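One observation that may help narrow this down: the first column of the printed array matches the IEEE 754 double-precision bit patterns of 1.0 and 13.0. The sketch below checks that with the standard library only. `float64_bits` is a hypothetical helper name, and this is a guess based on the printed numbers, not a confirmed description of what `hcl.asarray` does internally.

```python
import struct


def float64_bits(value):
    """Return the IEEE 754 double bit pattern of `value` as an unsigned 64-bit int."""
    return struct.unpack("<Q", struct.pack("<d", float(value)))[0]


# Inputs taken from the first column of the array in the thread.
print(float64_bits(1))   # 4607182418800017408
print(float64_bits(13))  # 4623507967449235456
```

If that guess is right, the inputs would be going through a float64 conversion before being stored as uint64, which would make this a different problem from the int32 sign extension.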
v[0] = (uint64)18446744073709551615 <--- all 64-bits=1. The 0xFFFF_FFFF in the HCL code seems to be interpreted as -1 (32-bit), then sign-extended to 64-bits
This is indeed very counterintuitive. Let's find a way to fix this issue.
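To make the int32 behavior concrete, here is a small plain-Python sketch of the arithmetic. It is not HeteroCL code: `as_int32` and `sign_extend_to_uint64` are hypothetical helpers that assume a constant is truncated to 32 bits, read as signed, and then widened to 64 bits, as described earlier in this thread.

```python
def as_int32(value):
    """Keep the low 32 bits of `value` and reinterpret them as a signed int32."""
    low = value & 0xFFFF_FFFF
    return low - (1 << 32) if low & 0x8000_0000 else low


def sign_extend_to_uint64(value):
    """Sign-extend the int32 view of `value` to 64 bits; return the unsigned pattern."""
    return as_int32(value) & 0xFFFF_FFFF_FFFF_FFFF


print(as_int32(0xFF00_FFFF))               # -16711681
print(as_int32(0xFA))                      # 250
print(sign_extend_to_uint64(0xFFFF_FFFF))  # 18446744073709551615
```

The first two values match the arguments of the `.set(...)` calls in the lowered IR, and the last matches the all-ones uint64 constant reported for `0xFFFF_FFFF`.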
v = hcl.scalar(0xFF_0000_0000, "v", dtype=hcl.UInt(64))
@seanlatias as a workaround, can we initialize a 64b scalar using a string constant?
Sure. Maybe this is something @zzy82518996 can work on?
I will hack into that to see if I can solve this issue.
Code:
Output:
// attr [_top] storage_scope = "global"
allocate _top[uint64 1]
produce _top {
  // attr [0] extern_scope = 0
  produce doit {
    // attr [0] extern_scope = 0
    for "stage_name"="doit" (x, 0, 2) {
      // attr [v] storage_scope = "global"
      allocate v[uint64 1]
      produce v {
        // attr [0] extern_scope = 0
        for "stage_name"="v" (x, 0, 1) {
          v[x] = (uint64)0    <--- why 0? Expecting 0xFF_0000_0000 (40-bits)
        }
      }
      v[0] = (uint64)18446744073709551615    <--- all 64-bits=1. The 0xFFFF_FFFF in the HCL code seems to be interpreted as -1 (32-bit), then sign-extended to 64-bits
      v[0] = (uint64)15    <--- looks like only taking lower 32-bits (consistent with first case).
      doit[x] = v[0]
    }
  }
}