I've identified two fairly serious bugs with garbage collection as it relates to u8vectors. I expect I will have fixed these in my fork some time this week, but the fork has drifted far enough from this repo that PRing a fix is probably more work than I want to do.
First issue:
In the code path where no vectors have been allocated yet, `compact()` leaves `free_vec_pointer` in an invalid state:
https://github.com/stamourv/picobit/blob/ece8fde5f0a395a3f968de72ad3a57ff0c848229/vm/core/gc.c#L283-L332
This follows from `init_ram_heap()` setting `free_vec_pointer = VEC_TO_RAM_OBJ(MIN_VEC_ENCODING)` (aka 8192). When no vectors are allocated, the while loop never executes, so `prev` is never changed from `0` (an invalid encoding) and `free_vec_pointer` is set to `0`, which causes underflows in subsequent code when `VEC_TO_RAM_OBJ` is used.
Patch: wrap the final update in `if (prev)`.
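A minimal sketch of that guard, as a toy model rather than the actual `gc.c` code. The names `compact_model`, `VEC_SPACE_START`, `live_sizes`, and `guarded` are made up for illustration, and the way `prev` is advanced here is an assumption; only the "leave `free_vec_pointer` alone when `prev` is still 0" part reflects the proposed patch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t obj;

/* Assumed start of vector space, from the "aka 8192" remark above. */
#define VEC_SPACE_START ((obj)8192)

static obj free_vec_pointer = VEC_SPACE_START;

/* Toy stand-in for compact(): walks `n_live` live blocks (sizes in
   arbitrary units) and then updates free_vec_pointer from `prev`.
   `guarded` selects between the current behaviour (0) and the patch (1). */
static void compact_model(const uint16_t *live_sizes, size_t n_live, int guarded)
{
    obj cur = VEC_SPACE_START;
    obj prev = 0; /* 0 means "no block seen"; it is also an invalid encoding */

    for (size_t i = 0; i < n_live; i++) {
        cur = (obj)(cur + live_sizes[i]);
        prev = cur; /* end of the last live block */
    }

    /* The "final update": unguarded, it clobbers free_vec_pointer with 0
       whenever the loop body never ran. */
    if (!guarded || prev) {
        free_vec_pointer = prev;
    }
}
```

With no live blocks, the unguarded variant leaves `free_vec_pointer` at 0, while the guarded one keeps the value set at initialization.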
Second issue:
`mark()` does not seem to handle u8vectors properly. "1 field" objects are treated as though they always store an object reference in their `car`; however, u8vectors store an unencoded length in their `car`, which gets pushed as `visit` and then interpreted as an object reference. (~Corollary: there doesn't appear to be any code path to mark the header of the actual vector storage as marked, btw.~ Edit: they get a permanent mark when vec cells are created, which lasts until their owner is swept.) So, for example, in a program with no globals and a single u8vector, the length field gets pushed as `visit`, and a length of 1280 (0x0500) will end up being a self-reference, ~and a length of 8196 (0x2004) references into the vector's own data bytes, letting you construct arbitrary object references to be traversed by the garbage collector.~ (Edit: can't do this because lengths are limited to 13 bits.)
Patch: this is more involved and I'm still working on it, but u8vectors probably need a special code path separate from 1-field objects.
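Roughly the shape such a special case could take, sketched on a toy heap rather than picobit's real object encoding. Everything here (`struct cell`, `KIND_U8VECTOR`, `NIL`, `mark_model`, `reset_heap`) is hypothetical; the only point is that the marker stops before following a `car` that holds a length.

```c
#include <stdbool.h>
#include <stdint.h>

enum kind { KIND_ONE_FIELD, KIND_U8VECTOR }; /* hypothetical type tags */

struct cell {
    enum kind kind;
    uint16_t car; /* a reference for ONE_FIELD, a raw length for U8VECTOR */
    bool marked;
};

#define N_CELLS 8
#define NIL 0xFFFFu /* hypothetical "no reference" sentinel */

static struct cell heap[N_CELLS];

/* Clear all marks and make every cell a one-field object pointing nowhere. */
static void reset_heap(void)
{
    for (int i = 0; i < N_CELLS; i++) {
        heap[i].kind = KIND_ONE_FIELD;
        heap[i].car = NIL;
        heap[i].marked = false;
    }
}

/* Toy marker: follows `car` chains starting at `root`. When
   `special_case_u8vectors` is false, a u8vector's length is followed as if
   it were a reference, which is the behaviour described above. */
static void mark_model(uint16_t root, bool special_case_u8vectors)
{
    uint16_t visit = root;

    while (visit != NIL && visit < N_CELLS && !heap[visit].marked) {
        struct cell *c = &heap[visit];
        c->marked = true;

        if (special_case_u8vectors && c->kind == KIND_U8VECTOR) {
            break; /* car is a length, not a reference: do not follow it */
        }
        visit = c->car;
    }
}
```

In this model, a u8vector in cell 5 with length 3 causes the naive marker to mark the unrelated cell 3, while the special-cased marker leaves it unmarked.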