adrian-thurston / colm

The Colm Programming Language
MIT License

huge array of test case failures on ppc64 and sparc64 #126

Closed adrian-thurston closed 3 years ago

adrian-thurston commented 3 years ago

Believe this to be the cause of adrian-thurston/ragel#61.

Approximately 107 of 170 test cases fail, with one segfault. Taking forloop1 as an example:

lex
        token id / 'a' .. 'z' /
        ignore / '\n' | '\t' | ' ' /
end

def start
        [id*]

parse P: start[stdin]
Start: start = P
for Id: id in Start
        print( ^Id, '\n' )

With input a b c this prints three empty lines. With the tree trim removed, it prints the expected output.

While stepping through the print I noticed that pushing non-pointer types seems to be broken. Pushing:

vm_push_type( enum ReturnType, CollectIgnoreLeft );

Comes back as 0 when the corresponding pop is issued.

399             rt = vm_pop_type(enum ReturnType);
(gdb) print sp
$110 = (tree_t **) 0x10080128
(gdb) n
400             switch ( rt ) {
(gdb) print rt
$111 = 0
(gdb) print sp

Looks like the pop macro is at fault. It takes the value from the stack using its native pointer type, saves it in a local, then casts the local value to the requested type. This does not correspond to the push, where the stack pointer is cast and the value is copied in through a pointer to the requested type.

Changing the pop to cast the stack pointer, read the value through a pointer to the requested type, and return that value clears up all the test case failures.

adrian-thurston commented 3 years ago

Explanation: enums are 4 bytes, while pointers are 8. On these big-endian targets the push writes the enum's four bytes to memory as follows (a3 is the most significant byte and is written to the lowest address, and so on).

a3 a2 a1 a0

During reading we read back:

b7 b6 b5 b4 b3 b2 b1 b0

Then when we cast it we reduce it to

b3 b2 b1 b0

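The original bytes are lost: a3 a2 a1 a0 were read back as b7 b6 b5 b4, which the cast discards. The remaining b3 b2 b1 b0 were never written by the push, so we are actually reading back whatever data was already in that part of the stack slot.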

adrian-thurston commented 3 years ago

Problem shown here:

#include <stdio.h>

struct tree_t
{
        long i;
        void *p;
};

typedef struct tree_t *SW;

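/* Push: cast the stack pointer and store through a pointer to the requested type. */
#define vm_push_type(type, i) \
        ( (*((type*)(--sp)) = (i)) )

/* Pop, broken: load the whole pointer-sized slot, then cast the value. */
#define vm_pop_type1(type) \
        ({ SW r = *sp;  (sp += 1); (type)r; })

/* Pop, fixed: cast the stack pointer and load through a pointer to the requested type. */
#define vm_pop_type2(type) \
        ({ type r = *((type*)sp); (sp += 1); r; })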

enum small_enum
{
        E_A = 1,
        E_B,
        E_C,
        E_D,
};      

char stack[8192];

int main()
{
        struct tree_t **sp = (struct tree_t**)( stack + 8192 - 8 );

        vm_push_type( enum small_enum, E_C );
        vm_push_type( struct tree_t*, 0 );

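        vm_pop_type1( struct tree_t* );
        /* On big-endian 64-bit targets this is expected to yield 0, not E_C (3). */
        enum small_enum e = vm_pop_type1( enum small_enum );

        /* sizeof yields size_t, so print it with %zu rather than %lu. */
        printf("%zu %zu %d\n", sizeof(SW), sizeof(enum small_enum), e);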

        return 0;
}