Segfault when executing the testsuite

mingodad commented 3 years ago

When I try to run the testsuite, CocoR segfaults because some fwprintf format strings use %ls, which expects a wchar_t*, while the argument passed is a char*. Here is a hack/patch that allows the testsuite to run:

--------------------------------- src/Tab.cpp ---------------------------------
index 731ec12..3ae0116 100644
@@ -100,7 +100,7 @@ int Tab::Num(Node *p) {

 void Tab::PrintSym(Symbol *sym) {
    wchar_t *paddedName = Name(sym->name);
-   fwprintf(trace, L"%3d %14s %ls", sym->n, paddedName, nTyp[sym->typ]);
+   fwprintf(trace, L"%3d %14s %s", sym->n, paddedName, nTyp[sym->typ]);
    coco_string_delete(paddedName);

    if (sym->attrPos==NULL) fwprintf(trace, L" false "); else fwprintf(trace, L" true  ");
@@ -110,7 +110,7 @@ void Tab::PrintSym(Symbol *sym) {
    } else
        fwprintf(trace, L"            ");

-   fwprintf(trace, L"%5d %ls\n", sym->line, tKind[sym->tokenKind]);
+   fwprintf(trace, L"%5d %s\n", sym->line, tKind[sym->tokenKind]);
 }

 void Tab::PrintSymbolTable() {
@@ -343,7 +343,7 @@ void Tab::PrintNodes() {
    Node *p;
    for (int i=0; i<nodes->Count; i++) {
        p = (Node*)((*nodes)[i]);
-       fwprintf(trace, L"%4d %ls ", p->n, (nTyp[p->typ]));
+       fwprintf(trace, L"%4d %s ", p->n, (nTyp[p->typ]));
        if (p->sym != NULL) {
            wchar_t *paddedName = Name(p->sym->name);
            fwprintf(trace, L"%12s ", paddedName);
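
For anyone wondering why the change helps: in the standard wide printf family, `%ls` consumes a `wchar_t*`, while plain `%s` consumes a `char*` and converts it on output. Below is a minimal sketch of that rule, using `swprintf` instead of `fwprintf` so the result can be inspected. `kindNames` and `formatSymbolLine` are hypothetical stand-ins for illustration, not code from Coco/R. Note that by the same rule the `%14s` used with `paddedName` (a `wchar_t*`) looks like the opposite mismatch, so the sketch uses `%14ls` there. Microsoft's C runtime has historically treated `%s` in wide functions as a wide string, which may be why the original format strings look the way they do.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical stand-in for a narrow-string table such as nTyp or tKind.
static const char *const kindNames[] = { "fixedToken", "classToken" };

// Formats one trace line into a wide string.
// %14ls takes the wchar_t* name; plain %s takes the char* table entry.
std::wstring formatSymbolLine(int n, const wchar_t *name, int kind) {
    assert(kind >= 0 && kind < 2);  // index must be inside kindNames
    wchar_t buf[128];
    int written = std::swprintf(buf, sizeof buf / sizeof buf[0],
                                L"%3d %14ls %s", n, name, kindNames[kind]);
    if (written < 0) return std::wstring();  // formatting failed
    return std::wstring(buf, static_cast<std::size_t>(written));
}
```

Calling `formatSymbolLine(7, L"ident", 0)` yields the number, the name right-aligned in 14 columns, and the narrow kind name, without reading a `char*` as if it were a `wchar_t*`.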

mingodad commented 3 years ago

Also, there are several memory leaks when executing CocoR. See the attached valgrind output, checkall.sh.valgrind.out.txt.zip, which shows them.

billyquith commented 3 years ago

WRT wide chars, here is a version of Coco-R converted to use char*. Are wide chars really necessary? Couldn't UTF-8 be used for localisation?

mingodad commented 3 years ago

Some grammars, like CSharp, allow UTF-8 identifiers. How would you parse them?

billyquith commented 3 years ago

> Some grammars, like CSharp, allow UTF-8 identifiers. How would you parse them?

Yes, the parser would need modifying for that. I have only used Coco-R in multibyte/char* applications with "ASCII".
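
One possible shape for that modification is to have a char*-based scanner decode UTF-8 into code points before classifying characters. This is a rough sketch under that assumption, not anything taken from Coco/R. `decodeOne` is a hypothetical helper, and it does not reject overlong or out-of-range sequences.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decodes the UTF-8 sequence starting at bytes[pos] and advances pos past
// the bytes it consumed. Returns U+FFFD for an invalid lead byte, a missing
// continuation byte, or a truncated sequence.
char32_t decodeOne(const std::string &bytes, std::size_t &pos) {
    assert(pos < bytes.size());  // caller must not read past the end
    unsigned char lead = static_cast<unsigned char>(bytes[pos++]);
    if (lead < 0x80) return lead;  // plain ASCII, one byte

    int extra;      // number of continuation bytes expected
    char32_t cp;    // code point accumulated so far
    if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
    else return 0xFFFD;  // stray continuation byte or invalid lead

    for (int k = 0; k < extra; ++k) {
        if (pos >= bytes.size()) return 0xFFFD;  // truncated input
        unsigned char c = static_cast<unsigned char>(bytes[pos]);
        if ((c & 0xC0) != 0x80) return 0xFFFD;   // not a continuation byte
        cp = (cp << 6) | (c & 0x3F);
        ++pos;
    }
    return cp;
}
```

A scanner built this way could keep `char*` buffers and still test identifier characters by code point rather than by byte.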

Does the parser work for UTF-16 characters that take more than one code unit? If not, it doesn't support UTF-16 properly anyway.
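
To make the concern concrete: code points above U+FFFF need two UTF-16 code units (a surrogate pair), so a scanner that treats each 16-bit unit as one character would see two "characters" there. A small sketch of the standard encoding rule follows. `encodeUtf16` is a hypothetical helper name, not part of Coco/R.

```cpp
#include <cassert>
#include <string>

// Encodes one Unicode code point as UTF-16 code units.
std::u16string encodeUtf16(char32_t cp) {
    // Must be a valid scalar value: in range and not itself a surrogate.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));  // one unit is enough
    } else {
        char32_t v = cp - 0x10000;                 // 20 bits remain
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate
    }
    return out;
}
```

For example, U+0041 encodes to a single unit, while U+1F600 encodes to the pair 0xD83D 0xDE00.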

mingodad commented 3 years ago

I've added simple AST generation and fixed several memory leaks (there are still others). It would be nice if someone could try it and give feedback. It's based on the implementation done here: https://github.com/rochus-keller/EbnfStudio.

Here is the skeleton to try it:

../Coco -frames .. gmpl2sql.atg
g++ -g -Wall -DPARSER_WITH_AST -o gmpl2sql Parser.cpp Scanner.cpp gmpl2sql.cpp

#include "Parser.h"
#include "Scanner.h"

using namespace Gmpl2Sql;

int main (int argc, char *argv[]) {

    if (argc == 2) {
        wchar_t *fileName = coco_string_create(argv[1]);
        Gmpl2Sql::Scanner scanner(fileName);
        Gmpl2Sql::Parser parser(&scanner);
        parser.Parse();
        if (parser.errors.count == 0) {
#ifdef PARSER_WITH_AST
            if (parser.ast_root) {
                parser.ast_root->dump();
                //parser.ast_root->dump2(parser.maxT);
            }
#endif
        }

        coco_string_delete(fileName);
    } else
        wprintf(L"-- No source file specified\n");

    return 0;

}

SSW-CocoR / CocoR-CPP