alliedmodders / sourcepawn

A small, statically typed scripting language.

Hex Escape \x consuming up to 3 characters? #963

Closed DosMike closed 6 months ago

DosMike commented 6 months ago

While writing a PrintToChatAll call with an RGB color escape (\x07RRGGBB), I noticed that the color code was broken: the escape printed a single ASCII character followed by RGGBB.

https://github.com/alliedmodders/sourcepawn/blob/d59a51b5741823903ecbe8c014632ee1f8aad65d/compiler/lexer.cpp#L2140

I think this exit condition is off by one: digits is the number of valid characters parsed so far, so the check should be digits >= 2.

Checked it in [compiler explorer](https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename:'1',fontScale:14,fontUsePx:'0',j:1,lang:c%2B%2B,selection:(endColumn:1,endLineNumber:4,positionColumn:1,positionLineNumber:4,selectionStartColumn:1,selectionStartLineNumber:4,startColumn:1,startLineNumber:4),source:'%23include+%3Ccctype%3E%0A%23include+%3Ciostream%3E%0A%23include+%3Ciomanip%3E%0A%0Astatic+char+stream%5B5%5D+%3D+%22123-%22%3B%0Astatic+int+at+%3D+0%3B%0Achar+peek()+%7B%0A++++return+stream%5Bat%5D%3B%0A%7D%0Achar+advance()+%7B%0A++++return+stream%5Bat%2B%2B%5D%3B%0A%7D%0Abool+ishex(char+c)+%7B+return+std::isxdigit(c)%3B+%7D%0Abool+IsDigit(char+c)+%7B+return+std::isdigit(c)%3B+%7D%0A%0Aint+hexparse()+%7B%0A++++//prep+vars%0A++++int+digits+%3D+0%3B%0A++++int+c+%3D+0%3B%0A++++//lexer+code:%0A++++while+(true)+%7B%0A++++++++char+ch+%3D+peek()%3B%0A++++++++if+(!!ishex(ch)+%7C%7C+digits+%3E%3D+2)%0A++++++++++++break%3B%0A++++++++if+(IsDigit(ch))%0A++++++++++++c+%3D+(c+%3C%3C+4)+%2B+(ch+-+!'0!')%3B%0A++++++++else%0A++++++++++++c+%3D+(c+%3C%3C+4)+%2B+(tolower(ch)+-+!'a!'+%2B+10)%3B%0A++++++++advance()%3B%0A++++++++digits%2B%2B%3B%0A++++%7D%0A++++%0A++++return+c%3B%0A%7D%0A%0Aint+main()+%7B%0A++++std::cout+%3C%3C+%22Parsed+%22+%3C%3C+std::hex+%3C%3C+hexparse()+%3C%3C+%22%5Cn%22%3B%0A%7D'),l:'5',n:'0',o:'C%2B%2B+source+%231',t:'0')),k:50,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:executor,i:(argsPanelShown:'1',compilationPanelShown:'0',compiler:g132,compilerName:'',compilerOutShown:'0',execArgs:'',execStdin:'',fontScale:14,fontUsePx:'0',j:1,lang:c%2B%2B,libs:!(),options:'-fpermissive',overrides:!(),runtimeTools:!(),source:1,stdinPanelShown:'1',tree:0,wrap:'1'),l:'5',n:'0',o:'Executor+x86-64+gcc+13.2+(C%2B%2B,+Editor+%231)',t:'0')),k:50,l:'4',n:'0',o:'',s:0,t:'0')),l:'2',n:'0',o:'',t:'0')),version:4)

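```cpp
#include <cctype>
#include <iostream>

// Minimal stand-in for the lexer's input stream.
static char stream[5] = "123-";
static int at = 0;
char peek() {
    return stream[at];
}
char advance() {
    return stream[at++];
}
// Cast to unsigned char: passing a negative char to the <cctype> functions is undefined.
bool ishex(char c) { return std::isxdigit(static_cast<unsigned char>(c)) != 0; }
bool IsDigit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }

int hexparse() {
    // prep vars
    int digits = 0;
    int c = 0;
    // lexer code, with the proposed digits >= 2 check:
    while (true) {
        char ch = peek();
        if (!ishex(ch) || digits >= 2)
            break;
        if (IsDigit(ch))
            c = (c << 4) + (ch - '0');
        else
            c = (c << 4) + (std::tolower(static_cast<unsigned char>(ch)) - 'a' + 10);
        advance();
        digits++;
    }

    return c;
}

int main() {
    // Stops after two hex digits, so only "12" of "123-" is consumed.
    std::cout << "Parsed " << std::hex << hexparse() << "\n";
}
```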
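To make the off-by-one concrete, here is a minimal sketch that runs both limit checks side by side on the same input. It is not the SourcePawn lexer: `HexEscape`, `parse_hex_escape`, and `check_limit_examples` are hypothetical names, and the "loose" `digits > 2` form is an assumption inferred from the symptom (three characters consumed), not copied from `lexer.cpp`.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Result of parsing the text that follows "\x" (hypothetical type).
struct HexEscape {
    int value;             // numeric value of the consumed hex digits
    std::size_t consumed;  // number of characters consumed from the input
};

// Hypothetical helper, not the real lexer. With use_fixed_check == false the
// loop stops only once digits > 2 (assumed buggy form); with true it stops
// once digits >= 2 (the check proposed in this issue).
HexEscape parse_hex_escape(const std::string& text, bool use_fixed_check) {
    HexEscape out{0, 0};
    int digits = 0;
    while (out.consumed < text.size()) {
        unsigned char ch = static_cast<unsigned char>(text[out.consumed]);
        bool limit_hit = use_fixed_check ? (digits >= 2) : (digits > 2);
        if (!std::isxdigit(ch) || limit_hit)
            break;
        if (std::isdigit(ch))
            out.value = (out.value << 4) + (ch - '0');
        else
            out.value = (out.value << 4) + (std::tolower(ch) - 'a' + 10);
        ++out.consumed;
        ++digits;
    }
    return out;
}

// Usage: the body of "\x07FF0000" after the "\x" prefix.
void check_limit_examples() {
    const std::string body = "07FF0000";
    HexEscape loose = parse_hex_escape(body, false);
    assert(loose.consumed == 3 && loose.value == 0x7F);  // eats the first color digit
    HexEscape strict = parse_hex_escape(body, true);
    assert(strict.consumed == 2 && strict.value == 0x07);  // leaves "FF0000" intact
}
```

With the loose check, the first digit of the color is folded into the escape value, which matches the symptom of one unexpected character followed by the remaining five color digits.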

Also, I don't know why hex escapes swallow a trailing semicolon. I guess that could be used to force 2 hexits, but I don't know if that's intended; it seems kind of random and obscure to me.

https://github.com/alliedmodders/sourcepawn/blob/d59a51b5741823903ecbe8c014632ee1f8aad65d/compiler/lexer.cpp#L2149
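The semicolon behaviour described above can be illustrated with a rough sketch: after the hex digits are read, one trailing `;` is consumed as part of the escape. `parse_hex_with_terminator` and `check_terminator_examples` are hypothetical names, the two-digit cap is the fixed behaviour discussed in this issue, and edge cases such as a `;` with no digits before it are not modelled on the real lexer.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not the real lexer: parses the text after "\x",
// reading at most two hex digits and then swallowing one optional ';'.
// Returns the number of characters consumed; the parsed value goes to `value`.
std::size_t parse_hex_with_terminator(const std::string& text, int& value) {
    std::size_t pos = 0;
    value = 0;
    while (pos < text.size() && pos < 2) {
        unsigned char ch = static_cast<unsigned char>(text[pos]);
        if (!std::isxdigit(ch))
            break;
        int nibble = std::isdigit(ch) ? (ch - '0') : (std::tolower(ch) - 'a' + 10);
        value = (value << 4) + nibble;
        ++pos;
    }
    // Swallow a single trailing semicolon so it does not reach the output.
    if (pos < text.size() && text[pos] == ';')
        ++pos;
    return pos;
}

// Usage: a ';' ends the escape early, so the following hex-looking
// characters stay literal text.
void check_terminator_examples() {
    int value = 0;
    assert(parse_hex_with_terminator("7;FF", value) == 2 && value == 0x7);
    assert(parse_hex_with_terminator("41;", value) == 3 && value == 0x41);
    assert(parse_hex_with_terminator("41B", value) == 2 && value == 0x41);
}
```

Under these assumptions, the semicolon acts as an explicit end marker: `\x7;FF` yields the character 0x07 followed by the literal text `FF`, rather than folding an `F` into the escape.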

psychonic commented 6 months ago

It looks like you are viewing an older commit. I believe this is a duplicate of https://github.com/alliedmodders/sourcepawn/issues/909, fixed in https://github.com/alliedmodders/sourcepawn/pull/926

DosMike commented 6 months ago

Ah, you're right... I don't know how I got there.