sysprog21 / shecc

A self-hosting and educational C optimizing compiler
BSD 2-Clause "Simplified" License
1.11k stars 118 forks source link

Parse syntax for include macro in parser.c #104

Closed cratuki closed 8 months ago

cratuki commented 9 months ago

Concern about code from line 3288 in parser.c,

    if (lex_peek(T_include, token)) {
        if (!strcmp(token_str, "<stdio.h>")) {
            /* ignore, we include libc by default */
        }
        lex_expect(T_include);

If the comment is valid, the lex_expect line should be inside the if block. As it is, stdio.h include lines will be processed by a lex_expect call.

If the comment is valid, here is a change that would put the lex_expect is put in an else block, and removing negation on the strcmp clause,

    if (lex_peek(T_include, token)) {
        if (strcmp(token_str, "<stdio.h>")) {
            /* ignore, we include libc by default */
        } else {
            lex_expect(T_include);
        }
vacantron commented 8 months ago

In shecc lexer, the whole line with the compler directive include is regarded as a single "include statement" instead of the combination of "include keyword + identifier (file path)".

https://github.com/sysprog21/shecc/blob/ba161c58c1a39483b2dcc620bf8574b240f4dcfc/src/lexer.c#L255-L261

For example, the line #include "example.c"\n after lexer reading will be #include"example.c" and report it is a "include" token, and the string will be stored in the token_str. So, the correct change in the parser should be:

    if (lex_peek(T_include, token)) {
        if (!strcmp(token_str, "#include<stdio.h>")) {  /* or we could reuse the variable "token" to get rid of the global variable "token_str" */
            /* ignore, we include libc by default */
        }
        lex_expect(T_include);
}

If the comment is valid, the lex_expect line should be inside the if block. As it is, stdio.h include lines will be processed by a lex_expect call.

Actually, this part of code is only to skip the include statement. The actual source file loading is in the load_source_file():

https://github.com/sysprog21/shecc/blob/ba161c58c1a39483b2dcc620bf8574b240f4dcfc/src/parser.c#L2620-L2634

jserv commented 8 months ago

@cratuki, Unless there are additional concerns, I am inclined to consider this issue resolved.

cratuki commented 8 months ago

1) I understand why my suggested fix is not valid. @vacantron has explained that the purpose of the outer if (lex_peek) is to skip the include line.

2) In this case, I suggest that the if statement containing strcmp should be removed, and its comment also. Function strcmp is a pure function, and there is no content within the if clause. Hence, it would improve the code to remove it. Shall I raise a PR?

cratuki commented 8 months ago

I do not have permissions to raise a PR.

The quote below shows what I suggest,

m75-office-001$ git show
commit e33a621376da1280ae0d986975434af56587986e (HEAD -> 20240115.cturner.simplify.parser, fork/20240115.cturner.simplify.parser)
Author: Craig Turner <cratuki@users.noreply.github.com>
Date:   Mon Jan 15 10:49:36 2024 +0100

    Removes redundant block from parser.c

diff --git a/src/parser.c b/src/parser.c
index 569ad89..d544bd2 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -2386,9 +2386,6 @@ void read_global_statement()
     block_t *block = &BLOCKS[0]; /* global block */

     if (lex_peek(T_include, token)) {
-        if (!strcmp(token_str, "<stdio.h>")) {
-            /* ignore, we include libc by default */
-        }
         lex_expect(T_include);
     } else if (lex_accept(T_define)) {
         char alias[MAX_VAR_LEN];
m75-office-001$ gh pr create -f -r vacantron,ChAoSUnItY,jserv
? Where should we push the '20240115.cturner.simplify.parser' branch? Create a fork of sysprog21/shecc

Creating pull request for cratuki:20240115.cturner.simplify.parser into master in sysprog21/shecc

remote: 
remote: 
To github.com:cratuki/shecc.git
 * [new branch]      HEAD -> 20240115.cturner.simplify.parser
Branch '20240115.cturner.simplify.parser' set up to track remote branch '20240115.cturner.simplify.parser' from 'fork'.
https://github.com/sysprog21/shecc/pull/105
pull request update failed: GraphQL: cratuki does not have the correct permissions to execute `RequestReviews` (requestReviews)
m75-office-001$ 
jserv commented 8 months ago
  1. In this case, I suggest that the if statement containing strcmp should be removed, and its comment also. Function strcmp is a pure function, and there is no content within the if clause. Hence, it would improve the code to remove it. Shall I raise a PR?

As discussed previously on Hacker News, I am a university faculty member, and the primary motivation behind starting this project was to establish a hands-on learning platform for my students. Your pull request presents an excellent opportunity for them to engage in code review and collaborative practices, which I greatly value and appreciate.