ioccc-src / mkiocccentry

Form an IOCCC submission as a compressed tarball file
Other
30 stars 6 forks source link

Bug: unexpected chkentry error and warning messages #1201

Closed lcn2 closed 1 month ago

lcn2 commented 1 month ago

Is there an existing issue for this?

Describe the bug

We a test, we applied chkentry(1) to an IOCC* winning entry to see what would happen without using -w and using the -i foo accordingly:

$ cd winners
$ chkentry -i .entry.json -i README.md -i index.html -i .auth.json -i .info.json -i remarks.md 2014/birken
ERROR[1]: main: .gitignore: not a sane relative path with max path len, filename len and depth of 99, 38, 4
ERROR[1]: main: .gitignore: permissions: 100644 != 0444
ERROR[1]: main: .path: not a sane relative path with max path len, filename len and depth of 99, 38, 4
ERROR[1]: main: .path: permissions: 100644 != 0444
ERROR[1]: main: 2014_birken.tar.bz2: permissions: 100644 != 0444
ERROR[1]: main: Makefile: permissions: 100644 != 0444
ERROR[1]: main: prog.alt.c: permissions: 100644 != 0444
ERROR[1]: main: prog.c: permissions: 100644 != 0444
ERROR[1]: main: prog.orig.c: permissions: 100644 != 0444
Warning: read_fts: failed to chdir("2014/birken"): errno[2]: No such file or directory
Warning: read_fts: failed to chdir("2014/birken"): errno[2]: No such file or directory
ERROR[1]: main: one or more tests failed

FYI:

$ ls -lRa 2014/birken
2014/birken:
total 100
drwxr-xr-x 12 chongo   384 Mar  1 14:56 .
drwxr-xr-x 24 chongo   768 Mar  1 14:55 ..
-rw-r--r--  1 chongo  3211 Jan  9 22:06 .entry.json
-rw-r--r--  1 chongo    88 Jan  9 22:06 .gitignore
-rw-r--r--  1 chongo    12 Jan  9 22:06 .path
-rw-r--r--  1 chongo 23442 Jan  9 22:06 2014_birken.tar.bz2
-rw-r--r--  1 chongo  3619 Jan  9 22:06 Makefile
-rw-r--r--  1 chongo  9428 Jan  9 22:43 README.md
-rw-r--r--  1 chongo 29835 Jan  9 22:58 index.html
-rw-r--r--  1 chongo  4097 Jan  9 22:06 prog.alt.c
-rw-r--r--  1 chongo  4053 Jan  9 22:06 prog.c
-rw-r--r--  1 chongo  4053 Jan  9 22:06 prog.orig.c

What you expect

The "100644 != 0444" would have been reported as "0644 != 0444" or 100644 != 100444".

We would NOT see messages of the form:

Warning: read_fts: failed to chdir("2014/birken"): errno[2]: No such file or directory
Warning: read_fts: failed to chdir("2014/birken"): errno[2]: No such file or directory

This error seems to be incorrect:

ERROR[1]: main: .gitignore: not a sane relative path with max path len, filename len and depth of 99, 38, 4

as the sane relative path does not exceed 99, nor does the filename len does not exceed 38, nor does the depth exceed 4.

BTW: If would be better if, possible, to state the values and what they exceed on separate lines.

Just for an example of a better error message:

ERROR[1]: main: ...something...: not a sane relative path length 156 > 99
ERROR[1]: main: ...something...: not a sane filename len 42 > 38
ERROR[1]: main: ...something...: not a sane depth 5 > 4

That way the user knows what is wrong and how far wrong it is, perhaps allowing them to correct by renaming things, for example.

Environment

chkentry version: 2.0.0 2025-02-28 jparse utils version: 2.0.1 2025-03-01 jparse UTF-8 version: 2.1.0 2025-02-28 jparse library version: 2.3.0 2025-02-28

bug_report.sh output

bug-report.20250301.151754.txt

Anything else?

Also with -w we see errors too:

$ chkentry -w 2014/birken
ERROR[1]: main: .entry.json: permissions 100664 != 0444
ERROR[1]: main: .gitignore: not a sane relative path with max path len, filename len and depth of 99, 38, 4
ERROR[1]: main: .gitignore: permissions: 100664 != 0444
ERROR[1]: main: .path: not a sane relative path with max path len, filename len and depth of 99, 38, 4
ERROR[1]: main: .path: permissions: 100664 != 0444
ERROR[1]: main: 2014_birken.tar.bz2: permissions: 100664 != 0444
ERROR[1]: main: Makefile: permissions 100664 != 0444
ERROR[1]: main: README.md: permissions 100664 != 0444
ERROR[1]: main: index.html: permissions 100664 != 0444
ERROR[1]: main: prog.alt.c: permissions: 100664 != 0444
ERROR[1]: main: prog.c: permissions 100664 != 0444
ERROR[1]: main: prog.orig.c: permissions: 100664 != 0444
Warning: read_fts: failed to chdir("2014/birken"): errno[2]: No such file or directory
Warning: read_fts: failed to chdir("2014/birken"): errno[2]: No such file or directory
ERROR[1]: main: one or more tests failed

When we did this, to make files read only:

$ chmod 0444 2014/birken/* 2014/birken/.?*
$ ls -lRa 2014/birken
2014/birken:
total 100
drwxr-xr-x 12 chongo   384 Mar  1 14:56 .
drwxr-xr-x 24 chongo   768 Mar  1 14:55 ..
-r--r--r--  1 chongo  3211 Jan  9 22:06 .entry.json
-r--r--r--  1 chongo    88 Jan  9 22:06 .gitignore
-r--r--r--  1 chongo    12 Jan  9 22:06 .path
-r--r--r--  1 chongo 23442 Jan  9 22:06 2014_birken.tar.bz2
-r--r--r--  1 chongo  3619 Jan  9 22:06 Makefile
-r--r--r--  1 chongo  9428 Jan  9 22:43 README.md
-r--r--r--  1 chongo 29835 Jan  9 22:58 index.html
-r--r--r--  1 chongo  4097 Jan  9 22:06 prog.alt.c
-r--r--r--  1 chongo  4053 Jan  9 22:06 prog.c
-r--r--r--  1 chongo  4053 Jan  9 22:06 prog.orig.c

This is what happened:

$ chkentry -i .entry.json -i README.md -i index.html -i .auth.json -i .info.json -i remarks.md 2014/birken
ERROR[1]: main: .gitignore: not a sane relative path with max path len, filename len and depth of 99, 38, 4
ERROR[1]: main: .path: not a sane relative path with max path len, filename len and depth of 99, 38, 4
Warning: read_fts: failed to chdir("2014/birken"): errno[2]: No such file or directory
Warning: read_fts: failed to chdir("2014/birken"): errno[2]: No such file or directory
ERROR[1]: main: one or more tests failed

And this:

$ chkentry -w 2014/birken
ERROR[1]: main: .gitignore: not a sane relative path with max path len, filename len and depth of 99, 38, 4
ERROR[1]: main: .path: not a sane relative path with max path len, filename len and depth of 99, 38, 4
Warning: read_fts: failed to chdir("2014/birken"): errno[2]: No such file or directory
Warning: read_fts: failed to chdir("2014/birken"): errno[2]: No such file or directory
ERROR[1]: main: one or more tests failed
xexyl commented 1 month ago

Oh no. Well I have an idea about the directory. As for the permissions that's because we changed that requirement.

Should there be an option to have the check on permissions disabled? Perhaps -p or -P? I like that idea actually.

xexyl commented 1 month ago

And great catch about the dot files! I should go through the ignored files and directory names and add those to the ignore list - at least in winner mode.

But as for the other dot files - I am not sure. Should winner mode disable dot file checks ? I think that might be a good idea.

What do you think?

Should: -p (or if you prefer -P) disable permission checks? And should -w automatically disable dot file checks? I think the latter should be the case and as for -p or -P maybe that should also be disabled for -w? Since after all the judges might want to change the permissions for winning entries. Thus maybe the permission checks and the dot file checks should only be there for the non -w mode.

As for the change of directory I'll test that out now ...

xexyl commented 1 month ago

Oh but I see you did not use -w.

So... I need to know what you think.

I do think or -w it should maybe ignore dot files and permissions too. Since at that point the judges will have already checked permissions (and have done what they wish for it).

But as for permissions let's see what you have here.. well I'll just test so I can see.

xexyl commented 1 month ago

I'll address each issue at a time though not necessarily in order.

First of all the simple one:

ERROR[1]: main: .gitignore: not a sane relative path with max path len, filename len and depth of 99, 38, 4

You say that is incorrect but it's actually not. The error message is saying that there is one or more problems with it without determining which it is (I can if you wish check with a switch statement - perhaps I should?) but the reason it's not a sane relative path is because it's a dot file.

xexyl commented 1 month ago

As for ....

The "100644 != 0444" would have been reported as "0644 != 0444" or 100644 != 100444".

The mode is simply the struct stat st_mode. So perhaps I need to mask out something or add to the message the number? But on that part the problem is knowing the right mask to

QUESTION

Since it's the st_mode how do you wish to be addressed ?

I do have a test on this .. hold please.

lcn2 commented 1 month ago

As for ....

The "100644 != 0444" would have been reported as "0644 != 0444" or 100644 != 100444".

The mode is simply the struct stat st_mode. So perhaps I need to mask out something or add to the message the number? But on that part the problem is knowing the right mask to

QUESTION

Since it's the st_mode how do you wish to be addressed ?

I do have a test on this .. hold please.

Most people would probably know permission of the form "0444" and might not know the extended mode bits. So we recommend:

"0644 != 0444"

lcn2 commented 1 month ago

And great catch about the dot files! I should go through the ignored files and directory names and add those to the ignore list - at least in winner mode.

But as for the other dot files - I am not sure. Should winner mode disable dot file checks ? I think that might be a good idea.

What do you think?

Should: -p (or if you prefer -P) disable permission checks? And should -w automatically disable dot file checks? I think the latter should be the case and as for -p or -P maybe that should also be disabled for -w? Since after all the judges might want to change the permissions for winning entries. Thus maybe the permission checks and the dot file checks should only be there for the non -w mode.

As for the change of directory I'll test that out now ...

The permission checks can be slightly annoying, so a -p might be a good flag to add to chkentry(1) .. however that is enhancement and not really a bug that needs to be fixed.

Better to require someone to chmod the files before running chkentry(1) again .. what is what we ended up doing. :-)

xexyl commented 1 month ago

As for ....

The "100644 != 0444" would have been reported as "0644 != 0444" or 100644 != 100444".

The mode is simply the struct stat st_mode. So perhaps I need to mask out something or add to the message the number? But on that part the problem is knowing the right mask to

QUESTION

Since it's the st_mode how do you wish to be addressed ? I do have a test on this .. hold please.

Most people would probably know permission of the form "0444" and might not know the extended mode bits. So we recommend:

"0644 != 0444"

Oh I agree. This is really weird though. It uses the function filemode(). Okay but that is used how? Like this...

werr(1, __func__, "%s: permissions %o != 0444", u, filemode(u));/*ooo*/

But what does filemode() return?

It returns...

buf.st_mode;

where but is a struct stat!

Did I mess up the printing somehow without thinking about it ? And yet I think not as a test like:

   mode_t mode = S_IRUSR | S_IRGRP | S_IROTH;

   printf("%o\n", mode);

shows 444!

Not sure about the prefixing it though: since in some cases it's a 1 not a 0.

If I had to guess the reason is that some file systems have an additional bit. I'm doing a test on an actual file.

xexyl commented 1 month ago

Yup. That's it. The st_mode on an actual file does this. I have one thought. Testing ..

xexyl commented 1 month ago

Okay that does work. So the filemode() function needs a modification. Give me a little bit of time.

Once that is done I can have the winning mode (at least) ignore the check of the sane_relative_path and then also check into the directory change error.

FYI: in some file systems at least one has to do (based on file type): & ~S_IFREG.

So I have to in the filemode() function use type_of_file() and mask out the correct bit.

xexyl commented 1 month ago

Fixed the filemdode() function. Will commit shortly and sync. Then it's only a matter of updating the error messages (omit the 0 prefix) and then I can look at the other issues.

lcn2 commented 1 month ago

Did I mess up the printing somehow without thinking about it ? And yet I think not as a test like:

mode_t mode = S_IRUSR | S_IRGRP | S_IROTH;

printf("%o\n", mode); shows 444!

...

FYI: in some file systems at least one has to do (based on file type): & ~S_IFREG.

Use:

mode_t mode = buf.st_mode & ~S_IFMT;

That was you catch permissions such as 1xxx or 2xxx or 4xxx or worse.

UPDATE 0a

Print as 4 octal values such as "0444".

Corrected the expression to use S_IFMT which is 0170000 / type of file /.

See stat(2).

xexyl commented 1 month ago

Did I mess up the printing somehow without thinking about it ? And yet I think not as a test like: mode_t mode = S_IRUSR | S_IRGRP | S_IROTH; printf("%o\n", mode); shows 444!

...

FYI: in some file systems at least one has to do (based on file type): & ~S_IFREG.

Use:

mode_t mode = buf.st_mode & ~0170000;

That was you catch permissions such as 1xxx or 2xxx or 4xxx or worse.

UPDATE 0

Print as 4 octal values such as "0444".

Well I use the macro (for regular files) st_mode &= ~S_IFREG.

However that will print as 444 if no special bits. The problem is what if there are special bits? The prefix of 0 might be wrong.

Obviously I can't rely on the bits themselves as the file system might be different.

lcn2 commented 1 month ago

see the correction

xexyl commented 1 month ago

FYI:

bits/stat.h:#define __S_IFREG   0100000 /* Regular file.  */

But the other one cannot be done as it makes assumptions. So the display will be without the 0.

xexyl commented 1 month ago

see the correction

Ah. Does that work for all file types? It's not a macro I'm familiar with.

lcn2 commented 1 month ago

See stat(2).

The S_IFREG will fail to mask lots of other types of "not a file" things.

xexyl commented 1 month ago

see the correction

Ah. Does that work for all file types? It's not a macro I'm familiar with.

And if it is will it work with the not having to check the file type? I.e. will I not need to do something like...

    if (is_file(path)) {
        mode |= S_IFREG;
    } else if (is_dir(path)) {
        mode |= S_IFDIR;

etc.?

Incidentally that function was written before the function type_of_file() so I will update it to not have to repeatedly call stat(2)!

lcn2 commented 1 month ago

For printing "permissions" print as %04o the value mode & ~S_IFMT.

See stat(2).

xexyl commented 1 month ago

I see the updated comment. Interesting. So I guess it depends on the file type too (as expected). I can update the code to use that for printing. The question is what if the filemode() is not being used for printing? In that case I am not sure what to do.

I think mostly it is used for printing though but it might not be in every case.

xexyl commented 1 month ago

For printing "permissions" print as %04o the value mode & ~S_IFMT.

See stat(2).

But obviously I have to use the right macro which means I might need a function that can print the right value. But of course that's problematic as I can't return a static char * and having to free() it puts a big burden on the caller.

So I have to test to see if the macros will work for testing too.

lcn2 commented 1 month ago

For printing "permissions" print as %04o the value mode & ~S_IFMT. See stat(2).

But obviously I have to use the right macro which means I might need a function that can print the right value. But of course that's problematic as I can't return a static char * and having to free() it puts a big burden on the caller.

So I have to test to see if the macros will work for testing too.

File test: (mode & S_IFMT) == S_IFREG

Directory test: (mode & S_IFMT) == S_IFDIR

Print just the permission bits (mode & ~ S_IFMT).

xexyl commented 1 month ago

Okay I see that is indeed a problem. For the printing of the mode I have to do something special. I guess I could do instead a function that gets the st_mode and then returns (say for regular files):

return st_mode & ~S_IFMT;
#define S_ISLNK(m)  (((m) & S_IFMT) == S_IFLNK)

Which means that filemode() should have a boolean that says whether to have the print value OR the st_mode itself. Will do that.

lcn2 commented 1 month ago

We think you have go it. Really going to sleep now.

xexyl commented 1 month ago

For printing "permissions" print as %04o the value mode & ~S_IFMT. See stat(2).

But obviously I have to use the right macro which means I might need a function that can print the right value. But of course that's problematic as I can't return a static char * and having to free() it puts a big burden on the caller. So I have to test to see if the macros will work for testing too.

File test: (mode & S_IFMT) == S_IFREG

Directory test: (mode & S_IFMT) == S_IFDIR

Print just the permission bits (mode & ~ S_IFMT).

Yup I saw that. I have a way to do it easily. The user just has to use the right boolean and the right format specifier. That should work well. This will mean an update to the API but it'll be worth it. This way no need for yet another function.

xexyl commented 1 month ago

We think you have go it. Really going to sleep now.

Please do! Sleep well my good friend! I'll have some fixes in before you're back and maybe all. Best wishes and sleep tight!

xexyl commented 1 month ago

For printing "permissions" print as %04o the value mode & ~S_IFMT. See stat(2).

But obviously I have to use the right macro which means I might need a function that can print the right value. But of course that's problematic as I can't return a static char * and having to free() it puts a big burden on the caller. So I have to test to see if the macros will work for testing too.

File test: (mode & S_IFMT) == S_IFREG

Directory test: (mode & S_IFMT) == S_IFDIR

Print just the permission bits (mode & ~ S_IFMT).

Yup although there is a macro for the testing types - and I use those to be cleaner. As for the others I just indeed noticed that it is simpler - no need to have different macros per type as it's universally masking out the correct bits.

Thanks for the tip! I did not know of that macro nor did I even think of it - even when I saw the odd values.

xexyl commented 1 month ago

See stat(2).

The S_IFREG will fail to mask lots of other types of "not a file" things.

Yup. I handle that with the IS_ macros in the function file_type() (renamed from type_of_file() as enums are in a different namespace - just remembered that a bit ago).

xexyl commented 1 month ago

The file permissions printing has been fixed in jparse .. now to sync (done) and update code in mkiocccentry tools.

xexyl commented 1 month ago

Just fixed another bug in read_fts()!

If the directory count is 1 we have to check it as if it is a basename even if basename is not true. Why? Because it can only be a basename!

It works in mkiocccentry chkentry. Will commit to jparse after updating CHANGES.md and then sync again and then it's only one (I think) more error - which I think I know what to do about it (the can't find the path).

xexyl commented 1 month ago

I have located the problem with the chdir bug ... have to fix now.

xexyl commented 1 month ago

Now for an option to disable permissions check. Will do that, test, update man page and t hen commit.

xexyl commented 1 month ago

All bugs fixed! Now to update the man page ... hold please.

xexyl commented 1 month ago

Everything fixed in commit https://github.com/ioccc-src/mkiocccentry/pull/1202/commits/2bd1cca67a20120b59cda7cffe92a1f4ad994f0b!

I have to go afk now. Back later and hopefully I can work on checking out the rules and also the guidelines. If nothing else I will check out the rules but I think there's a good chance I'll check the guidelines. Hopefully I can finish them although they are a bit longer than the rules.

Today I am going to an ice cream shop so that takes a bit of time but I would already be done with working on the repos so that's okay.

I will certainly be done in time for the contest to open!

I hope you slept well and I'm glad you finally did go to sleep! Actually I thought you had woken up early but being up all night is even worse.

xexyl commented 1 month ago

And great catch about the dot files! I should go through the ignored files and directory names and add those to the ignore list - at least in winner mode. But as for the other dot files - I am not sure. Should winner mode disable dot file checks ? I think that might be a good idea. What do you think? Should: -p (or if you prefer -P) disable permission checks? And should -w automatically disable dot file checks? I think the latter should be the case and as for -p or -P maybe that should also be disabled for -w? Since after all the judges might want to change the permissions for winning entries. Thus maybe the permission checks and the dot file checks should only be there for the non -w mode. As for the change of directory I'll test that out now ...

The permission checks can be slightly annoying, so a -p might be a good flag to add to chkentry(1) .. however that is enhancement and not really a bug that needs to be fixed.

Better to require someone to chmod the files before running chkentry(1) again .. what is what we ended up doing. :-)

Sure but it was a very quick fix so I added it. Though I made it -P instead of -p. That was a stylistic choice of course. If you wish at some point for it to be -p you can either do it OR you can ask me to do it and I'll be happy to do so - whether or not that is post IOCCC28 or not.

Back later on.

xexyl commented 1 month ago

I'm back .. though trying to focus my eyes. I'll get to rules soon I hope.

QUESTION (somewhat stylistic related for later on but still might be good to know)

I just realised I have some redundant ()s in the calls to append_path(). Oh well. That's not important even if it bugs me. It can come after IOCCC28. I might fix it in jparse/util.c (at the repo) though. Although thinking on it it might be better to have them to show clearly that it's a pointer to a pointer.

If you get a chance what do you think about this @lcn2?

Right now I have:

append_path(&(fts->ignore), abbrevs[i][1], true, false, false);

But this would also work just as well:

append_path(&fts->ignore, abbrevs[i][1], true, false, false);

For future reference what do you prefer ? On the one hand the latter will work just fine but perhaps the first is more clear that fts->ignore is a pointer (in this case a dyn_array) but we need a pointer to that pointer? I think that might actually be the case in which case :-) it would be better to keep the parentheses. Well again not critical but it would be nice to have an answer after we're done with getting ready for the contest.

I'm going to take a bit longer and then I'm going to get working on checking the rules updates from yesterday. Then if all is good I can look at the guidelines! I have to first (just remembered) check the other thread - the one about registering.

lcn2 commented 1 month ago

Right now I have:

append_path(&(fts->ignore), abbrevs[i][1], true, false, false);

We prefer the above.

xexyl commented 1 month ago

Right now I have: append_path(&(fts->ignore), abbrevs[i][1], true, false, false);

We prefer the above.

I guessed that .. more clear. After all it's not a submission to the IOCCC!

xexyl commented 1 month ago

I put in a joke but it's too late for it so I'll share here:

    { "try"         , TRY_SH                }, /* try not to keep sorted */
    { "try.alt"     , TRY_ALT_SH            }, /* alternatively, try and keep sorted */
xexyl commented 1 month ago

Did my commit solve the problem btw ?

lcn2 commented 1 month ago

Did my commit solve the problem btw ?

With:

chkentry version: 2.0.1 2025-03-02
jparse utils version: 2.0.2 2025-03-02
jparse UTF-8 version: 2.1.0 2025-02-28
jparse library version: 2.3.0 2025-02-28

This works:

$ ./chkentry -i .entry.json -i README.md -i index.html -i .auth.json -i .info.json -i remarks.md -i .gitignore -i .path ../docroot/winner/2014/birken

As does this:

$ ./chkentry -w ../docroot/winner/2014/birken

👍🎉

xexyl commented 1 month ago

I put in a joke but it's too late for it so I'll share here:

{ "try"         , TRY_SH                }, /* try not to keep sorted */
{ "try.alt"     , TRY_ALT_SH            }, /* alternatively, try and keep sorted */

I ended up committing this with the txzchk fixes .. so I'm happy about that. And glad you got a laugh (knew you would).

xexyl commented 1 month ago

Did my commit solve the problem btw ?

With:

chkentry version: 2.0.1 2025-03-02
jparse utils version: 2.0.2 2025-03-02
jparse UTF-8 version: 2.1.0 2025-02-28
jparse library version: 2.3.0 2025-02-28

This works:

$ ./chkentry -i .entry.json -i README.md -i index.html -i .auth.json -i .info.json -i remarks.md -i .gitignore -i .path ../docroot/winner/2014/birken As does this:

$ ./chkentry -w ../docroot/winner/2014/birken 👍🎉

Great! The problem for the directory change: missing calls to the special feature of read_fts() which will restore the previous directory. Technically can do the fchdir(2) but that means we have to manually close it and check for errors - and since it's part of the read_fts() function as a general helper (as the function is itself) it seemed better to use it. If you haven't looked at read_fts() before or the functions find_path() and find_paths() you should when you get a chance. It's a very nice wrapper to the FTS functions and allows for a lot of things that would have to be done manually otherwise.

As for the permissions you know the issue.

As for the display problem I just took a look at that macro and as you noted (and it makes perfect sense looking at it) it's all that's needed - after all, we just have to mask out the bits we don't need.

And yeah of course the %04o too. Thus a new option to filemode(). I was happy to change type_of_file() to file_type() also.