Enhancement: create the jauthchk tool - check on the contents of an .author.json file

lcn2 commented 2 years ago

We need to create the jauthchk tool in order to help verify the contents of an .author.json file found within an entry directory.

This tool will primarily be used by other tools (not humans). As such it should behave like fnamchk in that if all is well, it should not print anything and simply exit 0. If problems are found with the .author.json file, then warning messages should be printed to stderr AND the jauthchk tool should exit with a non-zero status. A -v level may be used to assist in debugging.

The jauthchk tool is primarily a stand alone tool. As a sanity check, the mkiocccentry program should execute the jauthchk code AFTER the .author.json file has been created and before the compressed tarball is formed. If the mkiocccentry program sees a 0 exit status, then all is well. For a non-zero exit code, mkiocccentry probably should abort, because any problems detected by jauthchk based on what mkiocccentry wrote into .author.json indicate that there is a serious mismatch between what mkiocccentry is doing and what jauthchk expects.
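As a rough illustration of the exit-status handshake described above, here is a minimal sketch of how a caller might run a checker and look only at its exit status. The function name run_checker() is hypothetical and is not taken from the mkiocccentry source; the real code may well invoke the tool differently (for example via system(3)).

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * run_checker - run "checker file" and report its exit status
 *
 * NOTE: hypothetical helper for illustration only.
 *
 * returns:
 *	>= 0	exit status of the checker (0 ==> all is well)
 *	-1	the checker could not be run, or it did not exit normally
 */
static int
run_checker(char const *checker, char const *file)
{
    pid_t pid;
    pid_t ret;
    int status = 0;

    if (checker == NULL || file == NULL) {
	return -1;
    }

    /* flush stdio so buffered output is not duplicated in the child */
    fflush(stdout);
    fflush(stderr);

    pid = fork();
    if (pid < 0) {
	return -1;
    }
    if (pid == 0) {
	/* child: become the checker, passing the file as its only argument */
	char *const args[] = { (char *)checker, (char *)file, NULL };

	execv(checker, args);
	_exit(127);	/* exec failed */
    }

    /* parent: wait for the checker, retrying if interrupted by a signal */
    do {
	ret = waitpid(pid, &status, 0);
    } while (ret < 0 && errno == EINTR);
    if (ret < 0 || !WIFEXITED(status)) {
	return -1;
    }
    return WEXITSTATUS(status);
}
```

A caller in the spirit of the paragraph above would treat any result other than 0 from something like run_checker("./jauthchk", ".author.json") as a reason to abort before forming the tarball.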

The following might be how mkiocccentry output is changed with the use of this tool (and the other tool):

Is the above list a correct list of files in your entry? [yn]: y

Checking the format of .info.json ...
... all appears well with the .info.json file.

Checking the format of .author.json ...
... all appears well with the .author.json file.

About to run the tar command to form the compressed tarball ...

As a stand alone tool, the jauthchk tool will be invoked by other tools as part of the IOCCC submission process. That process is beyond the scope of this repo. Suffice it to say that the IOCCC judges will use this tool as part of their submission workflow.

Here is a possible command line usage message:

jauthchk [-h] [-v level] [-V] file

    -h              print help message and exit 0
    -v level        set verbosity level: (def level: 0)
    -V              print version string and exit

    file            path to a .author.json file

exit codes:

    0               no errors or warnings detected
    >0              some error(s) and/or warning(s) were detected

NOTE: We mention file above even though the canonical filename will be .author.json. The tool should NOT check, nor object to using a different filename.

The mkiocccentry tool will need to invoke this tool. As such, a method similar to the one used to find and specify the location of txzchk should be used. As this tool is one of 2 tools being considered, we recommend that the following be added to the mkiocccentry command line:

    -j /path/to/jinfochk    path to jinfochk executable used by txzchk (def: ./jinfochk)
    -J /path/to/jauthchk    path to jauthchk executable used by txzchk (def: ./jauthchk)

IMPORTANT: While it might be tempting to consider depending on some general JSON checker, we do NOT need nor want that. It is important that the mkiocccentry GitHub repo remain stand alone. I.e., all the code needed by someone wishing to enter the IOCCC (besides a C compiler, make, tar, cp, ls) should be found in this GitHub repo alone. As there is NO standard JSON tool in widespread distribution, all of the code for this tool needs to reside in this repo only.

IMPORTANT: We do not need a general JSON format checker. We only need to verify that the file contains the JSON needed and only the JSON needed for the judges to process IOCCC entries.

While it is NOT recommended, if someone wishes to edit their .author.json and re-create the compressed tarball we cannot stop them. As such mkiocccentry should be STRICT in what it writes into .author.json AND jauthchk should be permissive (but not to a fault) in what it considers as OK.

This tool should neither generate an error, nor warn if someone were to reformat the JSON. And as JSON is not order dependent, if someone wishes to reorder the JSON elements, that is fine. As long as all the required JSON elements are present, and no new JSON elements are found, and the version string matches, all is OK.

Should something go wrong and a change to the JSON is required during an open IOCCC, the judges will preserve the older JSON check tools and use those against older JSON formats. Thus there is no need for a >= version check: a version string match seems good enough.
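The "all required elements, no new elements, any order, exact version match" rule above could be sketched as follows. This assumes some earlier parsing step has already collected the element names found in the file; the function names names_ok() and version_ok() and the MAX_REQUIRED limit are hypothetical, and treating a repeated name as a problem is an assumption of this sketch rather than something stated above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_REQUIRED 32		/* sketch limit on the number of required names */

/*
 * names_ok - check the names found against the required names, in any order
 *
 * returns: true ==> every required name was found exactly once and
 *		     no unknown name was found
 */
static bool
names_ok(char const *const required[], size_t nreq,
	 char const *const found[], size_t nfound)
{
    bool seen[MAX_REQUIRED] = { false };
    bool ok = true;
    size_t i;
    size_t j;

    if (nreq > MAX_REQUIRED) {
	return false;
    }

    /* each found name must be a required name not already seen */
    for (i = 0; i < nfound; ++i) {
	for (j = 0; j < nreq; ++j) {
	    if (strcmp(found[i], required[j]) == 0) {
		break;
	    }
	}
	if (j == nreq) {
	    fprintf(stderr, "Warning: unknown element: %s\n", found[i]);
	    ok = false;
	} else if (seen[j]) {
	    fprintf(stderr, "Warning: element found more than once: %s\n", found[i]);
	    ok = false;
	} else {
	    seen[j] = true;
	}
    }

    /* each required name must have been seen */
    for (j = 0; j < nreq; ++j) {
	if (!seen[j]) {
	    fprintf(stderr, "Warning: missing element: %s\n", required[j]);
	    ok = false;
	}
    }
    return ok;
}

/* version_ok - an exact string match, not a >= comparison */
static bool
version_ok(char const *found, char const *expected)
{
    return found != NULL && expected != NULL && strcmp(found, expected) == 0;
}
```

Because the comparison is by name only, reformatting or reordering the JSON does not change the result.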

See the followup comment for details on the checks needed against an .author.json file.

xexyl commented 2 years ago

Do you want me to rename it though so it says author_count? I can do that if that's better. Edit: Just did it since that's how you had it originally and that's the name of the variable anyway.

What you did was fine.

Thanks.

lcn2 commented 2 years ago

A question to ponder: the user of mkiocccentry provides a winner_handle, what should happen to the computed author_handle? Should it be allowed to be different? Should it be changed to match?

Good questions indeed. I'm afraid my answer right now is: I've no idea! :)

Let's keep them independent for now.

xexyl commented 2 years ago

A question to ponder: the user of mkiocccentry provides a winner_handle, what should happen to the computed author_handle? Should it be allowed to be different? Should it be changed to match?

Good questions indeed. I'm afraid my answer right now is: I've no idea! :)

Let's keep them independent for now.

You mean it can be different but it doesn't have to be?

lcn2 commented 2 years ago

For now, the function needs to make a reasonable translation of the name in forming an author_handle.

I'm going to have to give this quite some thought I think. It'd be better to finish jinfochk and jauthchk first, right?

Yes. OK to finish the tools as they are now. Then a modification to mkiocccentry to form the author_handle string can be done .. along with the mods for jauthchk and test cases.

If you want to put in a placeholder author_handle entry into the authors array of .info.json now so that you have code to check it, that is fine. You can put "XXX9", where 9 is the author number, and "XXX" is "XXX".

The author_handle values in the array MUST be unique.

XXX is a common - take care of this later, it is important - notation that goes back to the days of 2.xBSD. It is why some editors, such as vim(1), make the XXX string stand out.

That way you can defer the writing of the name translation function and tool until after you finish the current tools.
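For what it is worth, the placeholder and the uniqueness rule above might look like this in C. Both function names are hypothetical, and the simple pairwise comparison assumes the number of authors is small.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * placeholder_handle - write "XXX" followed by the author number into buf
 *
 * returns: true ==> the whole placeholder fit into buf
 */
static bool
placeholder_handle(char *buf, size_t len, int author_num)
{
    int ret;

    if (buf == NULL || len == 0 || author_num < 0) {
	return false;
    }
    ret = snprintf(buf, len, "XXX%d", author_num);
    return ret > 0 && (size_t)ret < len;
}

/*
 * handles_unique - check that no author_handle appears more than once
 *
 * returns: true ==> all author_handle strings in the array differ
 */
static bool
handles_unique(char const *const handle[], size_t count)
{
    size_t i;
    size_t j;

    for (i = 0; i < count; ++i) {
	for (j = i + 1; j < count; ++j) {
	    if (strcmp(handle[i], handle[j]) == 0) {
		fprintf(stderr, "Warning: author_handle appears more than once: %s\n",
			handle[i]);
		return false;
	    }
	}
    }
    return true;
}
```

Since the author number is part of the placeholder, placeholders for different authors are unique by construction.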

lcn2 commented 2 years ago

When the last name is too long, prefer leading characters of the last name.

In other words truncate the name?

It might be, but let's defer how it gets done until after these current pull requests for jinfochk and jauthchk are completed.

We can let this stew for now.

xexyl commented 2 years ago

For now, the function needs to make a reasonable translation of the name in forming an author_handle.

I'm going to have to give this quite some thought I think. It'd be better to finish jinfochk and jauthchk first, right?

Yes. OK to finish the tools as they are now. Then a modification to mkiocccentry to form the author_handle string can be done .. along with the mods for jauthchk and test cases.

Thanks.

If you want to put in a placeholder author_handle entry into the authors array of .info.json now so that you have code to check it, that is fine. You can put "XXX9", where 9 is the author number, and "XXX" is "XXX".

That might be something to consider.

The author_handle values in the array MUST be unique.

Yes though of course I first have to have it parse arrays.

XXX is a common - take care of this later, it is important - notation that goes back to the days of 2.xBSD. It is why some editors, such as vim(1), make the XXX string stand out.

I actually have in my .vimrc highlighting of XXX, TODO and FIXME but I did not know the origin. Thanks.

That way you can defer the writing of the name translation function and tool until after you finish the current tools.

Thanks.

xexyl commented 2 years ago

When the last name is too long, prefer leading characters of the last name.

In other words truncate the name?

It might be, but let's defer how it gets done until after these current pull requests for jinfochk and jauthchk are completed.

We can let this stew for now.

That sounds reasonable. Perhaps when writing more of these two tools more thoughts will come up that will help with this tool.

lcn2 commented 2 years ago

A question to ponder: the user of mkiocccentry provides a winner_handle, what should happen to the computed author_handle? Should it be allowed to be different? Should it be changed to match?

Good questions indeed. I'm afraid my answer right now is: I've no idea! :)

Let's keep them independent for now.

You mean it can be different but it doesn't have to be?

As we don't know yet how this should be handled, they can be independent for now.

lcn2 commented 2 years ago

FYI: Major changes were made to iocccsize as part of a pull request for Anthony Howe's original repo that we forked (into a private repo).

While Anthony Howe's iocccsize is not official code, the changes we made allow us to use copies of his source code (assuming he accepts our pull request) without modification. In particular, these files:

iocccsize-test.sh
iocccsize.c
iocccsize.h
iocccsize_err.h
rule_count.c

We should attempt, if reasonable, to not modify those files. However if we must modify them, then Landon Noll needs to make another pull request of those mods from his private forked iocccsize-unofficial repo into Anthony Howe's iocccsize repo.

xexyl commented 2 years ago

FYI: Major changes were made to iocccsize as part of a pull request for Anthony Howe's original repo that we forked (into a private repo).

While Anthony Howe's iocccsize is not official code, the changes we made allow us to use copies of his source code (assuming he accepts our pull request) without modification. In particular, these files:

iocccsize-test.sh

iocccsize.c

iocccsize.h

iocccsize_err.h

rule_count.c

We should attempt, if reasonable, to not modify those files. However if we must modify them, then Landon Noll needs to make another pull request of those mods from his private forked iocccsize-unofficial repo into Anthony Howe's iocccsize repo.

Yes I see that and unfortunately it conflicts with some things I did - not those files but I moved all the version macros to a new file version.h and the Makefile now has all the versions exported to limit_ioccc.sh and I made some bug fixes as well.

lcn2 commented 2 years ago

Yes I see that and unfortunately it conflicts with some things I did - not those files but I moved all the version macros to a new file version.h and the Makefile now has all the versions exported to limit_ioccc.sh and I made some bug fixes as well.

Sorry! And we like your idea for version.h.

xexyl commented 2 years ago

Yes I see that and unfortunately it conflicts with some things I did - not those files but I moved all the version macros to a new file version.h and the Makefile now has all the versions exported to limit_ioccc.sh and I made some bug fixes as well.

Sorry! And we like your idea for version.h.

No worries. I'm working on it now. I'm hoping before the hour I can get it done but I expect not since I think I will end up deleting my fork etc. as I mentioned a few minutes ago in the other thread.

xexyl commented 2 years ago

Uh oh. I see a make test error with the changes to iocccsize:

RUNNING: iocccsize-test.sh
./iocccsize-test.sh -v
-OK- test-iocccsize/crlf.c: 2 8 1
-OK- test-iocccsize/utf8.c: 12 21 1
-OK- test-iocccsize/splitline0.c: 19 36 1
-OK- test-iocccsize/comment0.c: 44 58 1
-OK- test-iocccsize/comment1.c: 46 61 1
-OK- test-iocccsize/comment2.c: 27 37 1
-OK- test-iocccsize/comment3.c: 6 7 0
-OK- test-iocccsize/comment4.c: 14 19 0
-OK- test-iocccsize/comment5.c: 17 21 0
-OK- test-iocccsize/comment6.c: 30 42 1
-OK- test-iocccsize/comment7.c: 37 49 0
-OK- test-iocccsize/quote0.c: 16 24 1
-OK- test-iocccsize/quote1.c: 12 20 1
-OK- test-iocccsize/quote2.c: 14 22 1
FAIL test-iocccsize/digraph.c: got 16 24 1 != expect 14 24 1
FAIL test-iocccsize/trigraph0.c: got 18 26 1 != expect 14 26 1
FAIL test-iocccsize/trigraph1.c: got 50 64 1 != expect 49 64 0
FAIL test-iocccsize/trigraph2.c: got 18 24 0 != expect 12 24 0
FAIL test-iocccsize/trigraph3.c: got 22 38 1 != expect 19 38 1
-OK- test-iocccsize/main0.c: 22 47 4
-OK- test-iocccsize/hello.c: 58 101 6
FAIL test-iocccsize/hello_digraph.c: got 70 108 5 != expect 60 108 6
FAIL test-iocccsize/hello_trigraph.c: got 73 111 5 != expect 58 111 6
-OK- test-iocccsize/include0.c: 10 21 1
-OK- test-iocccsize/include1.c: 26 46 2
-OK- test-iocccsize/curly0.c: 12 25 1
-OK- test-iocccsize/curly1.c: 119 192 6
-OK- test-iocccsize/curly2.c: 113 196 6
-OK- test-iocccsize/semicolon0.c: 10 22 1
-OK- test-iocccsize/semicolon1.c: 65 133 8
-OK- test-iocccsize/semicolon2.c: 67 127 8
make: *** [test] Error 1

I hope that my changes will not conflict with the resolution. Maybe you can hold off for a bit longer - until I have done a pull request?

xexyl commented 2 years ago

Okay I'm looking at a diff to see that I got everything. If I did I will copy the files, delete the fork and recreate it and then do the commit. I'll leave off the commit that fixes the formed_UTC - or rather adds the check; the format is fixed so all I have to do is add the call to strptime(3). Update soon! I don't know about other things the rest of the day today; it was a difficult night. I want to work more on it though which is a good sign.

lcn2 commented 2 years ago

Uh oh. I see a make test error with the changes to iocccsize:

RUNNING: iocccsize-test.sh
./iocccsize-test.sh -v
-OK- test-iocccsize/crlf.c: 2 8 1
-OK- test-iocccsize/utf8.c: 12 21 1
-OK- test-iocccsize/splitline0.c: 19 36 1
-OK- test-iocccsize/comment0.c: 44 58 1
-OK- test-iocccsize/comment1.c: 46 61 1
-OK- test-iocccsize/comment2.c: 27 37 1
-OK- test-iocccsize/comment3.c: 6 7 0
-OK- test-iocccsize/comment4.c: 14 19 0
-OK- test-iocccsize/comment5.c: 17 21 0
-OK- test-iocccsize/comment6.c: 30 42 1
-OK- test-iocccsize/comment7.c: 37 49 0
-OK- test-iocccsize/quote0.c: 16 24 1
-OK- test-iocccsize/quote1.c: 12 20 1
-OK- test-iocccsize/quote2.c: 14 22 1
FAIL test-iocccsize/digraph.c: got 16 24 1 != expect 14 24 1
FAIL test-iocccsize/trigraph0.c: got 18 26 1 != expect 14 26 1
FAIL test-iocccsize/trigraph1.c: got 50 64 1 != expect 49 64 0
FAIL test-iocccsize/trigraph2.c: got 18 24 0 != expect 12 24 0
FAIL test-iocccsize/trigraph3.c: got 22 38 1 != expect 19 38 1
-OK- test-iocccsize/main0.c: 22 47 4
-OK- test-iocccsize/hello.c: 58 101 6
FAIL test-iocccsize/hello_digraph.c: got 70 108 5 != expect 60 108 6
FAIL test-iocccsize/hello_trigraph.c: got 73 111 5 != expect 58 111 6
-OK- test-iocccsize/include0.c: 10 21 1
-OK- test-iocccsize/include1.c: 26 46 2
-OK- test-iocccsize/curly0.c: 12 25 1
-OK- test-iocccsize/curly1.c: 119 192 6
-OK- test-iocccsize/curly2.c: 113 196 6
-OK- test-iocccsize/semicolon0.c: 10 22 1
-OK- test-iocccsize/semicolon1.c: 65 133 8
-OK- test-iocccsize/semicolon2.c: 67 127 8
make: *** [test] Error 1

I hope that my changes will not conflict with the resolution. Maybe you can hold off for a bit longer - until I have done a pull request?

Sorry, we did not see this last part until just now. Hopefully we didn't mess up things on your end with our previous changes.

We have a number of changes to *.sh scripts as per the shellcheck tool, however we will wait until after your next commit.

lcn2 commented 2 years ago

Uh oh. I see a make test error with the changes to iocccsize:

RUNNING: iocccsize-test.sh
./iocccsize-test.sh -v
-OK- test-iocccsize/crlf.c: 2 8 1
-OK- test-iocccsize/utf8.c: 12 21 1
-OK- test-iocccsize/splitline0.c: 19 36 1
-OK- test-iocccsize/comment0.c: 44 58 1
-OK- test-iocccsize/comment1.c: 46 61 1
-OK- test-iocccsize/comment2.c: 27 37 1
-OK- test-iocccsize/comment3.c: 6 7 0
-OK- test-iocccsize/comment4.c: 14 19 0
-OK- test-iocccsize/comment5.c: 17 21 0
-OK- test-iocccsize/comment6.c: 30 42 1
-OK- test-iocccsize/comment7.c: 37 49 0
-OK- test-iocccsize/quote0.c: 16 24 1
-OK- test-iocccsize/quote1.c: 12 20 1
-OK- test-iocccsize/quote2.c: 14 22 1
FAIL test-iocccsize/digraph.c: got 16 24 1 != expect 14 24 1
FAIL test-iocccsize/trigraph0.c: got 18 26 1 != expect 14 26 1
FAIL test-iocccsize/trigraph1.c: got 50 64 1 != expect 49 64 0
FAIL test-iocccsize/trigraph2.c: got 18 24 0 != expect 12 24 0
FAIL test-iocccsize/trigraph3.c: got 22 38 1 != expect 19 38 1
-OK- test-iocccsize/main0.c: 22 47 4
-OK- test-iocccsize/hello.c: 58 101 6
FAIL test-iocccsize/hello_digraph.c: got 70 108 5 != expect 60 108 6
FAIL test-iocccsize/hello_trigraph.c: got 73 111 5 != expect 58 111 6
-OK- test-iocccsize/include0.c: 10 21 1
-OK- test-iocccsize/include1.c: 26 46 2
-OK- test-iocccsize/curly0.c: 12 25 1
-OK- test-iocccsize/curly1.c: 119 192 6
-OK- test-iocccsize/curly2.c: 113 196 6
-OK- test-iocccsize/semicolon0.c: 10 22 1
-OK- test-iocccsize/semicolon1.c: 65 133 8
-OK- test-iocccsize/semicolon2.c: 67 127 8
make: *** [test] Error 1

I hope that my changes will not conflict with the resolution. Maybe you can hold off for a bit longer - until I have done a pull request?

If it helps, we notice that the FAIL lines report an iocccsize -v 1 result that is consistent with DIGRAPHS and TRIGRAPHS being enabled.

The limit_ioccc.sh file, sourced by the iocccsize-test.sh script, should have picked up on the define or lack of define in limit_ioccc.h for the make limit_ioccc.sh rule.

lcn2 commented 2 years ago

BTW: The use of -DIOCCCSIZE_STANDALONE in the Makefile for building iocccsize is wrong. It should be -DMKIOCCCENTRY_USE.

xexyl commented 2 years ago

No. It’s fine. I have no working changes. That was from yesterday and all is taken care of.

Thank you though! More from me tomorrow. Hack away!

I will react to the other stuff later.

On Feb 21, 2022, at 13:51, Landon Curt Noll wrote:

Uh oh. I see a make test error with the changes to iocccsize:

RUNNING: iocccsize-test.sh
./iocccsize-test.sh -v
-OK- test-iocccsize/crlf.c: 2 8 1
-OK- test-iocccsize/utf8.c: 12 21 1
-OK- test-iocccsize/splitline0.c: 19 36 1
-OK- test-iocccsize/comment0.c: 44 58 1
-OK- test-iocccsize/comment1.c: 46 61 1
-OK- test-iocccsize/comment2.c: 27 37 1
-OK- test-iocccsize/comment3.c: 6 7 0
-OK- test-iocccsize/comment4.c: 14 19 0
-OK- test-iocccsize/comment5.c: 17 21 0
-OK- test-iocccsize/comment6.c: 30 42 1
-OK- test-iocccsize/comment7.c: 37 49 0
-OK- test-iocccsize/quote0.c: 16 24 1
-OK- test-iocccsize/quote1.c: 12 20 1
-OK- test-iocccsize/quote2.c: 14 22 1
FAIL test-iocccsize/digraph.c: got 16 24 1 != expect 14 24 1
FAIL test-iocccsize/trigraph0.c: got 18 26 1 != expect 14 26 1
FAIL test-iocccsize/trigraph1.c: got 50 64 1 != expect 49 64 0
FAIL test-iocccsize/trigraph2.c: got 18 24 0 != expect 12 24 0
FAIL test-iocccsize/trigraph3.c: got 22 38 1 != expect 19 38 1
-OK- test-iocccsize/main0.c: 22 47 4
-OK- test-iocccsize/hello.c: 58 101 6
FAIL test-iocccsize/hello_digraph.c: got 70 108 5 != expect 60 108 6
FAIL test-iocccsize/hello_trigraph.c: got 73 111 5 != expect 58 111 6
-OK- test-iocccsize/include0.c: 10 21 1
-OK- test-iocccsize/include1.c: 26 46 2
-OK- test-iocccsize/curly0.c: 12 25 1
-OK- test-iocccsize/curly1.c: 119 192 6
-OK- test-iocccsize/curly2.c: 113 196 6
-OK- test-iocccsize/semicolon0.c: 10 22 1
-OK- test-iocccsize/semicolon1.c: 65 133 8
-OK- test-iocccsize/semicolon2.c: 67 127 8
make: *** [test] Error 1

I hope that my changes will not conflict with the resolution. Maybe you can hold off for a bit longer - until I have done a pull request?

Sorry, we did not see this last part until just now. Hopefully we didn't mess up things on your end with our previous changes.

We have a number of changes to the *.sh as per the shellcheck tool, however we will wait until after your next commit.


xexyl commented 2 years ago

Uh oh. I see a make test error with the changes to iocccsize:

RUNNING: iocccsize-test.sh
./iocccsize-test.sh -v
-OK- test-iocccsize/crlf.c: 2 8 1
-OK- test-iocccsize/utf8.c: 12 21 1
-OK- test-iocccsize/splitline0.c: 19 36 1
-OK- test-iocccsize/comment0.c: 44 58 1
-OK- test-iocccsize/comment1.c: 46 61 1
-OK- test-iocccsize/comment2.c: 27 37 1
-OK- test-iocccsize/comment3.c: 6 7 0
-OK- test-iocccsize/comment4.c: 14 19 0
-OK- test-iocccsize/comment5.c: 17 21 0
-OK- test-iocccsize/comment6.c: 30 42 1
-OK- test-iocccsize/comment7.c: 37 49 0
-OK- test-iocccsize/quote0.c: 16 24 1
-OK- test-iocccsize/quote1.c: 12 20 1
-OK- test-iocccsize/quote2.c: 14 22 1
FAIL test-iocccsize/digraph.c: got 16 24 1 != expect 14 24 1
FAIL test-iocccsize/trigraph0.c: got 18 26 1 != expect 14 26 1
FAIL test-iocccsize/trigraph1.c: got 50 64 1 != expect 49 64 0
FAIL test-iocccsize/trigraph2.c: got 18 24 0 != expect 12 24 0
FAIL test-iocccsize/trigraph3.c: got 22 38 1 != expect 19 38 1
-OK- test-iocccsize/main0.c: 22 47 4
-OK- test-iocccsize/hello.c: 58 101 6
FAIL test-iocccsize/hello_digraph.c: got 70 108 5 != expect 60 108 6
FAIL test-iocccsize/hello_trigraph.c: got 73 111 5 != expect 58 111 6
-OK- test-iocccsize/include0.c: 10 21 1
-OK- test-iocccsize/include1.c: 26 46 2
-OK- test-iocccsize/curly0.c: 12 25 1
-OK- test-iocccsize/curly1.c: 119 192 6
-OK- test-iocccsize/curly2.c: 113 196 6
-OK- test-iocccsize/semicolon0.c: 10 22 1
-OK- test-iocccsize/semicolon1.c: 65 133 8
-OK- test-iocccsize/semicolon2.c: 67 127 8
make: *** [test] Error 1

I hope that my changes will not conflict with the resolution. Maybe you can hold off for a bit longer - until I have done a pull request?

If it helps, we notice that the FAIL lines report an iocccsize -v 1 result that is consistent with DIGRAPHS and TRIGRAPHS being enabled.

The limit_ioccc.sh file, sourced by the iocccsize-test.sh script, should have picked up on the define or lack of define in limit_ioccc.h for the make limit_ioccc.sh rule.

Yes. I came to the same conclusion in the other thread. Perhaps you fixed it already? Well I will find out tomorrow.

xexyl commented 2 years ago

BTW: The use of -DIOCCCSIZE_STANDALONE in the Makefile for building iocccsize is wrong. It should be -DMKIOCCCENTRY_USE.

I don’t remember adding that but either way I guess you have fixed it. If you don’t get to it I will do it in the morning.

Well I think I actually ended up replying to everything after all but if not I will tomorrow. Really going now. Good night!

lcn2 commented 2 years ago

With commit eff67fd20adf97054c1d87a93bc4998c70ec1158 we may have fixed it. Certainly we have improved that script.

If all is well, we may submit a followup pull request to @SirWumpus's iocccsize repo for his consideration.

What do you think of the test now, @xexyl?

xexyl commented 2 years ago

With commit eff67fd we may have fixed it. Certainly we have improved that script.

If all is well, we may submit a followup pull request to @SirWumpus's iocccsize repo for his consideration.

What do you think of the test now, @xexyl?

I just saw this. I think that’s a good idea. The script does indeed work again.

lcn2 commented 2 years ago

The pull request for @SirWumpus's iocccsize repo has been updated to reflect recent changes / fixes to iocccsize and iocccsize-test.sh.

xexyl commented 2 years ago

The pull request for @SirWumpus's iocccsize repo has been updated to reflect recent changes / fixes to iocccsize and iocccsize-test.sh.

Thank you!

xexyl commented 2 years ago

I'm not sure if we discussed this or not but it's not in the original requirements so I'm asking.

Should the twitter and GitHub handles - their values I mean - be tested that the first character is a @?

What about URLs? That they start with http:// or https:// (which I guess is after decoding)?

I have a ways to go before I get to this point I think but they crossed my mind.

One other thing: should there be other social media platforms added to the authors?

xexyl commented 2 years ago

One other thing: should there be other social media platforms added to the authors?

I don't mind of course but it might be something that some people would like to share? This might include LinkedIn but maybe others too?

xexyl commented 2 years ago

I'm not sure if we discussed this or not but it's not in the original requirements so I'm asking.

Should the twitter and GitHub handles - their values I mean - be tested that the first character is a @?

What about URLs? That they start with http:// or https:// (which I guess is after decoding)?

Okay and emails too of course should have a @. I looked back and saw that we did in fact discuss this. This made me wonder: what do you suggest is the best way to address the fact that I might have missed some tests that have to be done? Should we come up with a file for each tool or should the OP be updated?

Maybe I should just continue working on it and when I think I'm done we can look at it and make sure? I guess that works. Anyway I'm done here for now and probably the day.

lcn2 commented 2 years ago

I'm not sure if we discussed this or not but it's not in the original requirements so I'm asking.

Should the twitter and GitHub handles - their values I mean - be tested that the first character is a @?

Yes. Also no other @ allowed after the first character.

What about URLs? That they start with http:// or https:// (which I guess is after decoding)?

Yes.
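A minimal sketch of the two checks just agreed to, assuming the values have already been decoded into C strings. The function names handle_ok() and url_ok() are hypothetical, and requiring at least one character after the leading @ is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * handle_ok - check a twitter or GitHub handle
 *
 * returns: true ==> the 1st character is @, something follows it,
 *		     and no other @ appears after the 1st character
 */
static bool
handle_ok(char const *handle)
{
    if (handle == NULL || handle[0] != '@' || handle[1] == '\0') {
	return false;
    }
    return strchr(handle + 1, '@') == NULL;
}

/*
 * url_ok - check that a URL starts with http:// or https://
 */
static bool
url_ok(char const *url)
{
    static char const http[] = "http://";
    static char const https[] = "https://";

    if (url == NULL) {
	return false;
    }
    return strncmp(url, http, sizeof(http) - 1) == 0 ||
	   strncmp(url, https, sizeof(https) - 1) == 0;
}
```

Anything beyond the prefix, such as whether the rest of the URL is well formed, is left alone here.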

I have a ways to go before I get to this point I think but they crossed my mind.

One other thing: should there be other social media platforms added to the authors?

We added twitter because of how/where we announce winners. Knowing a new winner's twitter handle allows us to name them (assuming they provide it).

We added GitHub for reference in regards to future pull requests from the winner. OK, GitHub is not exactly a social media platform .. but you get the idea. :-)

Beyond that: the URL can be a general concept for those with pages on various social media platforms. We don't see a need to add more social media elements ... unless we are somehow missing something obvious?

lcn2 commented 2 years ago

One other thing: should there be other social media platforms added to the authors?

I don't mind of course but it might be something that some people would like to share? This might include LinkedIn but maybe others too?

An alternative approach would be to allow an author to list more than one URL: not sure.

xexyl commented 2 years ago

I'm not sure if we discussed this or not but it's not in the original requirements so I'm asking. Should the twitter and GitHub handles - their values I mean - be tested that the first character is a @?

Yes. Also no other @ allowed after the first character.

I have a vague memory of this now you say it.

What about URLs? That they start with http:// or https:// (which I guess is after decoding)?

Yes.

Right. This of course comes later but good to know.

I have a ways to go before I get to this point I think but they crossed my mind. One other thing: should there be other social media platforms added to the authors?

We added twitter because of how/where we announce winners. Knowing a new winner's twitter handle allows us to name them (assuming they provide it).

Yes I figured that.

We added GitHub for reference in regards to future pull requests from the winner. OK, GitHub is not exactly a social media platform .. but you get the idea. :-)

True. I figured this as well.

Beyond that: the URL can be a general concept for those with pages on various social media platforms. We don't see a need to add more social media elements ... unless we are somehow missing something obvious?

Perhaps: if I wanted to include something in addition to my url. I don't know that I do but that's a possible reason to have it.

xexyl commented 2 years ago

One other thing: should there be other social media platforms added to the authors?

I don't mind of course but it might be something that some people would like to share? This might include LinkedIn but maybe others too?

An alternative approach would be to allow an author to list more than one URL: not sure.

That could work too and it's something I have thought of (though I'm not certain if that's in relation to this thought) though I guess there's a problem with all of these: how far do you go? That applies to every reference even. Still might be an idea to consider more than one URL though how many I don't know.

lcn2 commented 2 years ago

One other thing: should there be other social media platforms added to the authors?

I don't mind of course but it might be something that some people would like to share? This might include LinkedIn but maybe others too?

An alternative approach would be to allow an author to list more than one URL: not sure.

That could work too and it's something I have thought of (though I'm not certain if that's in relation to this thought) though I guess there's a problem with all of these: how far do you go? That applies to every reference even. Still might be an idea to consider more than one URL though how many I don't know.

It may be a good idea to consider multiple URLs as a potential enhancement for later .. otherwise there is a risk of never coming to the end of this enhancement :-).

xexyl commented 2 years ago

One other thing: should there be other social media platforms added to the authors?

I don't mind of course but it might be something that some people would like to share? This might include LinkedIn but maybe others too?

An alternative approach would be to allow an author to list more than one URL: not sure.

That could work too and it's something I have thought of (though I'm not certain if that's in relation to this thought) though I guess there's a problem with all of these: how far do you go? That applies to every reference even. Still might be an idea to consider more than one URL though how many I don't know.

It may be a good idea to consider multiple URLs as a potential enhancement for later .. otherwise there is a risk of never coming to the end of this enhancement :-).

Oh that's true. jauthchk should be finished of course and I have enough to do already. But since it's about the author I figured I'd bring it up now.

xexyl commented 2 years ago

Before I go to sleep:

I have been thinking about the issue of the author handles off and on for some days and I started to think that maybe there shouldn’t be another tool that tries to create a POSIX portable name. This is why.

The problem is that although I know that in German you can add an E if you don’t have the umlaut i.e. ö -> oe, ü-> ue etc. which means über ‘above’ (and other translations depending on context) can be spelt like ueber I have no idea about other languages. I am far from fluent in German but I have a love for it and I know some German. My interest in it is another story entirely but the point is that I don’t really know how to correctly handle all languages.

I mean what should Spanish ñ become? I haven’t the slightest clue! I guess the only answer would be n but what if they want to change it to something else? If we explain to them the reason they might have a better idea: rather than us making assumptions.

So I am thinking that maybe it’s not a good idea to try and do this. What should be done instead?

I think the best thing that can be done is explain to them the reason for the restrictions and they can decide: after all a German would know the rule I described but other people who have names in other languages would also know the rules for their language.

Perhaps it’s an idea to write some code to figure this out but will it be correct? Probably not which means it probably shouldn’t be done.

What are your thoughts? If you agree with this perhaps you would like to write the explanation? I would be happy to add it to the source code so that you can focus on the website and so it doesn’t have any possible conflicts with any changes I make tomorrow (though I don’t think I will have to modify the mkiocccentry code).

And with that I am going to sleep. Good night!

lcn2 commented 2 years ago

Before I go to sleep:

I have been thinking about the issue of the author handles off and on for some days and I started to think that maybe there shouldn’t be another tool that tries to create a POSIX portable name. This is why.

The problem is that although I know that in German you can add an E if you don’t have the umlaut i.e. ö -> oe, ü-> ue etc. which means über ‘above’ (and other translations depending on context) can be spelt like ueber I have no idea about other languages. I am far from fluent in German but I have a love for it and I know some German. My interest in it is another story entirely but the point is that I don’t really know how to correctly handle all languages.

I mean what should Spanish ñ become? I haven’t the slightest clue! I guess the only answer would be n but what if they want to change it to something else? If we explain to them the reason they might have a better idea: rather than us making assumptions.

So I am thinking that maybe it’s not a good idea to try and do this. What should be done instead?

I think the best thing that can be done is explain to them the reason for the restrictions and they can decide: after all a German would know the rule I described but other people who have names in other languages would also know the rules for their language.

Perhaps it’s an idea to write some code to figure this out but will it be correct? Probably not which means it probably shouldn’t be done.

What are your thoughts? If you agree with this perhaps you would like to write the explanation? I would be happy to add it to the source code so that you can focus on the website and so it doesn’t have any possible conflicts with any changes I make tomorrow (though I don’t think I will have to modify the mkiocccentry code).

You might be making this winner_handle process much harder than it needs to be by trying to focus too much on linguistics.

The winner_handle is for computers, not humans, to use. So linguistic things such as ö -> oe may be beyond the scope of what computers need.

It would be nice if the winner_handle looked like a person's name. Because the winner_handle will be used to form a filename in the web site, the winner_handle needs to be POSIX portable filename safe. So it is OK to map common UTF-8 letters with accent marks into an ASCII character.

We wouldn't say no to ö -> oe, if that is an easy thing to do. We suggest you try not to let such improvements have a significant impact on code complexity and time to write. Perhaps a simple UTF-8 to ASCII string map table may be good enough:

struct utf8_ascii_map {
    char *utf8_str;    /* UTF-8 string - use \x hex as needed */
    char *ascii_str;   /* ASCII good enough replacement for utf8_str */
};

Some util.c function could, given a string, return a malloced default winner_handle value. Such a function would need to have a default for a non-POSIX safe character that was not translated. We recommend that such non-POSIX safe characters that are not translated simply be dropped? Of course, one would need to deal with a case where such character dropping resulted in an empty string and replace that empty string with something. And if the translation produced a winner_handle that was too long, use some form of truncation.

For this repo, and mkiocccentry in particular, this may be the way to go.

For someone who is NOT a previous IOCCC winner:

Enter author name: søm`füll nåmé
...
Enter author affiliation, or press return to skip:
Is the author a previous IOCCC winner [yn]: n
Enter an author handle,
    OR press return to accept [somfull_name]:

For someone who is a previous IOCCC winner:

Enter author name: søm`füll nåmé
...
Enter author affiliation, or press return to skip:
Is the author a previous IOCCC winner [yn]: y
Enter author's previous IOCCC winner handle (if known),
    OR press return to accept [somfull_name]:

So mkiocccentry, via some function in util.c (that uses some UTF-8 to ASCII string map table) would compute a potential winner_handle. The user is free to press return and accept the default, or enter a winner handle if they are a previous winner, or enter something else, including a better ASCII translation.

If someone wanted to improve on the UTF-8 to ASCII string map table generated, they could. We would recommend starting off with a basic table, get the code working, and then later improve it.

We would want a boolean flag that would indicate if the default winner_handle was used for each author, that is, whether someone just pressed return or entered their own value for winner_handle:

"default_author_handle" : true,
"previous_winner" : false,

Update: It would be a very good idea if winner_handle was just author_handle.

xexyl commented 2 years ago

Before I go to sleep: I have been thinking about the issue of the author handles off and on for some days and I started to think that maybe there shouldn’t be another tool that tries to create a POSIX portable name. This is why. The problem is that although I know that in German you can add an E if you don’t have the umlaut i.e. ö -> oe, ü-> ue etc. which means über ‘above’ (and other translations depending on context) can be spelt like ueber I have no idea about other languages. I am far from fluent in German but I have a love for it and I know some German. My interest in it is another story entirely but the point is that I don’t really know how to correctly handle all languages. I mean what should Spanish ñ become? I haven’t the slightest clue! I guess the only answer would be n but what if they want to change it to something else? If we explain to them the reason they might have a better idea: rather than us making assumptions. So I am thinking that maybe it’s not a good idea to try and do this. What should be done instead? I think the best thing that can be done is explain to them the reason for the restrictions and they can decide: after all a German would know the rule I described but other people who have names in other languages would also know the rules for their language. Perhaps it’s an idea to write some code to figure this out but will it be correct? Probably not which means it probably shouldn’t be done. What are your thoughts? If you agree with this perhaps you would like to write the explanation? I would be happy to add it to the source code so that you can focus on the website and so it doesn’t have any possible conflicts with any changes I make tomorrow (though I don’t think I will have to modify the mkiocccentry code).

You might be making this winner_handle process much harder than it needs to be by trying to focus too much on linguistics.

Maybe but I think possibly you misunderstood.

The winner_handle is for computers, not humans, to use. So linguistic things such as ö -> oe may be beyond the scope of what computers need.

That’s what I was trying to say actually: that no matter what is done there’s a chance it won’t be correct so is it worth trying to calculate the correct values? Maybe it would be better to tell them briefly what is needed and let them decide what to enter?

It would be nice if the winner_handle looked like a person's name. Because the winner_handle will be used to form a filename in the web site, the winner_handle needs to be POSIX portable filename safe. So it is OK to map common UTF-8 letters with accent marks into an ASCII character.

Right. That’s kind of where I was going with it. Since it will be a filename it has to be portable and diacritics are not.

We wouldn't say no to ö -> oe, if that is an easy thing to do. We suggest you try not to let such improvements have a significant impact on code complexity and time to write. Perhaps a simple UTF-8 to ASCII string map table may be good enough:

I think the problem with this - and it’s what I was getting at - is that this rule probably doesn’t apply to every language with umlauts.

struct utf8_ascii_map {
    char *utf8_str;    /* UTF-8 string - use \x hex as needed */
    char *ascii_str;   /* ASCII good enough replacement for utf8_str */
};

I could look into this but I would have to actually look into it because I have never played with UTF8 (or any of the others). I have had to use iconv to get xml parsing working but that was a quick command and that was that.

It would be good to learn more about this though so maybe after the other tools are done I can look into this.

Some util.c function could, given a string, return a malloced default winner_handle value. Such a function would need to have a default for a non-POSIX safe character that was not translated. We recommend that such non-POSIX safe characters that are not translated simply be dropped? Of course, one would need to deal with a case where such character dropping resulted in an empty string and replace that empty string with something. And if the translation produced a winner_handle that was too long, use some form of truncation.

That’s a possibility. It might be okay to simply drop the invalid characters. But of course as you say if they’re all invalid then what?

This is why I was thinking it might be an idea to tell them the details and let them input it as they would know what is best for their name? I mean it’s likely that they have had this problem before.

For this repo, and mkiocccentry in particular, this may be the way to go.

For someone who is NOT a previous IOCCC winner:
Enter author name: søm`füll nåmé
...
Enter author affiliation, or press return to skip:
Is the author a previous IOCCC winner [yn]: n
Enter an author handle,
    OR press return to accept [somfull_name]: 
For someone who is a previous IOCCC winner:
Enter author name: søm`füll nåmé
...
Enter author affiliation, or press return to skip:
Is the author a previous IOCCC winner [yn]: y
Enter author's previous IOCCC winner handle (if known),
    OR press return to accept [somfull_name]: 
So mkiocccentry, via some function in util.c (that uses some UTF-8 to ASCII string map table) would compute a potential winner_handle. The user is free to press return and accept the default, or enter a winner handle if they are a previous winner, or enter something else, including a better ASCII translation.

That might work. But this brings up the obvious question: how did you deal with it in the past? I do recall back in 2018 one of the winners' names was causing a problem. I think his name started with A and perhaps it was Ä? How did you go about dealing with this? I think it was something that Simon dealt with but perhaps wisdom from that experience might give some clues as to how to solve this?

If someone wanted to improve on the UTF-8 to ASCII string map table generated, they could. We would recommend starting off with a basic table, get the code working, and then later improve it.

That always is the way to go I would think but is it actually necessary? Certainly it is necessary if the tools don’t detect invalid characters but if they do and they tell them the problem is it actually needed? I don’t know.

Of course it’s something that can come later or at least after the jauthchk is finished but that’s why I brought up the thought now.

We would want a boolean flag that would indicate if the default winner_handle was used for each author, that is, whether someone just pressed return or entered their own value for winner_handle:
"default_author_handle" : true,
"previous_winner" : false,

Okay.

Update: It would be a very good idea if winner_handle was just author_handle.

Do you want me to change the variable name in the tools tomorrow? I am happy to do that. I also think that might be a good idea. Granted I don’t know all you have in mind but if you changed it to be an author handle you might not have to worry about if the person previously won? Or is there something else you have in mind?

lcn2 commented 2 years ago

Do you want me to change the variable name in the tools tomorrow? I am happy to do that. I also think that might be a good idea. Granted I don’t know all you have in mind but if you changed it to be an author handle you might not have to worry about if the person previously won? Or is there something else you have in mind?

Yes, if you please.

lcn2 commented 2 years ago

```c
struct utf8_ascii_map {
    char *utf8_str;    /* UTF-8 string - use \x hex as needed */
    char *ascii_str;   /* ASCII good enough replacement for utf8_str */
};
```

I could look into this but I would have to actually look into it because I have never played with UTF8 (or any of the others). I have had to use iconv to get xml parsing working but that was a quick command and that was that. It would be good to learn more about this though so maybe after the other tools are done I can look into this.

Let us think some more about this UTF-8 translation.

We have started the formation of a possible translation table. We will be adding that table to util.c. Just the table, not any code that does the translation.

Some util.c function could, given a string, return a malloced default winner_handle value. Such a function would need to have a default for a non-POSIX safe character that was not translated. We recommend that such non-POSIX safe characters that are not translated simply be dropped?

Perhaps if there are non-POSIX safe characters that are not translated by the table, then drop them?

Of course, one would need to deal with a case where such character dropping resulted in an empty string and replace that empty string with something. And if the translation produced a winner_handle that was too long, use some form of truncation.

That’s a possibility. It might be okay to simply drop the invalid characters. But of course as you say if they’re all invalid then what?

Return some strdup() string such as --empty--.
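The two edge cases above (an empty result and a result that is too long) could be handled in one small step after translation. The following is only a sketch under stated assumptions: `finalize_handle()`, `EMPTY_HANDLE` and the `max_len` parameter are hypothetical names, not anything in util.c, and it uses `malloc()` plus `memcpy()` rather than `strdup()` so that truncation and copying happen together.

```c
#include <stdlib.h>
#include <string.h>

#define EMPTY_HANDLE "--empty--"	/* placeholder suggested in this thread */

/*
 * finalize_handle - hypothetical helper, not part of util.c
 *
 * Given an already translated handle, return a malloced copy that is
 * never empty and never longer than max_len bytes.
 *
 * Returns NULL on a NULL argument, max_len == 0, or malloc failure.
 * The caller must free() the result.
 */
char *
finalize_handle(const char *translated, size_t max_len)
{
    size_t len;
    char *ret;

    if (translated == NULL || max_len == 0) {
	return NULL;
    }

    /* every character was dropped: fall back to the placeholder */
    if (translated[0] == '\0') {
	translated = EMPTY_HANDLE;
    }

    /* simple truncation: keep only the first max_len bytes */
    len = strlen(translated);
    if (len > max_len) {
	len = max_len;
    }

    ret = malloc(len + 1);
    if (ret == NULL) {
	return NULL;
    }
    memcpy(ret, translated, len);
    ret[len] = '\0';
    return ret;
}
```

Because the handle is plain ASCII by the time it reaches this step, cutting at a byte boundary cannot split a multi-byte character.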

This is why I was thinking it might be an idea to tell them the details and let them input it as they would know what is best for their name? I mean it’s likely that they have had this problem before.

Good point.

xexyl commented 2 years ago

Do you want me to change the variable name in the tools tomorrow? I am happy to do that. I also think that might be a good idea. Granted I don’t know all you have in mind but if you changed it to be an author handle you might not have to worry about if the person previously won? Or is there something else you have in mind?

Yes, if you please.

Done. I've not pushed it yet but I will later today.

xexyl commented 2 years ago

Let us think some more about this UTF-8 translation. We have started the formation of a possible translation table. We will be adding that table to util.c. Just the table, not any code that does the translation.

I like what you've done so far!

Some util.c function could, given a string, return a malloced default winner_handle value. Such a function would need to have a default for a non-POSIX safe character that was not translated. We recommend that such non-POSIX safe characters that are not translated simply be dropped?

That would be the easiest but would it be the better option? I don't know.

Perhaps if there are non-POSIX safe characters that are not translated by the table, then drop them?

As above the same applies here I think.

Of course, one would need to deal with a case where such character dropping resulted in an empty string and replace that empty string with something. And if the translation produced a winner_handle that was too long, use some form of truncation. That’s a possibility. It might be okay to simply drop the invalid characters. But of course as you say if they’re all invalid then what?

Return some strdup() string such as --empty--.

Or maybe in this case: anonymous?

I mean if they're unwilling to provide a valid author handle do they deserve credit? :)

I'm not entirely serious there but the point is that they should be capable and willing to come up with an author handle: if they've managed to cook up a good entry they should be able to come up with a name for themselves, right?

This is why I was thinking it might be an idea to tell them the details and let them input it as they would know what is best for their name? I mean it’s likely that they have had this problem before.

Good point.

Yeah. This would also make it easier: no extra code would have to be added except a check (if not already there). It could simply ask them to enter an author handle and if they input something invalid explain the rules to them (or maybe explain the general rules to them at first and if they input invalid chars show them the invalid characters and reiterate the issue?). This would remove the burden of coming up with something clever which might be erroneous in some (or all?) cases.
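A check of that kind might look like the sketch below. It assumes the rule is the POSIX portable filename character set (letters, digits, `.`, `_` and `-`, with no leading `-`); the rule mkiocccentry ends up enforcing may well differ. The names `first_invalid_char()` and `is_valid_handle()` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* POSIX portable filename character set (assumed rule for a handle) */
static const char posix_safe[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789._-";

/*
 * first_invalid_char - hypothetical helper
 *
 * Returns a pointer to the first byte of handle that is not in the
 * portable set, or NULL if every byte is acceptable (or handle is NULL).
 * The pointer lets the caller show the user where the problem is.
 */
const char *
first_invalid_char(const char *handle)
{
    size_t ok;

    if (handle == NULL) {
	return NULL;
    }
    ok = strspn(handle, posix_safe);
    return (handle[ok] == '\0') ? NULL : handle + ok;
}

/*
 * is_valid_handle - hypothetical helper
 *
 * Returns 1 if handle is non-empty, does not start with '-', and
 * contains only portable filename characters, otherwise 0.
 */
int
is_valid_handle(const char *handle)
{
    return handle != NULL && handle[0] != '\0' && handle[0] != '-' &&
	   first_invalid_char(handle) == NULL;
}
```

With this shape the prompt loop only needs to call `is_valid_handle()`, and on failure use `first_invalid_char()` to point at the offending byte before asking again.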

lcn2 commented 2 years ago

I'm not entirely serious there but the point is that they should be capable and willing to come up with an author handle: if they've managed to cook up a good entry they should be able to come up with a name for themselves, right?

Correct

xexyl commented 2 years ago

I'm not entirely serious there but the point is that they should be capable and willing to come up with an author handle: if they've managed to cook up a good entry they should be able to come up with a name for themselves, right?

Correct

So should it just be that it explains to them the details, let them input a value and don’t accept invalid input? That would make it a lot easier and less complex.

lcn2 commented 2 years ago

I'm not entirely serious there but the point is that they should be capable and willing to come up with an author handle: if they've managed to cook up a good entry they should be able to come up with a name for themselves, right?

Correct

So should it just be that it explains to them the details, let them input a value and don’t accept invalid input? That would make it a lot easier and less complex.

On a dialogue change that needs to be made to mkiocccentry:

They should be given a default author_handle. We again refer to this suggested extension to the mkiocccentry dialogue:

For someone who is NOT a previous IOCCC winner:

Enter author name: søm`füll nåmé
...
Enter author affiliation, or press return to skip:
Is the author a previous IOCCC winner [yn]: n
Enter an author handle,
    OR press return to accept [somfull_name]:

For someone who is a previous IOCCC winner:

Enter author name: søm`füll nåmé
...
Enter author affiliation, or press return to skip:
Is the author a previous IOCCC winner [yn]: y
Enter author's previous IOCCC winner handle (if known),
    OR press return to accept [somfull_name]:

Note how the answer to the new question, a question that needs to be added to mkiocccentry:

Is the author a previous IOCCC winner [yn]:

causes a change on the next question:

Enter an author handle,
    OR press return to accept [somfull_name]:

or

Enter author's previous IOCCC winner handle (if known),
    OR press return to accept [somfull_name]:

But in either case, a default author_handle is suggested to the user.

On an addition to .author.json:

Given the above, we need to note the answer to the "Is the author a previous IOCCC winner" question:

"previous_winner" : false,

and if the user accepted the default author_handle by pressing return or entered their own string:

"default_author_handle" : true,

for each author in authors array in .author.json.
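As a rough illustration of how those two members might be emitted for one author, here is a sketch. The function name `format_author_flags()` and the buffer-based interface are assumptions made for the example; only the two JSON member names and the `"name" : value,` layout come from the discussion above.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * format_author_flags - hypothetical helper, not part of mkiocccentry
 *
 * Write the two per-author boolean members into buf, one per line,
 * in the layout shown above.
 *
 * Returns 0 on success, -1 if buf is NULL or too small.
 */
int
format_author_flags(char *buf, size_t buflen, bool default_handle,
		    bool previous_winner)
{
    int n;

    if (buf == NULL || buflen == 0) {
	return -1;
    }
    n = snprintf(buf, buflen,
		 "\"default_author_handle\" : %s,\n"
		 "\"previous_winner\" : %s,\n",
		 default_handle ? "true" : "false",
		 previous_winner ? "true" : "false");
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

A tool such as jauthchk could then require both members to be present and to be JSON booleans for every element of the authors array.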

On the topic of the translation table:

The struct utf8_ascii_map hmap[] in util.c is the start of the table that will do this conversion. It needs to be extended so that multi-byte UTF-8 strings such as Ä are turned into A.

Just before leaving, we added the following example to struct utf8_ascii_map hmap[] in util.c:

    {"\xc3\x84", "A"},          /* A with two dots */
    {"\xc3\xa4", "a"},          /* a with two dots */

The reason why struct utf8_ascii_map hmap[] in util.c is a table of 2 strings, rather than a simple byte map, is that UTF-8 symbols such as Ä are multi-byte strings, not single bytes. When this table is used, a first-sub-string-match approach will have to be used on the input. The name will have to be scanned, in table order, for any sub-strings. When a sub-string is found, it is replaced and you move forward. If no sub-string is found in the table, those characters are dropped. This is why, in some cases, the table entry produces an empty string:

    {"~", ""},                  /* ^ */

and in some cases a replacement character:

    {" ", "_"},                 /* SP */

and in a few cases the character is left the same:

    {"0", "0"},                 /* 0 - allowed character */
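The scan described above could be sketched roughly as follows. This is not the util.c code: `find_match()`, `default_handle()` and `demo_map[]` are hypothetical names, the table holds only a handful of entries for illustration, and the struct uses `const char *` where the original declaration uses `char *`. Unmatched input is dropped one byte at a time, which also removes every byte of an unmapped multi-byte character.

```c
#include <stdlib.h>
#include <string.h>

struct utf8_ascii_map {
    const char *utf8_str;	/* UTF-8 string - use \x hex as needed */
    const char *ascii_str;	/* ASCII good enough replacement */
};

/* tiny demo table: the last element must be the NULL terminator */
static const struct utf8_ascii_map demo_map[] = {
    {"\xc3\x84", "A"},		/* A with two dots */
    {"\xc3\xa4", "a"},		/* a with two dots */
    {"~", ""},			/* mapped to nothing */
    {" ", "_"},			/* SP */
    {"0", "0"},			/* allowed character */
    {"b", "b"},			/* allowed character */
    {NULL, NULL}
};

/* return the first table entry, in table order, that s starts with */
static const struct utf8_ascii_map *
find_match(const struct utf8_ascii_map *map, const char *s)
{
    for (; map->utf8_str != NULL; ++map) {
	size_t len = strlen(map->utf8_str);

	if (len > 0 && strncmp(s, map->utf8_str, len) == 0) {
	    return map;
	}
    }
    return NULL;
}

/*
 * default_handle - translate name via map, dropping unmatched bytes
 *
 * Returns a malloced string (possibly empty) or NULL on error.
 * The caller must free() the result.
 */
char *
default_handle(const struct utf8_ascii_map *map, const char *name)
{
    const struct utf8_ascii_map *m;
    size_t out_len = 0;
    const char *p;
    char *ret;
    char *q;

    if (map == NULL || name == NULL) {
	return NULL;
    }

    /* pass 1: compute the output length */
    for (p = name; *p != '\0'; ) {
	m = find_match(map, p);
	if (m != NULL) {
	    out_len += strlen(m->ascii_str);
	    p += strlen(m->utf8_str);
	} else {
	    ++p;		/* no match: drop this byte */
	}
    }

    ret = malloc(out_len + 1);
    if (ret == NULL) {
	return NULL;
    }

    /* pass 2: copy the replacements */
    for (p = name, q = ret; *p != '\0'; ) {
	m = find_match(map, p);
	if (m != NULL) {
	    size_t n = strlen(m->ascii_str);

	    memcpy(q, m->ascii_str, n);
	    q += n;
	    p += strlen(m->utf8_str);
	} else {
	    ++p;
	}
    }
    *q = '\0';
    return ret;
}
```

Sizing the output in a first pass means a replacement longer than its source (such as ö -> oe, if that were ever added) cannot overflow the buffer.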

We think a simple tool can be written to convert such characters (such as Ä) as typed on a keyboard into table entries.
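One possible shape for the core of such a tool is sketched below. The name `format_map_entry()` and its buffer interface are hypothetical, and it assumes the ASCII replacement contains no `"` or `\` characters. Non-ASCII bytes are written as `\x` escapes, and an ASCII hex digit that directly follows an escape is escaped too, so the C compiler cannot read it as part of the preceding escape.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * format_map_entry - hypothetical helper: write one table line into buf
 *
 * utf8 is the character as typed (raw UTF-8 bytes), ascii is its
 * replacement, assumed to contain no '"' or '\\' characters.
 *
 * Returns 0 on success, -1 on a NULL argument or if buf is too small.
 */
int
format_map_entry(const char *utf8, const char *ascii, char *buf, size_t buflen)
{
    const unsigned char *p;
    size_t used;
    int prev_hex = 0;	/* was the previous byte written as \xHH? */
    int n;

    if (utf8 == NULL || ascii == NULL || buf == NULL || buflen == 0) {
	return -1;
    }

    n = snprintf(buf, buflen, "{\"");
    if (n < 0 || (size_t)n >= buflen) {
	return -1;
    }
    used = (size_t)n;

    for (p = (const unsigned char *)utf8; *p != '\0'; ++p) {
	int need_hex = (*p >= 0x7f || *p < 0x20 || *p == '"' ||
			*p == '\\' || (prev_hex && isxdigit(*p)));

	if (need_hex) {
	    n = snprintf(buf + used, buflen - used, "\\x%02x", (unsigned)*p);
	} else {
	    n = snprintf(buf + used, buflen - used, "%c", *p);
	}
	if (n < 0 || (size_t)n >= buflen - used) {
	    return -1;
	}
	used += (size_t)n;
	prev_hex = need_hex;
    }

    n = snprintf(buf + used, buflen - used, "\", \"%s\"},", ascii);
    if (n < 0 || (size_t)n >= buflen - used) {
	return -1;
    }
    return 0;
}
```

Given Ä as typed and the replacement A, this would produce an entry in the same form as the two examples added to the table above.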

xexyl commented 2 years ago

I'm not entirely serious there but the point is that they should be capable and willing to come up with an author handle: if they've managed to cook up a good entry they should be able to come up with a name for themselves, right?

Correct

So should it just be that it explains to them the details, let them input a value and don’t accept invalid input? That would make it a lot easier and less complex.

They should be given a default author_handle. We again refer to this suggested extension to the mkiocccentry dialogue:

For someone who is NOT a previous IOCCC winner:
Enter author name: søm`füll nåmé
...
Enter author affiliation, or press return to skip:
Is the author a previous IOCCC winner [yn]: n
Enter an author handle,
    OR press return to accept [somfull_name]: 

This would be based on their name, right?

For someone who is a previous IOCCC winner:
Enter author name: søm`füll nåmé
...
Enter author affiliation, or press return to skip:
Is the author a previous IOCCC winner [yn]: y
Enter author's previous IOCCC winner handle (if known),
    OR press return to accept [somfull_name]:
Note how the answer to the new question, a question that needs to be added to `mkiocccentry`:
Is the author a previous IOCCC winner [yn]:
causes a change on the next question:
Enter an author handle,
    OR press return to accept [somfull_name]:
or
Enter author's previous IOCCC winner handle (if known),
    OR press return to accept [somfull_name]:

Okay. I'll worry about this later though and I'm sure I'll have more questions; indeed one just popped into my head but I'm too unclear right now as last night was absolutely awful (and as you see the pull request I did - that's the last for the day for certain).

But in either case, a default author_handle is suggested to the user.

Okay.

On the topic of the translation table

The struct utf8_ascii_map hmap[] in util.c is the start of the table that will do this conversion. It needs to be extended so that multi-byte UTF-8 strings such as Ä are turned into A.

I was going to ask about that: what it should be. Thanks.

Just before leaving, we added the following example to struct utf8_ascii_map hmap[] in util.c:
    {"\xc3\x84", "A"},          /* A with two dots */
    {"\xc3\xa4", "a"},          /* a with two dots */

You mean Ä and ä, right?

The reason why struct utf8_ascii_map hmap[] in util.c is a table of 2 strings, rather than a simple byte map, is that UTF-8 symbols such as Ä are multi-byte strings, not single bytes. When this table is used, a first-sub-string-match approach will have to be used on the input. The name will have to be scanned, in table order, for any sub-strings. When a sub-string is found, it is replaced and you move forward. If no sub-string is found in the table, those characters are dropped. This is why, in some cases, the table entry produces an empty string:
    {"~", ""},                  /* ^ */
and in some cases a replacement character:
    {" ", "_"},                 /* SP */
and in a few cases the character is left the same:
    {"0", "0"},                 /* 0 - allowed character */
We think a simple tool can be written to convert such characters (such as Ä) as typed on a keyboard into table entries.

At my current state all I can think of is:

For each character in the name given, go through the table and find the character and then update it to the replacement (whether it's the same or not). But since some can result in an empty value the handle would have to be dynamically allocated based on the new length.

Would that be sufficient? If not I'll have to reread it but another day.

Thanks for the reply.

lcn2 commented 2 years ago

This would be based on their name, right?

Yes. The default author_handle would be computed from the author name.

xexyl commented 2 years ago

This would be based on their name, right?

Yes. The default author_handle would be computed from the author name.

Just wanted to be sure. Well I'll have a reread another day.

Enjoy your holiday - and make it safe!

lcn2 commented 2 years ago

You mean Ä and ä, right?

Yes, those two 2-byte UTF-8 strings were just added as table entries.

We can write a tool to produce such things.

... now we really must be going as the song goes. :-)

xexyl commented 2 years ago

You mean Ä and ä, right?

Yes, those two 2-byte UTF-8 strings were just added as table entries.

So umlaut or diaeresis (same thing).

We can write a tool to produce such things.

Might be an idea.

... now we really must be going as the song goes. :-)

Well I'm afraid I don't know the song but that does not mean very much! I mostly only know one genre and really only one band though it's expanded over the years.

But that doesn't really matter as the point is you have to go.

Be safe!

xexyl commented 2 years ago

As for the ascii table I added a function (called from mkiocccentry_sanity_chk()) that verifies that only the last element has the value of NULL. I also did this for the location table.
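A check along those lines might look like the sketch below. It is not the function that was added to the repo: `map_is_sane()` and its `count` parameter are hypothetical, and the struct uses `const char *` where the original declaration uses `char *`.

```c
#include <stdbool.h>
#include <stddef.h>

struct utf8_ascii_map {
    const char *utf8_str;	/* UTF-8 string - use \x hex as needed */
    const char *ascii_str;	/* ASCII good enough replacement */
};

/*
 * map_is_sane - hypothetical table check
 *
 * count is the number of elements in map, including the terminator.
 *
 * Returns true only if every element before the last has two non-NULL
 * strings and the last element is the {NULL, NULL} terminator.
 */
bool
map_is_sane(const struct utf8_ascii_map *map, size_t count)
{
    size_t i;

    if (map == NULL || count == 0) {
	return false;
    }

    /* a NULL before the end would cut the table short when scanned */
    for (i = 0; i < count - 1; ++i) {
	if (map[i].utf8_str == NULL || map[i].ascii_str == NULL) {
	    return false;
	}
    }

    /* without the terminator a scan would run off the end of the table */
    return map[count - 1].utf8_str == NULL &&
	   map[count - 1].ascii_str == NULL;
}
```

For a table declared as an array, the count can be passed as `sizeof(table) / sizeof(table[0])`.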

I'm not sure if other utils should use these specific tests but I guess they should because if something's wrong with the tables it suggests there's a problem that should be fixed. It's very similar to what I did for the .info.json and .author.json (and their common fields) tables.

I've not committed it and I almost certainly won't today: I just had a moment of energy to do it. I have no plans to do anything else here today but I'm hoping tomorrow I can do more.

lcn2 commented 2 years ago

We were paused at a train station stop with Wi-Fi, so we "snuck" in a change to move the tables in util.c into their own files. This will also make it easier to update the UTF-8 table (now struct utf8_posix_map) independently.

Leaving the station now .. later! :-)
