ioccc-src / temp-test-ioccc

Temporary test IOCCC web site that will go away
Creative Commons Attribution Share Alike 4.0 International

Enhancement: improve the consistency of winning directories #3

Closed lcn2 closed 1 month ago

lcn2 commented 1 year ago

Scope - About this issue

Where reasonable, try to make winning entries consistent.

Consistency includes things such as:

NOTE: This writeup is subject to changes and revisions as the understanding of this issue evolves / improves. Check back to this note from time to time. When modifications are made to this comment, we will try to add a note indicating that the scope of this issue has been updated.

TODO

Suggested section:

NOTE: Feel free to come up with better section titles. A better section title may suggest itself as the FAQs are being written as well.

NOTE: Once the FAQ is mostly done, ask us to add "About the IOCCC" FAQs that have come in over the years. Adding raw detail text for the "About the IOCCC" FAQ section is beyond the scope of this issue. The above TODO is intended to not only build out the framework for the FAQ, but to populate it with most of the easier FAQ entries.

NOTE: either search for and find it and add it to the directory - referring to it in the README.md file or note that the image is missing and ask for help in finding it - ask for a pull request to add it as 1984/anonymous/tattoo.jpg.

NOTE: Also consider this as an entry for bugs.md. This probably means we need a Status of "Missing files".

NOTE: This includes checking that files don't refer to .text files when they should refer to .markdown or .md or whatever (in fact, make it consistent and change the references to the new form, but do it carefully). Ideally they would link to the generated html files instead, whether or not those exist here yet.

NOTE: Regarding the 3 FAQ.md TODOs above:

The audiences for the 3 FAQ.md TODOs above are different. The 1st TODO above is a web site change that needs to refer the person to the winner GitHub repo and submitting a pull request. The 2nd TODO above is about how a winner can update their entry by submitting a pull request. The 3rd TODO above is about how someone other than the winner can submit a pull request.

In the case of the 1st "change to the web site" TODO above: one needs to explain that the web site is managed via the GitHub winner repo and that they need to submit a pull request against the winner GitHub repo.

In the case of the 2nd "how a winner can submit" TODO above: Winners will be given a "wider set of latitude" when it comes to changing their winning entry, so the bugs.md page is less applicable. Nevertheless the winner will need to submit a pull request against the winner GitHub repo.

In the case of the 3rd "change someone else's entry" TODO above, it should tell the person to be sure to read the bugs.md page as well AND that they will need to issue a pull request against the winner GitHub repo.

NOTE: What is missing is how someone who is reading the web site and who may not be familiar with git / GitHub does a pull request. Such a person needs to know how to create a GitHub account if they don't have one, how to fork a repo, how to modify the forked repo on their computer, how to push the changes back to the forked repo, and then how to create a pull request against the winner repo. Such information does NOT belong in the bugs.md page, it belongs on the FAQ.md page, see the next TODO item about GitHub pull requests.

NOTE: It would be a good idea for the bugs.md page to refer someone to the FAQ.md page about how to submit a GitHub pull request.

This will involve explaining that because the web site is managed via the winner GitHub repo, to report a problem they need to open an issue against the winner GitHub repo, including creating a GitHub account if they don't have one.

It would be a good idea for this FAQ to mention the 'bugs.md' page.

The FAQ should consider how someone who is not familiar with GitHub can review the list of open winner repo issues and either add a comment to an existing issue or open a new issue.

NOTE: Currently most of the README.md files are just ASCII text with some inconsistent indentation and a few Markdown # added. Come up with a way to display things like the author information, etc. Turn it into a true Markdown file, not simply an ASCII text file. 😜

NOTE: Apply markdown formatting to README.md files, for example turning:

Some text
---------

into a highlighted level 3 or more entry.

NOTE: In places where the author added highlighted text at level 1 to 2, reduce the level to level 3 or more. The start of the "Author's remarks:" should be at level 2. Everything else should be below that level.

NOTE: In cases where shell commands are used, place them under a download code block starting with ```sh.

NOTE: Where text is indented, with the exception of the top level winner "address / URL info" section, consider if such text should not be indented.

NOTE: Long ago when README.md files were just ASCII text, various methods of highlighting (with underscores, ALL CAPS), blank lines, and indenting were used to help "highlight" text. Such ASCII text formatting needs to be converted to use markdown.

We need only two blank lines before and one blank line after the following markdown headers:

And of course, zero blank lines before and only 1 after the top "# Award" line.

We don't need to apply these rules to any markdown header lines that the author may have added to their "## Author's remarks:" beyond that which is noted above.

All this will help critical markdown headers in README.md stand out when viewing the raw markdown without adding too many blank lines. It will also help README.md look more consistent.

In a few cases, the README.md uses "Original code", or "Original build", or "Original use", etc. Should those instances use the word "Alternate" instead of "Original"? Maybe or maybe not.

All README.md files (unless there is some very good reason for an exception) must have the following markdown headers:

In some cases these mandatory lines are missing, or there is a typo/different wording used in the README.md file.

In cases where there are NO Author's remarks, the text below the header (after a blank line) should read:

No remarks were provided by the author.

Should the "##Try:" be a mandatory markdown header as well? Perhaps not, but if not then most README.md files should probably have one anyway.

This tool is beyond the scope of this issue and will be part of a tool that builds and maintains the web site. The existing tool under tmp/ will be considered in so far as the existing current status.json file format is kept.

The authorship information that may be found in the SQL data that is referenced in issue #4 should be used to update / correct / improve authorship information. In some cases the authorship information was limited in the original file because there was "who won" elsewhere.

Eventually this authorship information will be auto generated. But until that happens, we need to manually improve the authorship information. Therefore the authorship information needs to be consistent in format:

NOTE: In cases where the author information contains a symbol that markdown would interpret, such as an underscore or backquote, that needs to be modified to prevent accidental markdown formatting.
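
As a rough illustration of the kind of escaping meant here, the sketch below backslash-escapes underscores and backquotes in one line of author text. The function name escape_md is hypothetical and is not part of any existing IOCCC tool, and other markdown-significant characters are deliberately not handled.

```sh
#!/bin/sh
# escape_md is a hypothetical helper name, not an existing IOCCC tool.
# It prints its argument with every underscore and backquote preceded by a
# backslash, so that markdown renders those characters literally.
escape_md() {
    printf '%s\n' "$1" | sed -e 's/[_`]/\\&/g'
}
```

For example, escape_md 'some_handle' would print some\_handle, while text without either character passes through unchanged.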

NOTE: In some cases when the README.md file was a simple text file, blank lines were sometimes added to help format author information, such as putting a home page URL after a preceding blank line. Such empty or blank lines should be eliminated.

NOTE: Author information should refer to their country. You could use the country code, or the text of the country. You may wish to refer to a file in mkiocccentry repo source code for such translation.

Makefiles

We need to try and make Makefiles reasonably consistent in type and layout.

Pay attention to consistent use of Makefile variables.

Pay attention to use of whitespace, especially in comments.

The use of the Makefile variable, ${ECHO} should be replaced with just echo and the ECHO= Makefile variable should be removed.
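
A minimal sketch of that edit for a single Makefile follows, assuming the variable is defined on a line that starts with ECHO= (optionally with whitespace before the =) and is only referenced as ${ECHO}. The function name fix_echo is hypothetical. It should only be pointed at Makefiles, not at hint or README files that may contain scripts of their own.

```sh
#!/bin/sh
# fix_echo is a hypothetical helper name.
# Assumptions: the definition is a line beginning with "ECHO=" or "ECHO =",
# and every use is spelled ${ECHO}.
fix_echo() {
    makefile="$1"
    # Drop the ECHO= definition line, then replace each ${ECHO} with echo.
    sed -e '/^ECHO[[:space:]]*=/d' -e 's/\${ECHO}/echo/g' "$makefile" > "$makefile.new" &&
        mv "$makefile.new" "$makefile"
}
```

Reviewing the result with git diff before committing is a sensible check, since a Makefile that spells the variable differently would be left untouched.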

NOTE: While some warnings cannot be silenced, those that can should be silenced with -Wno-stuff flags added to the ${CSILENCE} Makefile variables.

NOTE: The first "CSILENCE=" line should be setup to silence the generic cc warnings with -Wno-stuff. The 2nd "CSILENCE+=" line in the clang area is for clang. This line appends to the generic cc warnings for clang. The 3rd "CSILENCE+=" line in the gcc area is there, if you need it, for a final set of changes. That line also appends to the generic cc warnings, but in this case for gcc.

NOTE: Be sure that the pfmt (version 1.0.1 2023-03-23 or later) and csilence (1.0.2 2023-07-07 or later) code is up to date.

NOTE: We recommend you silence warnings on macOS first. Then carry those Makefiles over to Linux and add more -Wno-stuff for Linux.

NOTE: We recommend, for a given entry on macOS, that you first clear out all CSILENCE values in all 3 Makefile locations. Then use:

csilence -v 1 -c -g clobber all alt

To just print the recommendation for the generic cc CSILENCE first line. Once the Makefile is edited, rerun the command. If your edit was successful, no warnings should be printed by the command.

Next run it for clang only:

csilence -v 1 -C -g clobber all alt

To just print the recommendations for clang. Add those to the 2nd clang CSILENCE+= line. Rerun the command to verify nothing is printed by the command.

Last, try to run the entire set on the entry.

csilence -v 1 clobber all alt

And make any last minute changes that are needed.

Finally check all entries with:

# run csilence in every entry directory under each 4-character year directory
find ???? -mindepth 1 -maxdepth 1 -type d | while IFS= read -r dir; do
    (cd "$dir" && echo =-=-= "$dir" =-=-= && csilence -v 1 clobber all alt)
done 2>&1 | less

Once that is clean for macOS, copy the Makefiles over to Linux and do a similar procedure, but this time DO NOT remove the existing CSILENCE values in the Makefile, just append any new Linux generated values to them.

judges remarks / winner writeup / winner's remarks

The filenames used for the "judges remarks / winner writeup / winner's remarks" files are all over the map. A common filename should be used, such as README.md.

Worse still, in some cases there are multiple copies in different formats such as text and html. There should be ONLY one copy of the file, and that should be in Markdown format. For example in 2018/mills we have both hint.html and hint.text. In this case the one file is a copy of the other file. Only one copy should be kept (a quick check should be made to verify that two apparent duplicate files are, in fact, duplicates). When in doubt, keep the text or markdown copy. If needed, merge the content into a single file if the copies have substantial differences.

NOTE: Later on, such markdown files will be rendered into html by a site building process. So the html copy should NOT be kept.

In cases where the "judges remarks / winner writeup / winner's remarks" file is just hint.text or winner_name.hint, it should be converted into a markdown format file README.md.

The concept of having a "hint" file is going away. So hint.txt files, for example, need to become README.md. Such files should not be treated as spoiler files, but rather as files a user is encouraged to read.

filename extensions

The use of .text extension should be replaced with .txt where reasonable. Be careful when the file is some sort of "data file" where the .text extension may be required.

The use of .markdown extension should be replaced with .md where reasonable. Be careful when the file is some sort of "data file" where the .markdown extension may be required.
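
Because some of these files may be data files that must keep their extension, a dry run that only prints proposed renames may be a safer first step than a mass rename. The sketch below is one way to do that. The function name propose_renames is hypothetical, and the printed git mv lines are meant to be reviewed by hand, not piped into a shell (paths with spaces are not quoted in the output).

```sh
#!/bin/sh
# propose_renames is a hypothetical helper name.
# It prints one suggested "git mv old new" line per .text or .markdown file
# found under the given directory and changes nothing on disk.
propose_renames() {
    find "$1" -type f \( -name '*.text' -o -name '*.markdown' \) |
    while IFS= read -r f; do
        case "$f" in
            *.text)     new="${f%.text}.txt" ;;
            *.markdown) new="${f%.markdown}.md" ;;
        esac
        printf 'git mv %s %s\n' "$f" "$new"
    done
}
```

Any file that refers to a renamed file (README.md, Makefile, years.html) would still need to be updated separately.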

index.html files in winner directories

These files in winner directories will be built by a tool that uses JSON files, HTML templates and the jval and jnamval tools (once they are ready).

This topic is beyond the scope of this issue.

years.html links

This will be built by a tool that uses JSON files, HTML templates and the jval and jnamval tools (once they are ready).

This topic is beyond the scope of this issue.

orig and alt source files

This section has been replaced per the comment 1766839666.

Here are the current rules for orig and alt source files:

A manual check to verify that the prog.orig.c has not changed might be a good idea as a sanity check.

If there is a reason to compile the original source code, make a copy (possibly modified copy) to port.alt.c or something like that.

orig and alt source file TODOs

See also comment 1766839666

When the Makefile rules are discussed, also mention make orig_prog_diff and mention make orig_alt_diff suggesting that those rules can help someone understand what changed.

If the README.md needs to discuss specific changes to the entry source code, refer to the appropriate diff rules.

The year level Makefile and the top level Makefile should support running entry level make orig_prog_diff and make orig_alt_diff rules.

See comment 1802524685

See comment 1802524685

Not only should the top section help emphasize the desire for "low impact" bug fixes, but because some people might not read the top section of that file, every time we ask someone to help fix a bug we should add a standout line near the top of the entry's bugs.md section, a line along the lines (pun intended) of:

When fixing this bug, please strive for a "low impact" bug fix to the source code.

Or something short, surrounded by blank lines to help it stand out, near the top of the entry's bugs.md section.

Other TODOs

NOTE: In macOS when trying to run 2013/mills it did not work and macOS did not actually ask if I wanted to allow connections to it.

Move the stuff about compilers into 1986/marshall/compilers.md and add a simple "see also" reference to that new file in 1986/marshall/README.md.

Update bugs.md for the echo "Quartz .. | ./jaw as found in 1990/jaw/README.md as a bug and ask OTHER people to try and fix it.

See comment 1792689994 for more information.

See also comment 1792926524 as well as comment 1792942954.

There is a note in 1990/jaw/README.md about the echo "Quartz .. | ./jaw" that is actually a bug. Ask OTHER people to try and fix it by adding an entry to bugs.md and making the appropriate edits to 1990/jaw/README.md.

See comment 1901172231 and comment 1901148133.

Final TODOs

See comment 1792961765

See comment 1821159601 for more information.

This has been done in the past a few times, but as entries have been modified a final check using the csilence tool should be done. See the text under the previous "Silence any compiler warnings" TODO for details.

NOTE: The above final check TODO will be done by @lcn2.

Perform a last minute quick scan to see if anything major / anything critical / anything glaring / anything important was missed.

See also:

xexyl commented 1 year ago

Thanks. Do you have an example file I can take a look at ? (It sounds like it's not all files so might be easier to just have an example file name to check.)

Any file that is part of entry in the directory should be listed in the entry's set of URLs in the top level years.html page.

This is why if you add, rename or delete files in an entry directory, the corresponding set of URLs on the years.html page needs to be updated as well.

Kind of like how I did in commit d8441655103fcd82233f5a07abd23c62f2c6814c ?

Once issue #4 is resolved, each directory entry will have a JSON file that contains a manifest. When we have all those manifests, the next tool can be written (we anticipate it being written in a language such as Perl or ac) to generate the years.html file automatically. Until we have those JSON files, we will have to manually maintain the years.html file.

It sounds like it might be better to work on that issue first (rather than manually updating the years.html files). I'll be addressing any comments there in a bit but it would seem like it would save time and effort than to update html files manually. In some cases it's easy enough but in other cases it might be annoying.

As far as what should be in the manifest, at this point, everything in the directory should be listed now. And the corresponding entries in the years.html page need to be updated accordingly.

If I understand right it would be something like:

$ grep ferguson2 MANIFEST 
2020/ferguson2/Makefile
2020/ferguson2/cake.jpg
2020/ferguson2/chocolate-cake.md
2020/ferguson2/enigma.1
2020/ferguson2/README.md
2020/ferguson2/input.txt
2020/ferguson2/ioccc-cake.jpg
2020/ferguson2/obfuscation.key
2020/ferguson2/obfuscation.md
2020/ferguson2/obfuscation.txt
2020/ferguson2/prog.c
2020/ferguson2/recode.1
2020/ferguson2/recode.c
2020/ferguson2/recode.md
2020/ferguson2/try.this.txt

but a JSON file that is much like the manifest member in the .info.json files ?

I would think (though this is at first glance without any thought about it) that it might be kind of easy to extract the files - at least those years that have a manifest - to create a file. Problem is most years don't have a manifest.

If however you mean something else please clarify.

xexyl commented 1 year ago

One thing we could do now is to generate an initial JSON file for all of the existing winning entries as they are now. Here we are talking about the JSON files for each winning directory only (not the author information).

Do you want us to do that?

Please. That might help get things moving.

xexyl commented 1 year ago

Moved comment from issue #2

One idea is to build a set of year level directories that contain the original copies of all entries. So there would be a 1984 directory tree and a orig.1984 tree.

Or maybe 1984.orig to match the style of prog.orig.c that exists in some winning entries ? Also would help (I think) with sorting the directories - keep the years together.

BTW: The orig.1984 tree would not be a part of the repo (instead there would be an orig.1984.txz file that would be part of the repo). See UPDATE 0b below.

Okay.

Instead of trying to keep a mix of original and modified files in the same winning directories, we could simply have a parallel tree containing the originals.

That sounds like it might be easier to do and also less error prone at least once a system is in place.

Now there are a few minor decisions that need to be made if we go down this path, but those are refinements of the idea, not complications nor problems with it.

Okay.

UPDATE 0b

We could just simply declare the existing tarballs for each year as containing the original files. So instead of having an orig.1984 tree, we would simply provide an orig.1984.txz file that would untar into an orig.1984 directory. If we did this, we wouldn't need to worry about maintaining navigational links into such directories. The historical record would be preserved, just not in a super convenient web friendly form.

Same thought with the name.

I think a web friendly way would be better if we're going to go through all this effort. That's my view. It would make it nicer for the viewer and isn't that the purpose of this repo in the first place - to make it work better and to try and get more entries that no longer compile to compile so people can enjoy them?

The orig.YYYY.txz file would reside as a file under the YYYY directory. It would untar as orig.YYYY. We could put orig.YYYY under a .gitignore file so that we could easily have the tree untarred for reference purposes for ourselves and not impact the web site nor GitHub repo.

The orig.YYYY.txz file would be part of the GitHub repo and be linked to in the years.html near the top of the given year. But then we would remove the other "orig" file links from years.html.

The existing "orig" files could be deleted, for the most part, since there would be a copy somewhere in the orig.YYYY.txz file.

I think removing the orig files that already exist would be a mistake because it's nice to look at an entry and see side by side the changes (well kind of - I mean in the case of most it might not be easy to see the changes but one can at least have the different versions side by side).

One minor point: for a few entries the so-called ALT build isn't an original source, it is a 2nd copy provided by the author. For example, some winners provided an unobfuscated version of their code and some winners provided an obfuscated copy with additional functionality that was over the size limit (so they provided it as an auxiliary file). In these cases the "orig" file isn't an original but rather an alternative copy.

Yes and as noted I did that with Snake - for same reason ... size limit. Added some additional features.

What do you think of that, @xexyl ?

See above thoughts. I think that it should be web friendly for one thing and the rest I think is important too.

UPDATE 1

If we did as described in UPDATE 0b above, we would no longer need to maintain original copies in the winning directories, and we would no longer need to try to have Makefile rules that try to compile (and often fail) such original code.

The latter part of the last sentence does sound good though. But do we have to bother with rules that might fail? I mean to say do we need a Makefile for original directories? It might just be for historical purposes. Even so I think side by side is better, personally.

Now there may be a few minor decisions we have to make if we go along this route.

What do you think?

See above again.

UPDATE 2

This proposal has implications for issue #1 and issue #2. To some extent the JSON manifests mentioned in issue #4 are also touched.

This is because these four issues are recently interconnected.

So we need to make some early decisions about this now as this will likely impact the other issues going forward.

Hopefully my thoughts above are of help but I'll certainly clarify more if necessary.

As for replying to the rest of the updated issue (top comment) I guess I don't need to. I'll hold off for now and see how it goes. I'll address the other issues after a very short break.

Did I address everything here ?

lcn2 commented 1 year ago

Do you mean that it is built that way or rather that another tool would be a wrapper to it ? Just curious on a technical level as I never looked at the source of rdiscount.

We will always have a wrapper for tools as this allows one to handle special cases, or to exclude special cases, special handling, or at a minimum to default to certain paths and stuff.

lcn2 commented 1 year ago

Right. In other words the only html file that should be removed is that which corresponds to the hint/markdown file, right?

Correct.

See 77fb0f1, 851e191 and 5850cf9 please.

I just noticed that there are some hint files that were not named 'hint.text' but rather by the author. I will have to think of a way to find the remaining files.

In the meantime the ${ECHO} has been updated to be just echo in commit 046692c. I did NOT do it in hint files which is important as it would break things. For instance the script in 1998/dlowe/dlowe.hint.

Now that does bring up a point. Can you think of a way to test all Makefiles ? Problem is that not all entries will compile so it might be difficult. Perhaps using sed there is no need to do this - yet - but it might be something we need to do later and probably for each major change.

The top level Makefile calls all year based Makefiles which in turn call Makefiles in each individual winning directory. Every Makefile rule should be available at every Makefile directory level. Thus it is trivial to make stuff as much or as little as you wish.

If this is not obvious, we invite you to tour the Makefiles and see. And yes, there may be mistakes .. that is what this issue #3 is designed to address.

OK: There are special rules for special entries that do special stuff. We don't refer to those rules, but rather to the main set. That is, every Makefile rule at the very top Makefile MUST be available at every YYYY based Makefile AND MUST be available at every winning entry directory Makefile.

xexyl commented 1 year ago

Right. In other words the only html file that should be removed is that which corresponds to the hint/markdown file, right?

Correct.

See 77fb0f1, 851e191 and 5850cf9 please.

I just noticed that there are some hint files that were not named 'hint.text' but rather by the author. I will have to think of a way to find the remaining files. In the meantime the ${ECHO} has been updated to be just echo in commit 046692c. I did NOT do it in hint files which is important as it would break things. For instance the script in 1998/dlowe/dlowe.hint. Now that does bring up a point. Can you think of a way to test all Makefiles ? Problem is that not all entries will compile so it might be difficult. Perhaps using sed there is no need to do this - yet - but it might be something we need to do later and probably for each major change.

The top level Makefile calls all year based Makefiles which in turn call Makefiles in each individual winning directory. Every Makefile rule should be available at every Makefile directory level. Thus it is trivial to make stuff as much or as little as you wish.

If this is not obvious, we invite you to tour the Makefiles and see. And yes, there may be mistakes .. that is what this issue #3 is designed to address.

OK: There are special rules for special entries that do special stuff. We don't refer to those rules, but rather to the main set. That is, every Makefile rule at the very top Makefile MUST be available at every YYYY based Makefile AND MUST be available at every winning entry directory Makefile.

I have indeed looked at that Makefile and I have also run make from the root directory.

I will be going to sleep soon but I will reply to comments tomorrow including this one properly if needed.

Hope you have a great night!

lcn2 commented 1 year ago

Right. In other words the only html file that should be removed is that which corresponds to the hint/markdown file, right?

Correct.

See 77fb0f1, 851e191 and 5850cf9 please.

I just noticed that there are some hint files that were not named 'hint.text' but rather by the author. I will have to think of a way to find the remaining files.

Commit 17ea7fb takes care of this except for one file:

$ find . -name '*.hint' -execdir git mv '{}' README.md \;
fatal: destination exists, source=1992/albert/albert.hint, destination=1992/albert/README.md 

and unfortunately the files differ quite a bit so I'm not sure what to do.
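
A collision like the one above can be spotted before any git mv is attempted. The sketch below is only an illustration: the function name check_hints is hypothetical, and it merely reports which *.hint files already have a README.md next to them, leaving the decision about what to do with each conflict to a person.

```sh
#!/bin/sh
# check_hints is a hypothetical helper name.
# For each *.hint file under the given directory, report whether a README.md
# already exists in the same directory. Nothing is renamed or modified.
check_hints() {
    find "$1" -type f -name '*.hint' | while IFS= read -r hint; do
        dir=$(dirname "$hint")
        if [ -e "$dir/README.md" ]; then
            printf 'conflict: %s\n' "$hint"
        else
            printf 'ok: %s\n' "$hint"
        fi
    done
}
```

Entries reported as "ok" could then be renamed with git mv, while each "conflict" would need its two files compared by hand.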

UPDATE 0

The README.md was only an email from Leo to you, before he was a judge (presumably :-) ), about a bug. This meant that the index file did not have information about the entry. Thus I renamed the README.md so that the hint file could become README.md. The old contents were renamed to albert.bug.md but a better name might be contrived. I leave that to you though I'm happy to provide thoughts on it.

Well you need to fix such things. Perhaps the original README.md needs to be renamed to something else.

lcn2 commented 1 year ago

Removed them - at least the obvious ones. The extra html files were of course not removed. A question is which html files besides the hint.html can be removed though? I just had the thought that just because there's an author.html file does not mean that it is supposed to be the html of the markdown.

What do you think should be done if any of these exist?

If the contents are important (and given your general example, it sounds like they are) but the form (for example HTML vs. markdown) or filename (such as an existing README.md file that isn't the file we want) is wrong, then the file needs to be transformed and renamed as needed.

lcn2 commented 1 year ago

Well I believe those already are meant to be markdown so is there anything that needs to be done there? After all define markdown: it can be just text too. Also what formatting? I guess one could look at others but which are correct?

Well the file should be renamed into a markdown file (such as README.md), then the contents have to be considered from the point of view of consistency of style and format across all entries (such as spellchecking if needed, formatting, consistent layout, names of sections, heading levels, etc.).

lcn2 commented 1 year ago

I remembered I had other comments to reply to first so I'll do that before I reply to the rest of the updated issue.

Oh and as far as alt versions: I think my Snake entry has an alt version. It certainly has a number of versions but I think one of them is alt. Anyway I will reply properly tomorrow. I hope you have a great night!

Yes, so you understand that not every "orig" can be moved/removed. Someone has to go thru each winning entry and make a decision.

That makes sense yes. I think getting rid of such files would be risky anyway. Don't you agree?

No. Judgement calls have to be made. The "orig" files that were created to preserve the original value will instead be found within a compressed tarball archive. The entry documentation will make it clear what is going on. For example the hint file might indicate that the alternate file was provided as a version that is too large for the rules but contains extra features, or is the un-obfuscated version, etc.

There won't be a mass fix where some shell command will do anything you want. On the other hand, as you are fixing issue #2 typos, you can read the documentation and determine if the "orig" files are needed.

When in doubt, you can always ask for a suggestion. Worst case, the file will still exist in the archive that will be a compressed tarball.

BTW: We plan to create a top level directory called archive/ under which files with names such as archive/archive-YYYY.tgz will exist. Those tarballs, when uncompressed and extracted, will create archive/YYYY/ trees. The archive/YYYY/ trees will NOT be part of the repo, only the archive/archive-YYYY.tgz files. The archive/YYYY/ trees will be excluded in the .gitignore file so that YOU can extract the contents of the archive/archive-YYYY.tgz files .. for purposes of comparison if needed.

UPDATE 0

We will have to address the many questions later. But in short, expect more answers of the form:

UPDATE 1

Expect to find patterns of the types of changes that will be required. This is because the former judge was prone to starting to make a change, picking some starting spot, going thru a few years and then never finishing. Only to start a different style of change at some other spot, going thru a few years and stopping again.

Expect an inconsistent mess.

lcn2 commented 1 year ago

Or maybe 1984.orig to match the style of prog.orig.c that exists in some winning entries ? Also would help (I think) with sorting the directories - keep the years together.

We have abandoned this idea in favor of having a single top level archive/ directory containing only archive-YYYY.tgz files. These will uncompress into archive/YYYY/ trees, however the archive/YYYY/ trees will NOT be part of the repo, nor visible on the web site. The archive/YYYY/ trees will be excluded from the repo via the .gitignore file.
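
To make the intended layout concrete, here is a small sketch of how one year might be extracted for local comparison. It assumes each archive-YYYY.tgz holds a single top level YYYY/ directory and that the command is run from the top of the repo. The function name extract_year and the .gitignore pattern in the comment are illustrative guesses, not the repo's actual contents.

```sh
#!/bin/sh
# extract_year is a hypothetical helper name.
# Assumption: archive/archive-YYYY.tgz contains a top level YYYY/ directory,
# so extracting inside archive/ yields archive/YYYY/.
extract_year() {
    year="$1"
    tar -xzf "archive/archive-$year.tgz" -C archive
}

# A .gitignore pattern along these lines would keep the extracted trees out
# of the repo (the exact pattern the repo uses is an assumption):
#   archive/[0-9][0-9][0-9][0-9]/
```

After extract_year 1984, a command such as diff -r archive/1984 1984 could show how the current entries differ from the archived originals.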

lcn2 commented 1 year ago

I think removing the orig files that already exist would be a mistake because it's nice to look at an entry and see side by side the changes (well kind of - I mean in the case of most it might not be easy to see the changes but one can at least have the different versions side by side).

We disagree. The original files are a "hodge-podge" mess. The vast majority need to go away. Nevertheless if a mistake is discovered (say it turns out that some prog.orig.c file is actually an alternate copy of the entry) this can be corrected because we still have the compressed archives to recover from.

lcn2 commented 1 year ago

What files? The renamed / moved / new files? And in what way updated? I guess it depends on the type of file so maybe you have an example so I can figure out what you are getting at?

It will depend on the situation. And the situations will vary depending on the entry and how many times that former judge did something semi-inconsistent. :-(

lcn2 commented 1 year ago

How would this change the updating / renaming / etc. the html (and other) files ?

It depends on the approach. If we build the .winner.json files for each entry soon, then one would simply adjust that file when files are added / renamed / removed. If we have a tool to build such a manifest, then it might be as simple as running the tool on a modified entry (and rebuild the .winner.json file for that entry).
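A rough sketch of what such a manifest tool could look like. The real .winner.json format has not been decided in this thread, so the single "manifest" array below is purely an assumption, as is the function name. It also assumes filenames contain no quotes, backslashes or newlines:

```shell
#!/bin/sh
# Sketch only: the JSON layout is hypothetical, not the real .winner.json format.

# Print a JSON object listing every regular file under an entry directory.
# The .winner.json file itself is skipped so the output can be redirected there.
make_manifest() (
    cd "$1" || exit 1
    printf '{\n  "manifest": [\n'
    find . -type f ! -name '.winner.json' | sed 's|^\./||' | LC_ALL=C sort |
        sed -e 's/.*/    "&"/' -e '$!s/$/,/'
    printf '  ]\n}\n'
)
```

After files are added, renamed or removed in an entry, rerunning something like `make_manifest 1984/anonymous > 1984/anonymous/.winner.json` would rebuild the list rather than requiring a hand edit.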

We are having problems finding the time to do the work to generate the archive files and the JSON files (answering questions seems to be a higher priority), but we hope to make progress soon!

lcn2 commented 1 year ago

Kind of like how I did in commit d844165 ?

Yes 👍✨

lcn2 commented 1 year ago

but a JSON file that is much like the manifest member in the .info.json files ?

Yes

xexyl commented 1 year ago

Right. In other words the only html file that should be removed is that which corresponds to the hint/markdown file, right?

Correct.

See 77fb0f1, 851e191 and 5850cf9 please.

I just noticed that there are some hint files that were not named 'hint.text' but rather by the author. I will have to think of a way to find the remaining files.

Commit 17ea7fb takes care of this except for one file:

$ find . -name '*.hint' -execdir git mv '{}' README.md \;
fatal: destination exists, source=1992/albert/albert.hint, destination=1992/albert/README.md 

and unfortunately the files differ quite a bit so I'm not sure what to do.
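A small sketch for spotting this kind of collision before running the rename. The function name is made up for illustration; it only reports, it does not move anything:

```shell
#!/bin/sh
# Print each *.hint file whose directory already holds a README.md,
# i.e. the cases where `git mv <hint> README.md` would refuse to run.
hint_conflicts() {
    find "${1:-.}" -type f -name '*.hint' | LC_ALL=C sort |
    while IFS= read -r hint; do
        if [ -e "$(dirname "$hint")/README.md" ]; then
            printf '%s\n' "$hint"
        fi
    done
}
```

Anything it prints needs a manual decision, as with 1992/albert here.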

UPDATE 0

The README.md was only an email from Leo to you, before he was a judge (presumably :-) ), about a bug. This meant that the index file did not have information about the entry. Thus I renamed the README.md so that the hint file could become README.md. The old contents were renamed to albert.bug.md but a better name might be contrived. I leave that to you though I'm happy to provide thoughts on it.

Well you need to fix such things. Perhaps the original README.md needs to be renamed to something else.

I renamed the email only to something like README.bug.md and the original README.md (with details on the entry) is now README.md.

I think that's the better way to go about it because when someone looks at the entry they don't care probably about the email of a bug but rather want to see the entry.

It might be in some cases that such comments could be added under the judges' comments and above the author's comments but I don't know really.

xexyl commented 1 year ago

Removed them - at least the obvious ones. The extra html files were of course not removed. A question is which html files besides the hint.html can be removed though? I just had the thought that just because there's an author.html file does not mean that it is supposed to be the html of the markdown. What do you think should be done if any of these exist?

If the contents are important (and given your general example, it sounds like they are) but the form (for example HTML via markdown) or filename (such as an existing README.md file that isn't the file we want) is wrong, then the file needs to be transformed and renamed as needed.

Well in the ferguson1 and ferguson2 entries in 2020 for example I renamed the .text extension to .md. But I kept the html files because I'm not sure if these markdown files will generate html files automatically and it's important that they're generated.

But in this case I believe I was referring to the situation where some entries had a file in the form of: author.text (or whatever extension). This might be changed to README.md and it was done (well actually I probably missed some but I don't know for sure). But what if there is also a file that is supposed to remain the same (name) by author.text or author.html or whatever ? More importantly how can we confirm it with 100% certainty ?

xexyl commented 1 year ago

Well I believe those already are meant to be markdown so is there anything that needs to be done there? After all define markdown: it can be just text too. Also what formatting? I guess one could look at others but which are correct?

Well the file should be renamed into a markdown file (such as README.md), then the contents have to be considered from the point of view of consistency of style and format across all entries (such as spellchecking if needed, formatting, consistent layout, names of sections, heading levels, etc.).

Though it might be difficult to do in some cases again because what if the typo was a joke / pun / whatever that I'm unaware of? Point is some are obvious but not all will be: what to do about those unobvious ones ? I think better to not touch them but that's me.

As far as consistency across all entries of style and format: I guess you mean the style of the markdown and not the wording etc. ? That's what it seems like but I want to be sure because I would think the latter is not something that should be done for quite a few reasons. But markdown yes. I could gather a list of each README.md file in a text file and then go down the list of files and edit each one (or view in many cases I suppose) so that I know the general format of the markdown.

Of course a way to test the generated output would be good which is another reason I think if we could have a test subdomain or subdirectory for this repo. Of course it would be possible to look at the absolute URL on GitHub but that won't show the IOCCC specific data on the page which might be useful to have.

xexyl commented 1 year ago

I remembered I had other comments to reply to first so I'll do that before I reply to the rest of the updated issue.

Oh and as far as alt versions: I think my Snake entry has an alt version. It certainly has a number of versions but I think one of them is alt. Anyway I will reply properly tomorrow. I hope you have a great night!

Yes, so you understand that not every "orig" can be moved/removed. Someone has to go thru each winning entry and make a decision.

That makes sense yes. I think getting rid of such files would be risky anyway. Don't you agree?

No. Judgement calls have to be made. The "orig" files that were created to preserve the original value will instead be found within a compressed tarball archive. The entry documentation will make it clear what is going on. For example the hint file might indicate that the alternate file was provided as a version that is too large for the rules but contains extra features, or is the un-obfuscated version, etc.

And what if the author or judges did not provide such information ? How to make it consistent wording ? Well if you do it that would be consistent but not if we do it in author's comments. But then we have to make sure that they did do it. Another problem is that if they include it later on and people don't want to read that far (I'm a great example of this one).

As far as .orig files: that's one thing but I hope you don't put the alt versions in that archive.

Still even if the entry documentation will have the details about the orig it can be nice to have them for side-by-side viewing. I think that's something to consider. I certainly have done that anyway. Now granted I could always untar the tarballs and probably would but having them in the same directory is nice.

There won't be a mass fix where some shell command will do anything you want. On the other hand, as you are fixing issue #2 typos, you can read the documentation and determine if the "orig" files are needed.

Hmm ... yes though on the latter part that isn't always clear. Plus define 'need'. How to go about it too? For example let's say I decide that a .orig.c file is not needed. What do I do?

When in doubt, you can always ask for a suggestion. Worst case, the file will still exist in the archive that will be a compressed tarball.

Okay.

BTW: We plan to create a top level directory called archive/ under which files with names such as archive/archive-YYYY.tgz will exist. Those tarballs, when uncompressed and extracted, will create archive/YYYY/ trees. The c will NOT be part of the archive, only the archive/archive-YYYY.tgz files. The archive/YYYY/ trees will be excluded in the .gitignore file so that YOU can extract the contents of the archive/archive-YYYY.tgz files .. for purposes of comparison if needed.

The c? Do you mean the C code? I guess not (since it is after all a C contest) but I'm not sure what you mean.

I still am not sure about this though: I like to have them in the same directory as they keep things together. If archive tarballs have the orig files and they do not extract to the same directory that means the user will have to navigate away. Worse is that for the website it'll be harder to do that.

UPDATE 0

We will have to address the many questions later. But in short, expect more answers of the form:

Thanks. In some cases I might hold off until you have made these decisions.

UPDATE 1

Expect to find patterns of the types of changes that will be required. This is because the former judge was prone to starting to make a change, picking some starting spot, going thru a few years and then never finishing. Only to start a different style of change at some other spot, going thru a few years and stopping again.

That sounds nightmarish.

Expect an inconsistent mess.

Is that any different from now? :-) But it's okay. It's expected in large projects. Not always but often. Besides it should be inconsistent and arguably messy - it's not the International Un-obfuscated C Code Contest but the International Obfuscated C Code Contest! :-)

xexyl commented 1 year ago

Or maybe 1984.orig to match the style of prog.orig.c that exists in some winning entries ? Also would help (I think) with sorting the directories - keep the years together.

We have abandoned this idea in favor of having a single top level archive/ directory containing only archive-YYYY.tgz files. These will be uncompressed into archive/YYYY/ trees, however the archive/YYYY/ trees will NOT be part of the repo, nor visible on the web site. The archive/YYYY/ trees will be excluded from the repo via the .gitignore file.

Well I think it's a mistake to not have it as part of the website for the reasons I noted above. It puts a burden on the viewer who wants the markup version. Now they have to download the repo and then extract the tarball and then navigate to the directory and only then find the original files.

That's my view anyway as someone who has looked at the orig files including on the website.

xexyl commented 1 year ago

I think removing the orig files that already exist would be a mistake because it's nice to look at an entry and see side by side the changes (well kind of - I mean in the case of most it might not be easy to see the changes but one can at least have the different versions side by side).

We disagree. The original files are a "hodge-podge" mess. The vast majority need to go away. Nevertheless if a mistake is discovered (say it turns out that some prog.orig.c file is actually an alternate copy of the entry) this can be corrected because we still have the compressed archives to recover from.

Hmm .. I don't see it as a hotchpotch at all but rather a useful thing to see. Let's say that the judges make a fix and then change the entry name to be prog.orig.c. It's nice to see what the winner submitted and to easily compare what the judges did to fix it.

Now if you are insistent on removing the orig files that's one thing but personally I do look at both and having to extract a tarball just to see it (and not in the same directory) moves it further away from the context entry. I also as I said have viewed orig files on the website and you don't want that now.

*shrug*

xexyl commented 1 year ago

What files? The renamed / moved / new files? And in what way updated? I guess it depends on the type of file so maybe you have an example so I can figure out what you are getting at?

It will depend on the situation. And the situations will vary depending on the entry and how many times that former judge did something semi-inconsistent. :-(

I no longer know the context to this one. But I'll have to look back anyway and maybe it's not decided finally either.

xexyl commented 1 year ago

How would this change the updating / renaming / etc. the html (and other) files ?

It depends on the approach. If we build the .winner.json files for each entry soon, then one would simply adjust that file when files are added / renamed / removed. If we have a tool to build such a manifest, then it might be as simple as running the tool on a modified entry (and rebuild the .winner.json file for that entry).

We are having problems finding the time to do the work to generate the archive files and the JSON files (answering questions seems to be a higher priority), but we hope to make progress soon!

I hope so because I think this would be the ideal way to go about it. It would mean we wouldn't have to do extra work and (worse) possibly go back and make changes after the other issue was fixed.

lcn2 commented 1 year ago

Now if you are insistent on removing the orig files that's one thing but personally I do look at both and having to extract a tarball just to see it (and not in the same directory) moves it further away from the context entry. I also as I said have viewed orig files on the website and you don't want that now.

shrug

We don't think, for example, that you would want the original source for your modified winning code and remarks easily available. You convinced us to go the route of archival compressed tarballs ... and the git history of the repo.

Going forward, modifications by a new winner will be made via pull requests. The web site will reflect the top of the master branch. For new IOCCC winners, the act of winning will come in the form of a pull request for a new YYYY year directory. No more waiting for a former judge to procrastinate: no more long delays between announcing the winners and the source being available. New winners will go live on the web site, via a pull request, immediately. Edits to code (say bug fixes and improvements) and edits to README.md files will be made via pull requests: all reflected on the web site after a minute or two while GitHub renders the web pages.

Older code and file versions will be found in the git logs. But the easy to see web site and entry packages will be the top of the master tree.

xexyl commented 1 year ago

Now if you are insistent on removing the orig files that's one thing but personally I do look at both and having to extract a tarball just to see it (and not in the same directory) moves it further away from the context entry. I also as I said have viewed orig files on the website and you don't want that now. shrug

We don't think, for example, that you would want the original source for your modified winning code and remarks easily available. You convinced us to go the route of archival compressed tarballs ... and the git history of the repo.

For the remarks that's true. I'm not sure about the code though. At least - and I wonder something actually.

Are we talking about the prog.orig.c that already exists in some entries? Or are we talking about something else?

Going forward, modifications by a new winner will be made via pull requests. The web site will reflect the top of the master branch. For new IOCCC winners, the act of winning will come in the form of a pull request for a new YYYY year directory. No more waiting for a former judge to procrastinate: no more long delays between announcing the winners and the source being available. New winners will go live on the web site, via a pull request, immediately. Edits to code (say bug fixes and improvements) and edits to README.md files will be made via pull requests: all reflected on the web site after a minute or two while GitHub renders the web pages.

Yeah that was a bit frustrating indeed. But as for prog.orig.c: traditionally it was for if you made a minor change or the author made a minor change. Perhaps there's this: if there already are these then they can be kept. But since git will be involved in the future contests (well you know what I mean - we also used git with a winner repo in the past) they wouldn't have to do it?

Just a thought.

Older code and file versions will be found in the git logs. But the easy to see web site and entry packages will be the top of the master tree.

That's true.

lcn2 commented 1 year ago

And what if the author or judges did not provide such information ? How to make it consistent wording ? Well if you do it that would be consistent but not if we do it in author's comments. But then we have to make sure that they did do it. Another problem is that if they include it later on and people don't want to read that far (I'm a great example of this one).

In the rare case that the "ALT" file (renamed as orig) is needed, the mistake can be easily fixed by pulling in the file from the archive. In order to do issue #2, you will have to read the remarks / hint / now README.md files. Such a rare thing will be obvious from the entry's documentation.

xexyl commented 1 year ago

And what if the author or judges did not provide such information ? How to make it consistent wording ? Well if you do it that would be consistent but not if we do it in author's comments. But then we have to make sure that they did do it. Another problem is that if they include it later on and people don't want to read that far (I'm a great example of this one).

In the rare case that the "ALT" file (renamed as orig) is needed, the mistake can be easily fixed by pulling in the file from the archive. In order to do issue #2, you will have to read the remarks / hint / now README.md files. Such a rare thing will be obvious from the entry's documentation.

I'm not sure how this is related to alt file though. I think what I was getting at is if the information is later on in the file (or another file .. I'm probably guilty of the latter and I'm definitely guilty of the former) it might be difficult to determine. Well that part could probably be done with grep. But not so for the source code: what if some entry needs the file? We can't guarantee that we can easily find the filename just by looking at the code.

lcn2 commented 1 year ago

As far as .orig files: that's one thing but I hope you don't put the alt versions in that archive.

Still even if the entry documentation will have the details about the orig it can be nice to have them for side-by-side viewing. I think that's something to consider. I certainly have done that anyway. Now granted I could always untar the tarballs and probably would but having them in the same directory is nice.

It will be effectively the compressed tarballs that are on the main web site now.

We also believe that 99.999% or more of folk won't care about the original unmodified code. The compressed tarball is only there for those few "archivists" who are curious. Besides the convenience of the archive/archive-YYYY.txz files, such rare folks can always use the git logs or even the so-called "way back" Internet archive.

We don't suggest you stress too much about removing "orig" files from the web site. Should the rare case arise where a current "orig" file is needed because the documentation says it was an alternate source, the mistake can easily be fixed.

Let's get the web site working for 99.999% of the folks who want to see the current best view of the code that won.

lcn2 commented 1 year ago

Though it might be difficult to do in some cases again because what if the typo was a joke / pun / whatever that I'm unaware of? Point is some are obvious but not all will be: what to do about those unobvious ones ? I think better to not touch them but that's me.

Just try your best. When in doubt feel free to ask. In most all cases it will be obvious given the context.

xexyl commented 1 year ago

As far as .orig files: that's one thing but I hope you don't put the alt versions in that archive. Still even if the entry documentation will have the details about the orig it can be nice to have them for side-by-side viewing. I think that's something to consider. I certainly have done that anyway. Now granted I could always untar the tarballs and probably would but having them in the same directory is nice.

It will be effectively the compressed tarballs that are on the main web site now.

Okay though for now it's still not clear to me exactly what you have in mind.

We also believe that 99.999% or more of folk won't care about the original unmodified code. The compressed tarball is only there for those few "archivists" who are curious. Besides the convenience of the archive/archive-YYYY.txz files, such rare folks can always use the git logs or even the so-called "way back" Internet archive.

Okay you have me there. You're probably right ... most people probably won't care. I'm just a - I don't want to call it a purist but details and history are all important to me in a project.

We don't suggest you stress too much about removing "orig" files from the web site. Should the rare case arise where a current "orig" file is needed because the documentation says it was an alternate source, the mistake can easily be fixed.

It's true it can be easily fixed with git.

Let's get the web site working for 99.999% of the folks who want to see the current best view of the code that won.

Sure.

lcn2 commented 1 year ago

But in this case I believe I was referring to the situation where some entries had a file in the form of: author.text (or whatever extension). This might be changed to README.md and it was done (well actually I probably missed some but I don't know for sure). But what if there is also a file that is supposed to remain the same (name) by author.text or author.html or whatever ? More importantly how can we confirm it with 100% certainty ?

Nothing this size will ever be bug free. Still with issue #1, issue #2, issue #3 the site will be a lot better. Any additional improvements can be easily done via pull requests by anyone after the main web site is updated.

xexyl commented 1 year ago

But in this case I believe I was referring to the situation where some entries had a file in the form of: author.text (or whatever extension). This might be changed to README.md and it was done (well actually I probably missed some but I don't know for sure). But what if there is also a file that is supposed to remain the same (name) by author.text or author.html or whatever ? More importantly how can we confirm it with 100% certainty ?

Nothing this size will ever be bug free. Still with issue #1, issue #2, issue #3 the site will be a lot better. Any additional improvements can be easily done via pull requests by anyone after the main web site is updated.

True. Okay.

xexyl commented 1 year ago

Though it might be difficult to do in some cases again because what if the typo was a joke / pun / whatever that I'm unaware of? Point is some are obvious but not all will be: what to do about those unobvious ones ? I think better to not touch them but that's me.

Just try your best. When in doubt feel free to ask. In most all cases it will be obvious given the context.

My guess is also that it would usually be fairly obvious.

lcn2 commented 1 year ago

But in this case I believe I was referring to the situation where some entries had a file in the form of: author.text (or whatever extension). This might be changed to README.md and it was done (well actually I probably missed some but I don't know for sure). But what if there is also a file that is supposed to remain the same (name) by author.text or author.html or whatever ? More importantly how can we confirm it with 100% certainty ?

Nothing this size will ever be bug free. Still with issue #1, issue #2, issue #3 the site will be a lot better. Any additional improvements can be easily done via pull requests by anyone after the main web site is updated.

True. Okay.

We know you can / will / are doing a MUCH better job than the former judge in editing the web site. And for that we are extremely appreciative.

We will have to be sure that the top level README.md credits you for all of your hard work.

xexyl commented 1 year ago

But in this case I believe I was referring to the situation where some entries had a file in the form of: author.text (or whatever extension). This might be changed to README.md and it was done (well actually I probably missed some but I don't know for sure). But what if there is also a file that is supposed to remain the same (name) by author.text or author.html or whatever ? More importantly how can we confirm it with 100% certainty ?

Nothing this size will ever be bug free. Still with issue #1, issue #2, issue #3 the site will be a lot better. Any additional improvements can be easily done via pull requests by anyone after the main web site is updated.

True. Okay.

We know you can / will / are doing a MUCH better job than the former judge in editing the web site. And for that we are extremely appreciative.

I appreciate the kind words!

We will have to be sure that the top level README.md credits you for all of your hard work.

Thank you! That's really special!

lcn2 commented 1 year ago

We will have to be sure that the top level README.md credits you for all of your hard work.

Thank you! That's really special!

BTW: If you do end up using / checking on Yusuke Endoh's work, we need to also credit him for Makefile/source code bug fixes.

xexyl commented 1 year ago

We will have to be sure that the top level README.md credits you for all of your hard work.

Thank you! That's really special!

BTW: If you do end up using / checking on Yusuke Endoh's work, we need to also credit him for Makefile/source code bug fixes.

I believe I stated that initially because it most certainly is needed yes.

xexyl commented 1 year ago

Okay so I just did a fix in 2020/carlini. Well some fixes.

But I also encountered something that needs to be decided.

It's not just README.md files: there are also some README files without any extension. At first glance it would make sense to just add the extension but there are some cases where this is not possible. In particular 2018/burton2/README. This is because that entry already has a README.md file. And indeed any README file under a winning directory cannot be renamed to README.md.

So what to do ?

I'm going to do something else now I think.

Hope you'll be feeling better soon!

xexyl commented 1 year ago

Oh one more thing. There are a lot of references to EMail instead of Email or just email. Although the first one was at a time common (iirc) it's not now. We could change it to be either email or e-mail (or if starting a sentence capitalise it).

Which do you prefer? Care would have to be taken in some cases because it's also in some code (though not entries) but that should not be a problem.
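A quick sketch for finding the occurrences while leaving code alone. Restricting the search to .md and .html files is an assumption about where the prose lives, and the function name is made up for illustration:

```shell
#!/bin/sh
# List lines containing "EMail" in Markdown and HTML files only, so that
# occurrences inside source code are left for manual review.
# /dev/null is passed so grep always prefixes matches with the filename.
list_email_spellings() {
    find "${1:-.}" -type f \( -name '*.md' -o -name '*.html' \) \
        -exec grep -n 'EMail' /dev/null {} +
}
```

The output is file:line:text, which makes it easy to decide case by case whether a match starts a sentence.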

xexyl commented 1 year ago
Judges' comments:

    To build:
        make adrian

    Try:
        adrian adrian.grep.try < adrian.hint

    For the slow minded, try:
        adsleep 32767

    Once you get past the obfuscation, you have an opportunity to learn
    about regular expressions and state machines.

    NOTE: Some compilers have had trouble optimizing this entry.

Now this is an unfortunate thing. It will take some time to figure out which files to update. Some refer to prog, some refer to the name of the entry (like my 2018/weasel entry) and others refer to the author's name. What should it be? In some cases it might be nice to have the entry name but I think that the best way to maintain that would be to make it a symlink if it exists (for example I didn't do that in the weasel entry .. that was your doing possibly because I added the rpm spec file for fun though maybe not) but make it prog as the main binary. In fact that might be what you did anyway.

That's less of an issue though. Another issue is that if one gets this entry to compile the usage will not work generally because . is not in the path. In recent years you added ./ to the invocations but not all years so this also has to be updated.

In order to do that though I need to know what you prefer for style here. There's an argument for making it the winner's name but there's a stronger argument for making it prog because then it's the same for every entry - consistent so that the viewers don't have to guess. In the case where an entry relies on the executable being a specific name it can be a symlink.
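To make the symlink idea concrete, here is a minimal sketch. It assumes the main binary is already built as prog in the entry directory. The function name is hypothetical and nothing here is taken from an actual Makefile:

```shell
#!/bin/sh
# Make <entry_dir>/<name> a symlink to prog, if prog exists and <name> is
# either absent or already a symlink. Sketch only; names are hypothetical.
link_entry_name() (
    dir="$1"
    name="$2"
    cd "$dir" || exit 1
    if [ ! -f prog ]; then
        echo "no prog in $dir" >&2
        exit 1
    fi
    if [ -e "$name" ] && [ ! -L "$name" ]; then
        echo "$dir/$name exists and is not a symlink" >&2
        exit 1
    fi
    ln -sf prog "$name"
)
```

With that in place the documented invocation could be ./prog everywhere, while an entry that depends on its own name can still be run as, say, ./adrian.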

What do you think ?

... and now I really am going to do other things!

UPDATE 0

Also what should I change adsleep to ?

UPDATE 1

I'm also not sure what to do about this fact:

W= fopen(wc>= 2 ? V[1] : "adgrep.c","rt");

I won't touch that code but that file does not exist. Is that an error ? If so what should it be ?

UPDATE 0

Year is 1992.

lcn2 commented 1 year ago

Oh one more thing. There are a lot of references to EMail instead of Email or just email. Although the first one was at a time common (iirc) it's not now. We could change it to be either email or e-mail (or if starting a sentence capitalise it).

Which do you prefer? Care would have to be taken in some cases because it's also in some code (though not entries) but that should not be a problem.

Most people prefer Email or email over EMail. What do you prefer?

xexyl commented 1 year ago

Another example: 1984/anonymous

It compiles but it does nothing. I thought it was supposed to print hello world or some variation. Am I missing something? I've looked at it in more detail before but given the remarks say it would print it, the fact it doesn't is kind of problematic.

Also as far as that entry's Makefile:

ENTRY= anonymous
PROG= ${ENTRY}

Is this a problem? I don't remember seeing ENTRY in more recent Makefiles but maybe they are there. It looks (quick glance) that the last year to do this was 2015.

What should it be ?

xexyl commented 1 year ago

Oh one more thing. There are a lot of references to EMail instead of Email or just email. Although the first one was at a time common (iirc) it's not now. We could change it to be either email or e-mail (or if starting a sentence capitalise it). Which do you prefer? Care would have to be taken in some cases because it's also in some code (though not entries) but that should not be a problem.

Most people use Email instead of EMail.

If at beginning of sentence. Else just email.

Is Email / email okay then ?

lcn2 commented 1 year ago

Please update comment 1424725618 with the entry year.

lcn2 commented 1 year ago

Oh one more thing. There are a lot of references to EMail instead of Email or just email. Although the first one was at a time common (iirc) it's not now. We could change it to be either email or e-mail (or if starting a sentence capitalise it).

Which do you prefer? Care would have to be taken in some cases because it's also in some code (though not entries) but that should not be a problem.

Most people use Email instead of EMail.

If at beginning of sentence. Else just email.

Is Email / email okay then ?

Yes.

xexyl commented 1 year ago

Oh one more thing. There are a lot of references to EMail instead of Email or just email. Although the first one was at a time common (iirc) it's not now. We could change it to be either email or e-mail (or if starting a sentence capitalise it).

Which do you prefer? Care would have to be taken in some cases because it's also in some code (though not entries) but that should not be a problem.

Most people use Email instead of EMail.

If at beginning of sentence. Else just email. Is Email / email okay then ?

Yes.

I'll look at doing that tomorrow then. Just as a reminder Saturday I'll have company and I will probably need a bit of recovery time. Same with Tuesday - though both for my birthday Tuesday of course.

xexyl commented 1 year ago

Please update comment 1424725618 with the entry year.

Done (but it's 1992).

xexyl commented 1 year ago

I just came up with a quick command to get a list of all files listed in the years.html file that we can then verify that they exist:

grep -oE '<A HREF="[12][0-9]{3}.*[^"]"' years.html | sed -e 's/<A HREF=//g' -e 's/"//g'

Then:

grep -oE '<A HREF="[12][0-9]{3}.*[^"]"' years.html |
sed -e 's/<A HREF=//g' -e 's/"//g' |
while IFS= read -r f; do
    # skip links whose basename is README.md or index.html
    if [[ "$(basename "$f")" != "README.md" && "$(basename "$f")" != "index.html" ]]; then
        # report any linked file that does not exist on disk
        if [[ ! -f "$f" ]]; then
            echo "$f is missing"
        fi
    fi
done

tells us all the files that are linked in years.html but are missing. You'll see there are a lot. I have to leave in a moment but I have a question about some of them:

This should be what? index.html ? But then we see that it's already there so maybe it should be deleted from years.html since index.html already is there ?

Remove these lines, right?

As far as the script: I hope it reads okay. I initially typed it on one line at the shell and only split it across several lines because it scrolled too far over on GitHub.

I have to leave now but if you answer these questions I can fix it tomorrow morning.

UPDATE 0

Actually it shouldn't skip README.md I guess. I'll fix that. But can't do anything about the output until tomorrow. I'll do it in the morning.

Hope you feel better soon! Get some rest if you can.

UPDATE 1

Fixed that.

A similar approach might be useful to check that the files listed in the json files exist. Or it might more simply be a matter of:

find . -type f

or so, possibly using the output to create the json files.
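As a rough sketch of that idea, and assuming the paths named in the json files have already been extracted into a plain list with one path per line (the json layout is not shown in this thread, so that extraction step is left out), the two helpers below report listed paths that are missing and files on disk that are not listed. The function names and the list file are hypothetical.

```shell
# Read one path per line on stdin and print each path that is not a
# regular file (paths are taken relative to the current directory).
report_missing() {
    while IFS= read -r f; do
        [ -n "$f" ] || continue          # ignore empty lines
        [ -f "$f" ] || printf '%s is missing\n' "$f"
    done
}

# Print every file under the current directory that does not appear
# in the list file given as $1 (one path per line, no leading "./").
report_unlisted() {
    find . -type f | sed 's|^\./||' | while IFS= read -r f; do
        # -x: match the whole line, -F: fixed string, -q: quiet
        grep -qxF -- "$f" "$1" || printf '%s is not listed\n' "$f"
    done
}
```

Run from the top of the tree, for example `report_missing < /tmp/list.txt` and `report_unlisted /tmp/list.txt`. Keeping the list file outside the tree avoids it being reported as unlisted itself.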

xexyl commented 1 year ago

Oh one more thing. There are a lot of references to EMail instead of Email or just email. Although the first one was at one time common (iirc), it's not now. We could change it to be either email or e-mail (or if starting a sentence capitalise it).

Which do you prefer? Care would have to be taken in some cases because it's also in some code (though not entries) but that should not be a problem.

Most people use Email instead of EMail.

If at beginning of sentence. Else just email. Is Email / email okay then ?

Yes.

Fixed in commit e8a3b502115fda4851d9e6494560d49578980e08.
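For reference, a minimal sketch of how such references can be located and rewritten. This is not necessarily how the commit above was made. The function names are hypothetical, and the sentence-start rule is only a heuristic, so the output still needs a manual review, especially where the string appears in code.

```shell
# List every line containing the spelling "EMail" under a directory
# (default: current directory), skipping anything inside .git.
# /dev/null forces grep to print the file name even for a single file.
find_email_refs() {
    find "${1:-.}" -type f ! -path '*/.git/*' -exec grep -n 'EMail' /dev/null {} +
}

# Filter: read text on stdin and write it to stdout with "EMail" changed
# to "Email" at the start of a line or after sentence-ending punctuation,
# and to "email" everywhere else.
fix_email_case() {
    sed -e 's/^EMail/Email/' \
        -e 's/\([.!?] \{1,\}\)EMail/\1Email/g' \
        -e 's/EMail/email/g'
}
```

Listing first with `find_email_refs .` and then running `fix_email_case` only on the files that are safe to change keeps the "care would have to be taken" cases under manual control.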