ioccc-src / temp-test-ioccc

Temporary test IOCCC web site that will go away
Creative Commons Attribution Share Alike 4.0 International

Enhancement: build JSON files and HTML files for the web site #4

Closed lcn2 closed 2 months ago

lcn2 commented 1 year ago

Scope - About this issue

Every winning directory needs a JSON file that describes the winning YYYY/dir entry.

NOTE: This writeup is subject to changes and revisions as the understanding of this issue evolves / improves. Check back to this note from time to time. When modifications are made to this comment, we will try to add a note indicating that the scope of this issue has been updated.

TODO

Once the manifest is stable, write a tool to generate .inventory.md files to hold links to all files in the winner directory.

NOTE: This issue is now obsolete as the inventory is now at the bottom of the YYYY/dir entry index.html files.

NOTE: See "UPDATE 2a" in comment 1793681758.

See comment 1893176110.

NOTE: Using the existing 2020/ferguson1/index.html as an example, improve the look and feel of rendered HTML.

NOTE: Improving does not mean making it perfect, just fixing a few of the more glaring problems with the look and feel of rendered HTML.

See comment 1894666815.

See issue #1933 .

NOTE: In tmp/manifest.numbers, improve the _inventoryorder, _displayas, _display_viagithub, and _winnerstext fields.

NOTE: Build the other needed files under /tmp after exporting the CSV version (tmp/manifest.csv) of tmp/manifest.numbers

NOTE: Make adjustments to the manifest as per recent file changes.

NOTE: Update all .winner.json files as per the updated manifest.

NOTE: The file is now authors.html and winners.html does a redirect to authors.html.

NOTE: This deals with winner files other than YYYY/dir entry index.html and YYYY/dir entry README.md files.

See Sitemaps on Wikipedia.

NOTE: Add to the top level Makefile, special rules to help drive the process to generate all of the above mentioned HTML files and JSON files for the entire site.

NOTE: This TODO item was moved to issue #2007

Near Final TODO

When all HTML files exist, it is time to change all references and links from "foo.md" to "foo.html".

xexyl commented 1 year ago

You know what's REALLY cool about this one?

We can use OUR json parser jparse to validate the files! That's probably the most useful part of it as a tool and not the library - it can validate JSON!

Now I will reply to the rest of the comment next (perhaps in between replying to other comments as well).

xexyl commented 1 year ago

Oh and please assign the rest of the issues to me as I reply to them (since I know I have to reply first).

xexyl commented 1 year ago

Scope - About this issue

Every winning directory needs a JSON file that describes the winning entry.

It does not need an auth.json file or info.json file though, right ?

NOTE: This writeup is subject to changes and revisions as the understanding of this issue evolves / improves. Check back to this note from time to time. When modifications are made to this comment, we will try to add a note indicating that the scope of this issue has been updated.

On this note: would you please (this goes with the other issues) let me know if you updated an issue so that I know to look at it lest I do a major change (as these ARE major) and find out it's no longer valid? It's quite easy to get so involved that one does not think about this (and I often don't think about it anyway esp as our threads do get long with lots to discuss - which is good and I bet you think that it would be better if development was more like that ... I do anyway).

.award.json per winner directory

This JSON file will be similar, but not identical to the .info.json file produced for/by the mkiocccentry repo.

So they should be a dot file then?

While some of the JSON data will be copied over, there will be some key differences between .award.json and .info.json. For example, JSON members such as "entry_num" and "tarball" won't apply. And there will need to be a JSON member:

Do we need a .info.json too then ?

"award" : "name of the award"

That makes sense yes.

The manifest won't have an "info_JSON" nor an "auth_JSON" file. Instead a:

"winner_json" : ".winner.json"

will be present.

You refer to the MANIFEST file in each year's directory, right ?

There will need to be some very important JSON members that contain information about the name of the entry. In addition to:

"IOCCC_contest" : "IOCCCXX",
"IOCCC_year" : YYYY,

each entry will need:

Something tells me that some of this can be automated ... that would be helpful for sure.

"dirname" : "name_of_winner_directory",
"entry_name" : "YYYY_dirname",

For example, for 1984/mullender we might have:

"IOCCC_contest" : "IOCCC01",
"IOCCC_year" : 1984,
"dirname" : "mullender",
"entry_name" : "1984_mullender",

Okay indeed I can think that at least some of this can be generated.
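As a rough illustration of how the identifying members above could be generated, here is a minimal Python sketch. The function name `winner_members` and the idea of passing the contest number in by hand are assumptions made for this example; they are not part of any existing IOCCC tool. The contest number is a parameter because it cannot be derived from the year alone.

```python
import json


def winner_members(year, dirname, contest_number):
    """Build the identifying members for one winning entry.

    Hypothetical helper: the member names come from the proposal in
    this issue, the function itself is only an illustration.
    """
    return {
        "IOCCC_contest": f"IOCCC{contest_number:02d}",
        "IOCCC_year": year,
        "dirname": dirname,
        "entry_name": f"{year}_{dirname}",
    }


# Example: the 1984/mullender entry used in the text above.
print(json.dumps(winner_members(1984, "mullender", 1), indent=4))
```

The remaining members, such as "award" and "author_list", would still have to come from the scraped data.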

If the author wins more than one entry in a year the dir name would be like usual but what about entry name? Just add the number of the entry (for the winner) to the entry_name ?

BTW: The "entry_name" value will correspond to the HTML names found in years.html and used as entry links, such as the #name in the anchor tag of winners.html.

In other words for 2020 my entries:

which might become:

Right ?

The list of winning authors for an entry will be given as a JSON member whose value is an array of author handles:

"author_list" : [
    "some_author_handle_1",
    "some_author_handle_2",
    "some_author_handle_3"
]

The "some_author_handle_X" JSON strings correspond to the "author_handle" fields found in the array of authors in the corresponding .auth.json file of the original submission.

I might just be too tired to process this now. What array do you refer to? Which .auth.json? I thought that just lists the winner per entry? Or am I mixing it up with the new way with IOCCC28 and beyond ?

Each winning directory will have one .award.json file.

NOTE: Later on, the .award.json files will be used by the web site builder to build information and links within the web site.

Sounds good.

We recommend that in comments below, a proposed .award.json template be drafted first, so that the overall format may be discussed and considered, before a set of actual .award.json files is created.

Good idea to come up with a template.

author JSON file for each person who authored an IOCCC winner

Each person who wrote at least one winning IOCCC entry will have an author JSON file. The name of this file will be their "author_handle" (from some corresponding .auth.json file as submitted) followed by .json.

For example, assume that "Cody Boone Ferguson" has an "author_handle" of "ferguson". Then the author JSON file for "Cody Boone Ferguson" would be "ferguson.json".

Now this is for the contest as a whole (more than one year) or just each year? Is it an idea to have one for both?

The contents of the author JSON file will include some of the corresponding "authors" array JSON members from some corresponding .auth.json file.

Okay yes I'm afraid I might just be too tired to parse this right now :( I'll try continuing this but I might have to resume the rest of it later.

Each author JSON file will have a list of one or more awards. There will be a JSON member whose value is a JSON array of "entry_name" values. For example,

"awards" : [
    "2018_ferguson",
    "2020_ferguson1",
    "2020_ferguson2"
],

There will be a top level directory called author_list/ that contains all of the author JSON files.

Above the years I gather?
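To make the relationship between the per-entry "author_list" and the per-author "awards" array concrete, here is a small Python sketch that inverts one into the other. The function names, the "author_handle" member inside the author JSON, and the handle "some_author_handle_2" are assumptions for illustration only; the real author JSON format is still to be decided.

```python
from collections import defaultdict


def awards_by_author(winners):
    """Invert per-entry author lists into per-author award lists.

    `winners` is a list of dicts with "entry_name" and "author_list"
    members, as proposed for the per-directory JSON file.
    """
    awards = defaultdict(list)
    for winner in winners:
        for handle in winner["author_list"]:
            awards[handle].append(winner["entry_name"])
    # One dict per author, keyed by handle (layout is an assumption).
    return {
        handle: {"author_handle": handle, "awards": sorted(names)}
        for handle, names in awards.items()
    }


def author_json_path(handle):
    """Path of the author JSON file under the proposed top level directory."""
    return f"author_list/{handle}.json"


winners = [
    {"entry_name": "2020_ferguson2", "author_list": ["ferguson"]},
    {"entry_name": "2018_ferguson", "author_list": ["ferguson"]},
    {"entry_name": "2020_ferguson1", "author_list": ["ferguson", "some_author_handle_2"]},
]
authors = awards_by_author(winners)
print(author_json_path("ferguson"), authors["ferguson"]["awards"])
```

Sorting the awards works here only because each entry name starts with a four-digit year.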

We recommend that in comments below, a proposed "author JSON file" template be drafted first, so that the overall format may be discussed and considered, before a set of actual author JSON files is created.

Good idea with a template here too.

data for the .award.json file and the author JSON files

Data for the .award.json files and the author JSON files will need to be scraped from several sources. One source is the original www.ioccc.org site. Another source is a SQL database that was previously used by a previous IOCCC judge to try and build the original www.ioccc.org site. The SQL database contains, among other things, information about winning entries and authors.

Given a discrepancy between the web site and the SQL database, we recommend assuming that the data in the SQL database is valid.

Here is a zip file of an SQL file containing SQL commands:

IOCCC-data.sql.zip

Here is a zip file of a SQLite database, presumably built from the above mentioned SQL file:

IOCCC-data.sqlite.zip

Any idea how to deal with this information? I actually wonder if there's a sql to json tool out there but if not any recommended tools to look at it under macOS?

xexyl commented 1 year ago

I've tried to use the sql file and get errors ... was able to fix some but I don't know enough of sql to fix them all.

Going afk for a while .. maybe can work on some of this later but until the issue of not being able to fork this repo (instead of winner) I'm not sure how much good it'll do.

xexyl commented 1 year ago

Given a discrepancy between the web site and the SQL database, we recommend assuming that the data in the SQL database is valid. Here is a zip file of an SQL file containing SQL commands: IOCCC-data.sql.zip Here is a zip file of a SQLite database, presumably built from the above mentioned SQL file: IOCCC-data.sqlite.zip

Any idea how to deal with this information? I actually wonder if there's a sql to json tool out there but if not any recommended tools to look at it under macOS?

Of course in my then still tired head I was sillily thinking of the database itself which would make no sense at all. It's the commands which I see you also say.

Nevertheless at least in some systems it seems to have syntax errors. It also seems to with converters I found.

I haven't tried the sqlite file though.

I want to do a bit more before my zoom meeting but my hands are so cold that it's difficult to type (I don't get cold that easily generally except for my extremities which are very often cold). Anyway I have a zoom meeting at the hour and shortly after that I'll probably leave for the day. Hopefully tomorrow I can do a bit more and maybe we can work out the problem of not being able to clone this repo.

If you do have to recreate the repo please keep this one here so we can go back for details in the four issues. Of course tomorrow I need to do some other things that I'm not looking forward to but it's for good reasons and they need to be done anyway. Still I should be able to look at some of this and hopefully with a clearer head. Was a pretty rough night.

lcn2 commented 1 year ago

If the SQL files are not directly usable, one may be able to extract the text from the SQL directives in order to build the individual JSON files.

We effectively need to split up the SQL data into individual chunks: some for people who won, some for each winning entry.

xexyl commented 1 year ago

I think this will be helpful. I also think it might be possible to write a rudimentary C tool that does it.

I will reply to all the comments tomorrow.

lcn2 commented 1 year ago

You know what's REALLY cool about this one?

We can use OUR json parser jparse to validate the files! That's probably the most useful part of it as a tool and not the library - it can validate JSON!

Now I will reply to the rest of the comment next (perhaps in between replying to other comments as well).

That was the idea for the general parser. We will create our own style of chkentry (some other name of course) that performed semantic checks too on these new JSON files.
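A semantic check of the kind described here might look roughly like the following Python sketch. It is only an illustration of the idea and not the planned tool: the function name `check_winner` is hypothetical, and the set of required members is taken from the members proposed earlier in this issue, which may change. Syntax validation would still be done by jparse; this only looks at the already parsed data.

```python
# Members proposed in this issue and the JSON type each should have
# (an assumption based on the examples given above).
REQUIRED = {
    "IOCCC_contest": str,
    "IOCCC_year": int,
    "dirname": str,
    "entry_name": str,
    "award": str,
    "author_list": list,
}


def check_winner(members):
    """Return a list of problems found in one parsed per-directory JSON file."""
    errors = []
    for name, kind in REQUIRED.items():
        if name not in members:
            errors.append(f"missing member: {name}")
        elif not isinstance(members[name], kind):
            errors.append(f"wrong type for member: {name}")
    if errors:
        return errors
    # entry_name must be the year and the directory name joined by "_".
    expected = f'{members["IOCCC_year"]}_{members["dirname"]}'
    if members["entry_name"] != expected:
        errors.append(f"entry_name should be {expected}")
    if not members["author_list"]:
        errors.append("author_list is empty")
    return errors
```

An empty list means the file passed every check in this sketch.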

xexyl commented 1 year ago

You know what's REALLY cool about this one? We can use OUR json parser jparse to validate the files! That's probably the most useful part of it as a tool and not the library - it can validate JSON! Now I will reply to the rest of the comment next (perhaps in between replying to other comments as well).

That was the idea for the general parser. We will create our own style of chkentry (some other name of course) that performed semantic checks too on these new JSON files.

Makes sense (the first part) - and I recall something like this as well. And the second part sounds like a good idea too.

As for taking the sql file and turning it into json files I have some ideas. Mostly a hack (or perhaps hacks depending on the number of ways).

I'm afraid I still am not clear on everything though so I won't really be doing anything of it today. Even so I hope I can do more tomorrow.

I think I finally have the other issue resolved - until the domains have transferred.

Tomorrow I'll be a bit busy part of the day too - with things that can't be avoided but again for good reasons.

Going for the other threads now.

xexyl commented 1 year ago

Done with that .. now as far as templates for the json files.

Any ideas what each one should look like?

As for the hack I thought of that might allow for generating some of the data from the sql file it would kind of just read the line by pattern (insert into ...) and then parse the data to insert. That part might be more problematic though because json types are different from sql.
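The pattern-based hack described above could be sketched in Python as follows. This is an untested-against-the-real-file illustration that assumes each INSERT statement sits on one line in the form `INSERT INTO table VALUES(...);`, that strings use single quotes with `''` as the escape, and that bare tokens are either NULL or numbers. The function names are made up for this example.

```python
import json
import re

INSERT_RE = re.compile(
    r"^\s*INSERT\s+INTO\s+(\w+)\s+VALUES\s*\((.*)\)\s*;\s*$",
    re.IGNORECASE | re.DOTALL,
)


def convert_bare(token):
    """Map an unquoted SQL token to a JSON-friendly Python value."""
    if token.upper() == "NULL":
        return None
    for kind in (int, float):
        try:
            return kind(token)
        except ValueError:
            pass
    return token


def parse_values(body):
    """Split the text inside VALUES(...) into a list of Python values."""
    values = []
    i, n = 0, len(body)
    while i < n:
        while i < n and body[i].isspace():
            i += 1
        if i >= n:
            break
        if body[i] == "'":
            # Quoted string: '' inside the quotes is an escaped quote.
            i += 1
            chars = []
            while i < n:
                if body[i] == "'":
                    if body[i + 1 : i + 2] == "'":
                        chars.append("'")
                        i += 2
                        continue
                    i += 1
                    break
                chars.append(body[i])
                i += 1
            values.append("".join(chars))
        else:
            # Bare token: runs up to the next comma.
            j = body.find(",", i)
            if j == -1:
                j = n
            values.append(convert_bare(body[i:j].strip()))
            i = j
        # Skip the comma that separates this value from the next.
        while i < n and body[i].isspace():
            i += 1
        if i < n and body[i] == ",":
            i += 1
    return values


def parse_insert(line):
    """Return (table, values) for an INSERT line, or None if it is not one."""
    match = INSERT_RE.match(line)
    if match is None:
        return None
    return match.group(1), parse_values(match.group(2))


line = "INSERT INTO years VALUES(2020,27,NULL,'It''s an example');"
print(json.dumps(parse_insert(line)))
```

The `years` row in the usage line is invented for the example. Because NULL becomes `None` and numbers become `int` or `float`, `json.dumps` takes care of the type differences between SQL and JSON for these simple cases.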

Database news

I'm not sure what to do about it. Do we need a json file for it? I guess not?

Database person

The fact I added an alt url for each author means that we probably need another member presumably alt_url but where that belongs I don't know - I guess the .auth.json file. But that doesn't resolve the fact that we need a template for each one. The persons database would of course end up being an array of authors - unless it's meant to be for each entry and not an entire list? What is needed?

Database mirrors

The mirrors it's suggested you don't do that now. But:

INSERT INTO mirrors VALUES(7,'Extraterrestrial','<A HREF="http://www.seti.org">SETI</A> is looking for some sites :-)','','',NULL,NULL,NULL,NULL,1);

is hilarious! Though I am very against trying to find life on other planets as we would just destroy it - just as we would destroy other planets if we went there and I don't think we have that right (not that that would ever stop humans). Still it's a hilarious thought so thank you for the laugh there!

Database judges

The judges one is easy enough once we know what information to include but I'm not sure what file it should be in nor am I sure if we want specific information not there (as noted). Obviously I would have to only have you and Leo.

Database year

As far as years go I guess it would be an array of something like (with the names of the members chosen arbitrarily):

"year": 2020,
"contest_number": 27,
"name" : "...",

Or should the contest number be a string for e.g. "27th" ?

Database winners

As far as winners I guess this is different from the persons database in that it just is a list of winners and not their information. Is that right?

What about the fact that anonymous authorship is no longer allowed? I guess the previous anonymous will just be called "anonymous" ?

Database prizes

As for prizes I guess it's also an array of the different information but it seems like it might be for the list of winning entries - not a yearly basis (even if the year is there). Is that right?

Database resource

As for resource: it seems it'll also have to be an array. It seems like it's a more complicated manifest in the .info.json files.

Database alphabet_headers

This is an easy one of course.

General questions / thoughts

Should any not be used?

Should any be merged?

What to do about dead links in any of these? In some cases we might use the Internet Wayback Machine but what if there weren't any captures ?

What format / layout (ordering) do we want?

And the templates: what should each look like? I guess that's the most important thing.

Aside

You might have noticed I went from bottom to top .. I've been known also to read articles backwards with improved reading comprehension. No I can't explain that one. I also read sideways, backwards, diagonally, mirrored and other things as well as being able to (though usually I am too tired to do it) read in blocks and scan pages very very quickly finding information that I need (even if I don't know exactly what I'm looking for). I also open books to where I left off - or where the information is that I'm after. Amongst other things :-) .. definitely some gifts but gifts I love.

... going to do other things now. I can reply later depending on when you reply. Else tomorrow.

I hope you got some sleep! I did leave a quick pull request but it unfortunately went to winner.git even though I tried your idea that I thought would actually work.

Good day!

lcn2 commented 1 year ago

See comment 1421609887 for a suggestion on making progress on this issue.

lcn2 commented 1 year ago

Every winning directory needs a JSON file that describes the winning entry. It does not need an auth.json file or info.json file though, right ?

The .auth.json and the .info.json files from a new IOCCC submission will not appear in the final winner directory. Certainly information from those submitted files will be used in generating JSON files for the website, but those particular files will not be used. Moreover, the JSON files this issue is talking about will be slightly different in format. They will have some new values and will omit some of the values that were in the submitted entry JSON files.

lcn2 commented 1 year ago

So they should be a dot file then?

The reason why we want to call the per-directory JSON file something that starts with a "." (such as perhaps .winner.json - we are not firm on the exact filename, BTW, just that it will start with ".") is that submitted entries are not allowed to have files that begin with "." other than the JSON files that mkiocccentry creates.

lcn2 commented 1 year ago

You refer to the MANIFEST file in each year's directory, right ?

No, that file will likely go away. We refer to some elements in the new per-directory JSON file that will contain a manifest similar to how .info.json does for the submitted entries.

lcn2 commented 1 year ago

If the author wins more than one entry in a year the dir name would be like usual but what about entry name? Just add the number of the entry (for the winner) to the entry_name ?

Yes.

lcn2 commented 1 year ago

In other words for 2020 my entries:

ferguson1 ferguson2 which might become:

ferguson1

ferguson2

Right ?

Correct.

lcn2 commented 1 year ago

Now this is for the contest as a whole (more than one year) or just each year? Is it an idea to have one for both?

The author handle will be for the contest as a whole.

It is why we ask people submitting new entries if they've won before and if so, what is their handle. This will help us match up a new winning entry with somebody who has previously won so we can add their new win into the list of their previous winners.

lcn2 commented 1 year ago

Any idea how to deal with this information? I actually wonder if there's a sql to json tool out there but if not any recommended tools to look at it under macOS?

We don't know. If we were to have to do it, we would take the SQL commands and use an editor to turn the appropriate text into stuff we want for the JSON files.

There may be a better way to do this, and your attempting to do something with the SQL commands or the SQL database might be more useful. It depends on how comfortable you are in working with SQL databases.
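If the SQLite database file turns out to be usable, Python's standard sqlite3 module can read it without importing the SQL commands anywhere. The following sketch dumps every table into a dictionary that can be written out as JSON. The function name `dump_tables` is hypothetical, and nothing is assumed about the column names in the real database.

```python
import json
import sqlite3


def dump_tables(conn):
    """Return {table_name: [row_dict, ...]} for every table in the database."""
    conn.row_factory = sqlite3.Row
    names = [
        row["name"]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    dump = {}
    for name in names:
        # Table names come from the database itself, so quote them.
        rows = conn.execute(f'SELECT * FROM "{name}"')
        dump[name] = [dict(row) for row in rows]
    return dump


def dump_as_json(conn):
    """Serialize the dump; SQL NULL becomes JSON null."""
    return json.dumps(dump_tables(conn), indent=4)
```

Assuming the zip file has been extracted to IOCCC-data.sqlite, `dump_as_json(sqlite3.connect("IOCCC-data.sqlite"))` would give one large JSON document that could then be split by hand or by script into per-author and per-entry files. Any BLOB columns would need extra handling, since `json.dumps` cannot serialize bytes.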

lcn2 commented 1 year ago

We believe we have addressed all of the current questions that still need answering at this time. If we've missed something or something else needs to be clarified, please ask again.

xexyl commented 1 year ago

See comment 1421609887 for a suggestion on making progress on this issue.

I'm not yet clear on how that comment will help this one unless you mean that solving this issue will simplify that one. Mind clarifying?

xexyl commented 1 year ago

Every winning directory needs a JSON file that describes the winning entry. It does not need an auth.json file or info.json file though, right ?

The .auth.json and the .info.json files from a new IOCCC submission will not appear in the final winner directory. Certainly information from those submitted files will be used in generating JSON files for the website, but those particular files will not be used. Moreover, the JSON files this issue is talking about will be slightly different in format. They will have some new values and will omit some of the values that were in the submitted entry JSON files.

Okay. So I guess we should figure out what they should contain then (which I guess is part of the template suggestion).

xexyl commented 1 year ago

So they should be a dot file then?

The reason why we want to call the per-directory JSON file something that starts with a "." (such as perhaps .winner.json - we are not firm on the exact filename, BTW, just that it will start with ".") is that submitted entries are not allowed to have files that begin with "." other than the JSON files that mkiocccentry creates.

Right. That makes sense. Thanks for the reminder.

Does this apply to all json files? And will there be any other dot files ?

xexyl commented 1 year ago

You refer to the MANIFEST file in each year's directory, right ?

No, that file will likely go away. We refer to some elements in the new per-directory JSON file that will contain a manifest similar to how .info.json does for the submitted entries.

Ah. Well that's good that there aren't that many then. Also that might change some of at least one of my comments in another thread.

xexyl commented 1 year ago

If the author wins more than one entry in a year the dir name would be like usual but what about entry name? Just add the number of the entry (for the winner) to the entry_name ?

Yes.

Okay. Though of course I no longer know the context entirely. I'll look at it another time. Right now trying to get through comments and that (besides the commits I made earlier on) is probably all I'll do today here. Was a pretty rough night.

xexyl commented 1 year ago

Now this is for the contest as a whole (more than one year) or just each year? Is it an idea to have one for both?

The author handle will be for the contest as a whole.

It is why we ask people submitting new entries if they've won before and if so, what is their handle. This will help us match up a new winning entry with somebody who has previously won so we can add their new win into the list of their previous winners.

The author_handle and past_winner then.

xexyl commented 1 year ago

Any idea how to deal with this information? I actually wonder if there's a sql to json tool out there but if not any recommended tools to look at it under macOS?

We don't know. If we were to have to do it, we would take the SQL commands and use an editor to turn the appropriate text into stuff we want for the JSON files.

There may be a better way to do this, and your attempting to do something with the SQL commands or the SQL database might be more useful. It depends on how comfortable you are in working with SQL databases.

Well the getting it to import into a database did not work. It might if I knew more of sql but I have a loathing for it and since I have never really had to use it I try avoiding it. The syntax is horrid (though I guess compared to some other languages like php ....).

I'm not sure if it's feasible to parse the sql commands or not. What kind of thoughts did you have with using vim (I guess that's what you mean) to turn the appropriate text into a json file ?

But of course before this can be done we need a format so that has to be decided first. That might make it easier. Then again it might make it harder but we won't know until it's been decided (which of course we need to do).

lcn2 commented 1 year ago

Any idea how to deal with this information? I actually wonder if there's a sql to json tool out there but if not any recommended tools to look at it under macOS?

We don't know. If we were to have to do it, we would take the SQL commands and use an editor to turn the appropriate text into stuff we want for the JSON files.

There may be a better way to do this, and your attempting to do something with the SQL commands or the SQL database might be more useful. It depends on how comfortable you are in working with SQL databases.

Well the getting it to import into a database did not work. It might if I knew more of sql but I have a loathing for it and since I have never really had to use it I try avoiding it. The syntax is horrid (though I guess compared to some other languages like php ....).

I'm not sure if it's feasible to parse the sql commands or not. What kind of thoughts did you have with using vim (I guess that's what you mean) to turn the appropriate text into a json file ?

But of course before this can be done we need a format so that has to be decided first. That might make it easier. Then again it might make it harder but we won't know until it's been decided (which of course we need to do).

Probably the most valuable thing in the SQL commands is the information about authors' names, email addresses, etc. We might have to figure out the SQL command syntax and form the data by using vim.

xexyl commented 1 year ago

Any idea how to deal with this information? I actually wonder if there's a sql to json tool out there but if not any recommended tools to look at it under macOS?

We don't know. If we were to have to do it, we would take the SQL commands and use an editor to turn the appropriate text into stuff we want for the JSON files.

There may be a better way to do this, and your attempting to do something with the SQL commands or the SQL database might be more useful. It depends on how comfortable you are in working with SQL databases.

Well the getting it to import into a database did not work. It might if I knew more of sql but I have a loathing for it and since I have never really had to use it I try avoiding it. The syntax is horrid (though I guess compared to some other languages like php ....). I'm not sure if it's feasible to parse the sql commands or not. What kind of thoughts did you have with using vim (I guess that's what you mean) to turn the appropriate text into a json file ? But of course before this can be done we need a format so that has to be decided first. That might make it easier. Then again it might make it harder but we won't know until it's been decided (which of course we need to do).

Probably the most valuable thing in the SQL commands is the information about authors' names, email addresses, etc. We might have to figure out the SQL command syntax and form the data by using vim.

Makes sense ... though maybe it would be possible to do something with awk? I'm not sure. Problem is the different types of data I would guess.

Anyway yes if you could generate starting json files like you suggested that would be a great help - but of course again we have to come up with a format first.

lcn2 commented 1 year ago

Makes sense ... though maybe it would be possible to do something with awk? I'm not sure. Problem is the different types of data I would guess.

Sure on your using awk on the SQL commands.

BTW: The most important information in those SQL commands is data about the winning authors and what the authors won.

Anyway yes if you could generate starting json files like you suggested that would be a great help - but of course again we have to come up with a format first.

We will do that for the .winner.json files for each entry directory.

Please work on collecting information about who won (name, Email address, country, association, etc.) and the entries they won.

xexyl commented 1 year ago

Makes sense ... though maybe it would be possible to do something with awk? I'm not sure. Problem is the different types of data I would guess.

Sure on your using awk on the SQL commands.

Well I'm not sure how easy it would be unless the operator is , and we simply print it all in strings (and we can then just say through sed update number strings to be numbers, for example). That's just a loose thought.

A crazy thought popped into my head - using flex would also work well but might be overkill for a one-time use.

BTW: The most important information in those SQL commands is data about the winning authors and what the authors won.

Thanks.

Anyway yes if you could generate starting json files like you suggested that would be a great help - but of course again we have to come up with a format first.

We will do that for the .winner.json files for each entry directory.

Thanks.

Please work on collecting information about who won (name, Email address, country, association, etc.) and the entries they won.

In what format does it need to be?

lcn2 commented 1 year ago

Makes sense ... though maybe it would be possible to do something with awk? I'm not sure. Problem is the different types of data I would guess.

Sure on your using awk on the SQL commands.

Well I'm not sure how easy it would be unless the operator is , and we simply print it all in strings (and we can then just say through sed update number strings to be numbers, for example). That's just a loose thought.

A crazy thought popped into my head - using flex would also work well but might be overkill for a one-time use.

BTW: The most important information in those SQL commands is data about the winning authors and what the authors won.

Thanks.

Anyway yes if you could generate starting json files like you suggested that would be a great help - but of course again we have to come up with a format first.

We will do that for the .winner.json files for each entry directory.

Thanks.

Please work on collecting information about who won (name, Email address, country, association, etc.) and the entries they won.

In what format does it need to be?

A JSON file under the name of "author handle".json. There will be a top level authors/ directory that will contain all such JSON files.

BTW: This is why we formed POSIX-safe author handles in .auth.json files and why mkiocccentry asked if an author is a previous winner.

BTW: There will be a way for someone to determine their author handle if they won before.

The format for such winning author JSON files is TBD, but one can guess that they will draw on parts of the .auth.json format.

For now, try to extract any information you can about people who won the IOCCC from the SQL files.

xexyl commented 1 year ago

Makes sense ... though maybe it would be possible to do something with awk? I'm not sure. Problem is the different types of data I would guess.

Sure on your using awk on the SQL commands.

Well I'm not sure how easy it would be unless the operator is , and we simply print it all in strings (and we can then just say through sed update number strings to be numbers, for example). That's just a loose thought. A crazy thought popped into my head - using flex would also work well but might be overkill for a one-time use.

BTW: The most important information in those SQL commands is data about the winning authors and what the authors won.

Thanks.

Anyway yes if you could generate starting json files like you suggested that would be a great help - but of course again we have to come up with a format first.

We will do that for the .winner.json files for each entry directory.

Thanks.

Please work on collecting information about who won (name, Email address, country, association, etc.) and the entries they won.

In what format does it need to be?

A JSON file under the name of "author handle".json. There will be a top level authors/ directory that will contain all such JSON files.

How will I know the author handle? I guess it's like the summary file but removing the numbers when they have more than one winning entry in a year. What about the fact that there are probably more than one anonymous author of the past ?

BTW: This is why we formed POSIX-safe author handles in .auth.json files and why mkiocccentry asked if an author is a previous winner.

Of course and also for storing it on disk!

BTW: There will be a way for someone to determine their author handle if they won before.

Yes and that's actually a TODO item in mkiocccentry.c.

The format for such winning author JSON files is TBD, but one can guess that they will draw on parts of the auth.json format.

Sounds good.

For now, try to extract any information you can about people who won the IOCCC from the SQL files.

I'll see what I can do. I'm hoping to finish something else up first but it's proving to be annoying in another way so I might not. Whether I'll do the sql thing now or not I don't know. Maybe I can look at it. I have to find the file again and then extract the file so I can look at the sql file.

xexyl commented 1 year ago

Which tables are important?


$ grep 'TABLE' ~/Downloads/IOCCC-data.sql
CREATE TABLE alphabet_headers (
CREATE TABLE resource (
CREATE TABLE prizes (
CREATE TABLE winners (
CREATE TABLE years (
CREATE TABLE judges (
CREATE TABLE mirrors (
CREATE TABLE person (
CREATE TABLE news (

Or is it better to ask which tables should be ignored?

UPDATE 0

person is important and I believe winners might be also but what I'm getting at is for this specific request you have in comment https://github.com/ioccc-src/temp-test-ioccc/issues/4#issuecomment-1424457231 which ones do you think are important?

xexyl commented 1 year ago

Also what to do about dead links in all of what we're doing ?

lcn2 commented 1 year ago

So they should be a dot file then?

The reason why we want to call the per-directory JSON file something that starts with a "." (such as perhaps .winner.json - we are not firm on the exact filename, BTW, just that it will start with ".") is that submitted entries are not allowed to have files that begin with "." other than the JSON files that mkiocccentry creates.

Right. That makes sense. Thanks for the reminder.

Does this apply to all json files?

The authors/an.author.handle.json files won't start with dot.

And will there be any other dot files ?

There might not be any other such files, but we have reserved such ".filenames" in case we do need to do so. And there might be a need for a directory that starts with "." .. but that is TBD.

xexyl commented 1 year ago

So they should be a dot file then?

The reason why we want to call the per-directory JSON file something that starts with a "." (such as perhaps .winner.json - we are not firm on the exact filename, BTW, just that it will start with ".") is that submitted entries are not allowed to have files that begin with "." other than the JSON files that mkiocccentry creates.

Right. That makes sense. Thanks for the reminder. Does this apply to all json files?

The authors/an.author.handle.json files won't start with dot.

And will there be any other dot files ?

There might not be any other such files, but we have reserved such ".filenames" in case we do need to do so. And there might be a need for a directory that starts with "." .. but that is TBD.

This all makes sense. Thanks for clarifying!

lcn2 commented 1 year ago

Also what to do about dead links in all of what we're doing ?

Those will have to be fixed.

xexyl commented 1 year ago

Also what to do about dead links in all of what we're doing ?

Those will have to be fixed.

Of course but define 'fixed'.

Removed? I guess the real answer is try finding it on the Internet Wayback Machine but if it fails then remove it. Is that right?

lcn2 commented 1 year ago

Which tables are important?


$ grep 'TABLE' ~/Downloads/IOCCC-data.sql
CREATE TABLE alphabet_headers (
CREATE TABLE resource (
CREATE TABLE prizes (
CREATE TABLE winners (
CREATE TABLE years (
CREATE TABLE judges (
CREATE TABLE mirrors (
CREATE TABLE person (
CREATE TABLE news (

Or is it better to ask which tables should be ignored?

UPDATE 0

person is important and I believe winners might be also but what I'm getting at is for this specific request you have in comment https://github.com/ioccc-src/temp-test-ioccc/issues/4#issuecomment-1424457231 which ones do you think are important?

The SQL data is a bit beyond our reach (due to 🚽🧻), so we cannot advise on specifics.

As we understand, the intent was to create a SQLite database and a tool that generated all of the web page links.

Within that SQL command set is the setting of all of the author information. You might look for your own data and see where it ends up. This is why we suggested exploring the SQL command file with vim.

xexyl commented 1 year ago

Which tables are important?

$ grep 'TABLE' ~/Downloads/IOCCC-data.sql
CREATE TABLE alphabet_headers (
CREATE TABLE resource (
CREATE TABLE prizes (
CREATE TABLE winners (
CREATE TABLE years (
CREATE TABLE judges (
CREATE TABLE mirrors (
CREATE TABLE person (
CREATE TABLE news (

Or is it better to ask which tables should be ignored?

UPDATE 0

person is important and I believe winners might be also but what I'm getting at is for this specific request you have in comment https://github.com/ioccc-src/temp-test-ioccc/issues/4#issuecomment-1424457231 which ones do you think are important?

The SQL data is a bit beyond our reach (due to 🚽🧻), so we cannot advise on specifics.

As we understand, the intent was to create a SQLite database and a tool that generated all of the web page links.

Within that SQL command set is the setting of all of the author information. You might look for your own data and see where it ends up. This is why we suggested exploring the SQL command file with vim.

I did indeed look at it in vim .. and I have something that might be useful. I just made each table a separate sql file - without the COMMIT; line.

I did not look at the sqlite file though maybe I should have. I know less of it than I do sql (which is little .. thankfully).

Anyway for when you're able to look at the computer this might be of use to you too as it'll make it easier to extract individual types of data and put it into the correct file.

ioccc.sql.zip

Of course one might want to make sure that nothing is removed by accident!
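
A minimal sketch of that per-table split, with a check that no line is lost along the way. It assumes the dump is line-oriented, with each statement starting on a `CREATE TABLE name (` or `INSERT INTO name ...` line and the transaction wrapped in `BEGIN TRANSACTION;` / `COMMIT;`. The function names `split_dump` and `write_tables` are made up for this illustration, and the output files are named after the table (so `alphabet_headers.sql` rather than `alphabet.sql`).

```python
import re
from collections import defaultdict
from pathlib import Path

# A statement is assumed to start with one of these two keywords.
STATEMENT_RE = re.compile(r'^(?:CREATE TABLE|INSERT INTO)\s+"?(\w+)"?', re.IGNORECASE)
# Transaction wrapper lines that are left out of the per-table files.
SKIPPED = {"BEGIN TRANSACTION;", "COMMIT;"}


def split_dump(sql_text):
    """Group the lines of an SQL dump by the table they belong to.

    Returns (tables, skipped): a dict mapping table name to its lines,
    and the list of wrapper lines that were deliberately left out.
    """
    tables = defaultdict(list)
    skipped = []
    current = "_preamble"  # bucket for anything before the first table
    for line in sql_text.splitlines():
        if line.strip() in SKIPPED:
            skipped.append(line)
            continue
        match = STATEMENT_RE.match(line)
        if match:
            current = match.group(1)
        # Continuation lines (columns, closing paren) stay with the current table.
        tables[current].append(line)
    return dict(tables), skipped


def write_tables(tables, out_dir):
    """Write one <table>.sql file per table into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, lines in tables.items():
        (out / f"{name}.sql").write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Comparing `sum(len(v) for v in tables.values()) + len(skipped)` against the number of lines in the original file is a cheap way to confirm that nothing was removed by accident.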

xexyl commented 1 year ago

Now as far as specific tables.

It seems that resource.sql will be useful for any manifests of each entry.

It seems that person.sql will be useful for the author json files.

It seems that prizes.sql will be useful for reward titles (also includes the author name, year and some other information).

It seems that years.sql might be useful for years.html.

List of all the files including the original:

IOCCC-data.sql
alphabet.sql
judges.sql
mirrors.sql
news.sql
person.sql
prizes.sql
resource.sql
winners.sql
years.sql

Unfortunately my hands are cold enough where it's making it hard to type so I'm going to stop for now. Hope you feel better soon!

Oh and I removed the other judge from the judges table - but possibly not in the original file .. not sure.

lcn2 commented 1 year ago

How will I know the author handle? I guess it's like the summary file but removing the numbers when they have more than one winning entry in a year. What about the fact that there are probably more than one anonymous author of the past ?

The author handle can be seen in the top level winner.html file. Look at the HTML of your entry:

<UL>
<LI TYPE=none><A NAME="Cody_Ferguson"></A><B><a href="&#109;&#97;&#105;&#x6C;&#116;&#x6F;&#x3A;&#105;&#111;&#99;&#x63;&#x63;@&#120;&#101;&#120;&#121;l&#x2E;&#x6E;&#x65;&#x74;">Cody Boone Ferguson</a></B> -- <A HREF="https://ioccc.xexyl.net">https://ioccc.xexyl.net</a>
<UL>
<LI TYPE=square>Best use of weasel words (<A HREF="years.html#2018_ferguson">2018 ferguson</A>)
<LI TYPE=square>Most enigmatic (<A HREF="years.html#2020_ferguson2">2020 ferguson2</A>)
<LI TYPE=square>Don't tread on me award (<A HREF="years.html#2020_ferguson1">2020 ferguson1</A>)
</UL><BR>

See the "Cody_Ferguson" HTML NAME? That is your winner handle. So there would be a winner/Cody_Ferguson.json file with information about YOU as well as links to winning entries.

The strings of the form "years.html#2020_ferguson2" refer to the name tag in the top level years.html file. And thus will refer to the 2020/ferguson2/ winning entry directory (a rather fine entry if you ask us 🤓). So the file 2020/ferguson2/.winner.json will contain things like the manifest for that particular entry as well as JSON stuff for the list of authors .. which in this case will be "Cody_Ferguson".

Hopefully you begin to see how this all ties together.
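
To illustrate the tie-in, here is a rough sketch that pulls the winner handles and their winning entry directories out of HTML shaped like the snippet above. It assumes every author block starts with an `<A NAME="handle">` anchor and that entries are linked as `years.html#YYYY_dir`; `HandleParser` and `extract_handles` are names invented for this example, not existing tools.

```python
from html.parser import HTMLParser


class HandleParser(HTMLParser):
    """Collect winner handles and the entries linked under each one."""

    def __init__(self):
        super().__init__()
        self.handles = {}     # handle -> list of "YYYY/dir" strings
        self._current = None  # handle of the author block being read

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        attrs = dict(attrs)
        if attrs.get("name"):
            self._current = attrs["name"]
            self.handles.setdefault(self._current, [])
        href = attrs.get("href") or ""
        prefix = "years.html#"
        if self._current and href.startswith(prefix):
            # "2020_ferguson2" -> year "2020", directory "ferguson2"
            year, _, entry_dir = href[len(prefix):].partition("_")
            self.handles[self._current].append(f"{year}/{entry_dir}")


def extract_handles(html_text):
    """Return {handle: [entry directories]} for the given HTML text."""
    parser = HandleParser()
    parser.feed(html_text)
    parser.close()
    return parser.handles
```

Each key of the result would then name one file under winner/, and each value would list the entry directories whose .winner.json should mention that handle.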

lcn2 commented 1 year ago

Also what to do about dead links in all of what we're doing ?

Those will have to be fixed.

Of course but define 'fixed'.

Removed? I guess the real answer is try finding it on the Internet Wayback Machine but if it fails then remove it. Is that right?

Well if the file is gone, the link to such a file will need to be deleted.

If the file was renamed, the link will need to be modified.

We refer to links in the top level years.html web page, BTW.
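
A small sketch of how such dead links could be listed before deciding whether to delete or modify them. It assumes the links of interest are relative `href` values in years.html that should resolve to files under the top level of the repository; external URLs and pure `#fragment` links are skipped. `HrefCollector` and `missing_local_links` are hypothetical names used only here.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def missing_local_links(html_text, root):
    """Return the local hrefs in html_text that do not exist under root."""
    collector = HrefCollector()
    collector.feed(html_text)
    collector.close()
    missing = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        # Skip external links (http:, mailto:, ...) and "#anchor" links.
        if parts.scheme or parts.netloc or not parts.path:
            continue
        local = Path(root) / unquote(parts.path).lstrip("/")
        if not local.exists():
            missing.append(href)
    return missing
```

Something like `missing_local_links(Path("years.html").read_text(), ".")` run from the top level directory would give the list of links that need attention; whether each one is deleted or modified is still a manual decision.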

xexyl commented 1 year ago

How will I know the author handle? I guess it's like the summary file but removing the numbers when they have more than one winning entry in a year. What about the fact that there are probably more than one anonymous author of the past ?

The author handle can be seen in the top level winner.html file. Look at the HTML of your entry:

<UL>
<LI TYPE=none><A NAME="Cody_Ferguson"></A><B><a href="&#109;&#97;&#105;&#x6C;&#116;&#x6F;&#x3A;&#105;&#111;&#99;&#x63;&#x63;@&#120;&#101;&#120;&#121;l&#x2E;&#x6E;&#x65;&#x74;">Cody Boone Ferguson</a></B> -- <A HREF="https://ioccc.xexyl.net">https://ioccc.xexyl.net</a>
<UL>
<LI TYPE=square>Best use of weasel words (<A HREF="years.html#2018_ferguson">2018 ferguson</A>)
<LI TYPE=square>Most enigmatic (<A HREF="years.html#2020_ferguson2">2020 ferguson2</A>)
<LI TYPE=square>Don't tread on me award (<A HREF="years.html#2020_ferguson1">2020 ferguson1</A>)
</UL><BR>

See the "Cody_Ferguson" HTML NAME? That is your winner handle. So there would be a winner/Cody_Ferguson.json file with information about YOU as well as links to winning entries.

Oh I thought the winner handle would be the directory name for some reason though it makes sense to have the full name esp if more than one person has the same surname. However I want to change my winner handle then. Want my middle name in there as well. I can do that sometime soon.

The strings of the form "years.html#2020_ferguson2" refer to the name tag in the top level years.html file. And thus will refer to the 2020/ferguson2/ winning entry directory (a rather fine entry if you ask us 🤓). So the file 2020/ferguson2/.winner.json will contain things like the manifest for that particular entry as well as JSON stuff for the list of authors .. which in this case will be "Cody_Ferguson".

Right. Anchor tag.

Will that file also include a link to the entry? That seems like a good idea to me but maybe you have another idea in mind.

Hopefully you begin to see how this all ties together.

Yes and no. Yes in that I see how it's useful but no in that I'm not sure of how having the json files will be of help - yet. I guess it depends on the tools you have in mind to make as well as the format we come up with.

lcn2 commented 1 year ago

Oh I thought the winner handle would be the directory name for some reason though it makes sense to have the full name esp if more than one person has the same surname. However I want to change my winner handle then. Want my middle name in there as well. I can do that sometime soon.

Eventually, by moving winner/Cody_Ferguson.json to another file in the winner/ directory AND by modifying the "winning author information" in the 3 .winner.json files in the 3 winning directories, that can easily happen.

xexyl commented 1 year ago

Oh I thought the winner handle would be the directory name for some reason though it makes sense to have the full name esp if more than one person has the same surname. However I want to change my winner handle then. Want my middle name in there as well. I can do that sometime soon.

Eventually, by moving winner/Cody_Ferguson.json to another file in the winner/ directory AND by modifying the "winning author information" in the 3 .winner.json files in the 3 winning directories, that can easily happen.

And hopefully >3 files! Though I have to say that what I've done for the contest kind of surpasses the winning entries as you can probably imagine.

But I don't mind what the handle is .. I just was thinking it was. It's good that you pointed it out because I would have changed the default handle to be 'ferguson'. That does bring up a point though.

The default handle in the json files is made all lower case. Should that happen with the html file too ?

Speaking of such: I updated my winner handle in commit cc677fc6be1047d0f1e7c6fe89fdc91fe065039f.

lcn2 commented 1 year ago

Oh I thought the winner handle would be the directory name for some reason though it makes sense to have the full name esp if more than one person has the same surname. However I want to change my winner handle then. Want my middle name in there as well. I can do that sometime soon.

Eventually, by moving winner/Cody_Ferguson.json to another file in the winner/ directory AND by modifying the "winning author information" in the 3 .winner.json files in the 3 winning directories, that can easily happen.

UPDATE 0

The tool that checks the contents of the .winner.json files (one that will be an "analog" of the chkentry JSON semantic checker tool) will also need to access the corresponding winner handle files that live under winner/some_winner_handle_name.json .. opening them up, calling the jparse function on those files, and then inspecting them for a reference back to the winning directory.

This is why it was important that the JSON parser be callable multiple times on multiple files within a single program.
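
As a rough illustration of that cross-check (not the eventual tool, which would presumably be written in C on top of jparse), here is a Python sketch. Since the formats are still TBD, the field names are pure assumptions: `authors` in .winner.json is taken to be a list of winner handles, and `winning_entries` in winner/HANDLE.json a list of "YYYY/dir" strings. Each `json.loads` call stands in for one more invocation of the parser within the same program.

```python
import json
from pathlib import Path


def check_entry(root, entry_dir):
    """Check that every author of entry_dir refers back to that entry.

    root is the top level directory; entry_dir is a string such as
    "2020/ferguson2". Returns a list of problem descriptions (empty if OK).
    """
    root = Path(root)
    problems = []
    entry_file = root / entry_dir / ".winner.json"
    entry = json.loads(entry_file.read_text(encoding="utf-8"))
    for handle in entry.get("authors", []):  # hypothetical field name
        author_file = root / "winner" / f"{handle}.json"
        if not author_file.is_file():
            problems.append(f"{entry_dir}: missing winner/{handle}.json")
            continue
        # A second (third, ...) parse within the same program run.
        author = json.loads(author_file.read_text(encoding="utf-8"))
        if entry_dir not in author.get("winning_entries", []):  # hypothetical
            problems.append(f"{handle}: no reference back to {entry_dir}")
    return problems
```

The files are parsed one after another rather than concurrently, so what matters is that the parser can be called repeatedly in one process.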

xexyl commented 1 year ago

Oh I thought the winner handle would be the directory name for some reason though it makes sense to have the full name esp if more than one person has the same surname. However I want to change my winner handle then. Want my middle name in there as well. I can do that sometime soon.

Eventually, by moving winner/Cody_Ferguson.json to another file in the winner/ directory AND by modifying the "winning author information" in the 3 .winner.json files in the 3 winning directories, that can easily happen.

UPDATE 0

The tool that checks the contents of the .winner.json files (one that will be an "analog" of the chkentry JSON semantic checker tool) will also need to access the corresponding winner handle files that live under winner/some_winner_handle_name.json .. opening them up, calling the jparse function on those files, and then inspecting them for a reference back to the winning directory.

This is why it was important that the JSON parser be callable multiple times on multiple files within a single program.

So the obvious question is: do they have to run at the same time or in sequence?

The parser/scanner is re-entrant but it might be nice if we had a good way to test that it works out well.

UPDATE 0

Ah .. reread it. Okay so that re-entrancy is not important. But still would be nice if we could come up with a test case.

xexyl commented 1 year ago

Leaving for the day but before I do a couple notes.

In comment https://github.com/ioccc-src/temp-test-ioccc/issues/3#issuecomment-1424875758 I generated a list of files that are listed in the years.html file that do not exist (didn't include the output .. just included the simple script I wrote that does it, originally as a long one liner). This comment has some questions for you on some files. I think I know what needs to be done but before I do those changes I am asking. That comment also gave a possible way to generate a list of files in json format which is why I've noted it here too.

Leaving for the day ... hope you feel better soon! Good day and good night! Hope you get a better rest tonight.

lcn2 commented 1 year ago

Done with that .. now as far as templates for the json files.

Any ideas what each one should look like?

As for the hack I thought of that might allow for generating some of the data from the sql file it would kind of just read the line by pattern (insert into ...) and then parse the data to insert. That part might be more problematic though because json types are different from sql.
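
A minimal sketch of that hack, assuming each statement sits on a single line of the form `INSERT INTO table VALUES(...);` and that the values are only quoted strings, `NULL`, and plain numbers. The type mismatch is handled by mapping SQL strings to JSON strings, `NULL` to `null`, and numbers to JSON numbers. The function names and the column names passed in are placeholders, since the real column names would have to come from the `CREATE TABLE` statements.

```python
import json
import re

# One whole single-line statement: table name plus the raw value list.
INSERT_RE = re.compile(r"^INSERT INTO (\w+) VALUES\((.*)\);\s*$")
# One value: 'string' (with '' as an escaped quote), NULL, or a number,
# followed by a comma or the end of the value list.
VALUE_RE = re.compile(r"\s*(?:'((?:[^']|'')*)'|(NULL)|(-?\d+(?:\.\d+)?))\s*(?:,|$)")


def parse_insert(line):
    """Return (table, values) for one single-line INSERT statement."""
    match = INSERT_RE.match(line)
    if match is None:
        raise ValueError("not a single-line INSERT statement")
    table, body = match.groups()
    values, pos = [], 0
    while pos < len(body):
        token = VALUE_RE.match(body, pos)
        if token is None:
            raise ValueError(f"cannot parse value at offset {pos}")
        text, null, number = token.groups()
        if text is not None:
            values.append(text.replace("''", "'"))  # SQL string -> str
        elif null is not None:
            values.append(None)                     # NULL -> JSON null
        else:
            values.append(float(number) if "." in number else int(number))
        pos = token.end()
    return table, values


def insert_to_json(line, columns):
    """Pair the values with the given column names and emit JSON text."""
    _table, values = parse_insert(line)
    if len(columns) != len(values):
        raise ValueError("column count does not match value count")
    return json.dumps(dict(zip(columns, values)), indent=4)
```

Anything the value pattern does not recognize raises an error instead of being guessed at, so unusual rows show up early and can be handled by hand.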

Database news

I'm not sure what to do about it. Do we need a json file for it? I guess not?

We do not need one.

Database person

The fact I added an alt url for each author means that we probably need another member presumably alt_url but where that belongs I don't know - I guess the .auth.json file. But that doesn't resolve the fact that we need a template for each one. The persons database would of course end up being an array of authors - unless it's meant to be for each entry and not an entire list? What is needed?

There will be no general database. Just a directory of JSON files, one per author.

Database mirrors

The mirrors it's suggested you don't do that now. But:

INSERT INTO mirrors VALUES(7,'Extraterrestrial','<A HREF="http://www.seti.org">SETI</A> is looking for some sites :-)','','',NULL,NULL,NULL,NULL,1);

is hilarious! Though I am very against trying to find life on other planets as we would just destroy it - just as we would destroy other planets if we went there and I don't think we have that right (not that that would ever stop humans). Still it's a hilarious thought so thank you for the laugh there!

:-)

Database judges

The judges one is easy enough once we know what information to include but I'm not sure what file it should be in nor am I sure if we want specific information not there (as noted). Obviously I would have to only have you and Leo.

Not useful.

Database year

As far as years go I guess it would be an array of something like (with the names of the members chosen arbitrarily):

"year": 2020,
"contest_number": 27,
"name": "..."

Or should the contest number be a string for e.g. "27th" ?

Just 27.

Database winners

As far as winners I guess this is different from the persons database in that it just is a list of winners and not their information. Is that right?

What about the fact that anonymous authorship is no longer allowed? I guess the previous anonymous will just be called "anonymous" ?

Yes, anonymous.

Database prizes

As for prizes I guess it's also an array of the different information but it seems like it might be for the list of winning entries - not a yearly basis (even if the year is there). Is that right?

The "prize" is an aspect of the individual .winner.json files in each winning entry directory.

Database resource

As for resource: it seems it'll also have to be an array. It seems like it's a more complicated manifest in the .info.json files.

Database alphabet_headers

This is an easy one of course.

General questions / thoughts

Should any not be used?

Should any be merged?

What to do about dead links in any of these? In some cases we might use the Internet Wayback Machine but what if there weren't any captures ?

What format / layout (ordering) do we want?

And the templates: what should each look like? I guess that's the most important thing.

Not useful info.

Aside

You might have noticed I went from bottom to top .. I've been known also to read articles backwards with improved reading comprehension. No I can't explain that one. I also read sideways, backwards, diagonally, mirrored and other things as well as being able to (though usually I am too tired to do it) read in blocks and scan pages very very quickly finding information that I need (even if I don't know exactly what I'm looking for). I also open books to where I left off - or where the information is that I'm after. Amongst other things :-) .. definitely some gifts but gifts I love.

... going to do other things now. I can reply later depending on when you reply. Else tomorrow.

I hope you got some sleep! I did leave a quick pull request but it unfortunately went to winner.git even though I tried your idea that I thought would actually work.

Good day!

See comment 1427202750.