ioccc-src / mkiocccentry

Form an IOCCC entry as a compressed tarball file

Enhancement: write JSON utilities for jfmt, jval and jnamval #523

Open lcn2 opened 11 months ago

lcn2 commented 11 months ago

TODO

other TODOs

Pre-implementation

TODOs not in the json_util_README.md writeup

Finally

See json_util_README.md for information about the JSON utilities for jparse.

xexyl commented 11 months ago

Please assign this to me and I will hopefully be able to look at it much more thoroughly tomorrow.

I am afraid that right now it is beyond my faculties. Been awake too long with too little sleep.

But hopefully tomorrow I will be clearer. I suspect that we will have to discuss some of this but we shall see.

xexyl commented 11 months ago

There's a lot to process with this issue and right now I'm not awake enough yet but I do hope to be later on. I do want to reply to one part though:

We believe that once the "temp-test-ioccc" has put the JSON parser thru its paces via tools that will use the JSON parser code, then it will be a good time to form the jparse repo.

I like this a lot! Sounds good.

I'm going to take a break now (I think) and later on I'll look at reading this and then replying. I'm not sure how to best reply yet: it might be it's necessary to reply in parts just to help us gather our thoughts as it is a long comment.

xexyl commented 11 months ago

With commit https://github.com/ioccc-src/mkiocccentry/pull/524/commits/2f9c0fb5a5df6ffb9144beccffb53852ad96c8a3 I created the initial C and header files and updated the Makefile to compile it. All it does is exit(0) for now but this way we don't have to worry about creating the files and updating the Makefile when it's time to start writing code.

I'm taking a break now and I hope to return to this issue later today ... hope you're sleeping well! I'm doubting that I'll be able to go back to sleep but I'm going to at least rest a bit.

Oh and please assign this to me when you get a chance.

xexyl commented 11 months ago

I started to write the command line parser - or rather part of it - but I noticed we should come up with a usage message first. I actually did think of this but I felt like it might be developed whilst working on the command line options. Then we could worry about the next step (or is that NeXTSTEP ? :-) ) and only after that we could write the actual code.

I also think this will be a joint effort so I changed the comment to say this is being co-developed by both of us.

However since I see the part about coming up with the usage message first I have not committed. I might do just to get it out of the way but it would be incomplete.

At this point though I am too tired to fully process the possible options and some are not clear. But it might be that it's more clear once I've read the OP in full. To start out though (I'll make a new comment for additions) am I right in saying that:

I'm not sure what the key option -k is for exactly versus the arg or args after the file name. I haven't looked at anything else but I hope this is a good start.

I'm afraid I probably won't get much more done here today but it depends on when you get back and also if I'm even awake .. finding myself having to sleep in the day lately and I leave pretty early too nowadays (as in shut the laptop down .. go to bed several hours later). Still I do hope to at least read the OP here and I also hope that we can get the discussion rolling!

xexyl commented 11 months ago

With commit https://github.com/ioccc-src/mkiocccentry/pull/525/commits/1f360485c334fc7bf9f8340c5d70a00d3f9d0f17 the standard option parsing is out of the way so that WE CAN NOW FOCUS ON THE TOOL ITSELF. Once we have discussed this thoroughly then we can start adding code for the tool!

xexyl commented 11 months ago

I've not forgotten this but I think I will have to return to it tomorrow. I do hope that the above comments and the two commits are a good start. I agree that we should come up with a usage string and doing that might give us more ideas too. It was a pretty difficult night and an early morning and I need to do other things now.

There's a lot of good material in your comment and I want to give it the attention it deserves and I don't think I can do that today. Sorry but I think it'll be better that way and I guess you'd agree too. Even so I hope we can start the conversation on usage string.

xexyl commented 11 months ago

Okay I had a chance to look at it quickly but not thoroughly. I believe that the below is enough to start the conversation but I'll have to continue tomorrow as I'll be leaving soon. I hope this is of help!

usage: jprint [-p] [-j] [-t width] [-k] [-p] [-Q] [-e] [-v] [-J] [-f type[,type,...]] [-c count_spec] [-C] [-V]
              [-L] [-l level]
    -j      print a JSON syntax values (what is a syntax value as such ?)
    -t      specify tab width
    -k      print key or value (?? how is it both? What is the idea here?)
    -p      print contents of file if valid
    -Q      print surrounding '"' in strings
    -e      print string as a JSON encoded string
    -q      quiet mode
    -v      verbosity level
    -J      JSON verbosity level
    -f      comma separated list of JSON data types to print
    -c      see man page for more details (for now)
    -C      compound (?? is this correct? What do you have in mind here?)
    -L      print depth/levels of data
    -l      only print data if at specified level (or levels ?)
    -V      print version and exit

-J conflicts with your idea but it might be better to choose something else for yours since this way the -J is consistent with what we have elsewhere?

You addressed my immediate concern when the same field is repeated but what about arrays? How will arrays be dealt with?

I put questions in the usage string too that might want to be thought about. Some options not listed by you were added but they might or might not be of value. If not they can be removed or modified or whatever is needed.

What do you think ?

lcn2 commented 11 months ago

Mini holiday here .. we hope to return today sometime to answer questions.

xexyl commented 11 months ago

Mini holiday here .. we hope to return today sometime to answer questions.

I should hope you'll be back! Hope the holiday is going well ... have fun and stay safe!

lcn2 commented 11 months ago

Mini holiday here .. we hope to return today sometime to answer questions.

I should hope you'll be back! Hope the holiday is going well ... have fun and stay safe!

Sorry, we have extended the holiday weekend: traveled early. Will get back to these topics.

xexyl commented 11 months ago

Mini holiday here .. we hope to return today sometime to answer questions.

I should hope you'll be back! Hope the holiday is going well ... have fun and stay safe!

Sorry, we have extended the holiday weekend: traveled early. Will get back to these topics.

No worries. I am dealing with some other problems and I am also very tired. I won't be able to do more here today anyway.

Best wishes.

lcn2 commented 11 months ago

We are "clearing the decks" in preparation for an important temp-test-ioccc pull request. See temp-test-ioccc comment 1567576879 for details.

We do plan to address comment 1563367357 afterwards.

xexyl commented 11 months ago

We are "clearing the decks" in preparation for an important temp-test-ioccc pull request. See temp-test-ioccc comment 1567576879 for details.

We do plan to address comment 1563367357 afterwards.

Good idea to do the merge conflict first! Having it done will help relieve some stress I am sure. I think it's a good idea how you're documenting it too so that we know how to do it in case it happens again. With that in mind maybe it should have been an issue as well, so that we could more easily access it? But I'll make a link to the pull request in the other comment in issue #5 so that we have an easier time finding it. I hope to reply to that comment now, though it might take some time and I might have to take a break, like it or not, before I finish: I'm going to try reading it and replying at the same time, as I often do.

lcn2 commented 11 months ago

Now that the temp-test-ioccc pull request 811 has been resolved, we will address comment 1563367357 and related jprint command line use cases.

... after we go to sleep and wake up and do chores and .. perhaps Thursday or Friday.

xexyl commented 11 months ago

Now that the temp-test-ioccc pull request 811 has been resolved, we will address comment 1563367357 and related jprint command line use cases.

... after we go to sleep and wake up and do chores and .. perhaps Thursday or Friday.

I'm glad you said: 'after we go to sleep ...' because it was my immediate thought when I saw the time!

I think Thursday or Friday is good too because I should be able to get the task done that I mentioned a number of times the past few days by then and with the exhaustion it's going to be slow times :( but we'll see.

Hope you're sleeping well my friend! I went to sleep a bit later and woke up earlier so it's probably going to be a difficult time in the sleepiness department later on but maybe I'll be lucky .. as Strider says to the hobbits: 'who can tell?'

lcn2 commented 11 months ago

Regarding comment 1563367357: The top level comment has an updated command line as per a number of your suggestions.

Comments, questions and corrections are, of course, welcome.

xexyl commented 11 months ago

Regarding comment 1563367357: The top level comment has an updated command line as per a number of your suggestions.

Comments, questions and corrections are, of course, welcome.

Looks great! My suggestions (which I started doing but am too tired to finish .. been awake since half past 1 and I hope to sleep after this or at least rest):

I'm sure I have other comments but I'm too tired to think right now. That being said see commit https://github.com/ioccc-src/mkiocccentry/pull/528/commits/757b14ecde8f3f883ab055ff69d61d8c6ee1116c.

Now time for a rest whether I can truly rest or not I need to try.

UPDATE 0

Quickly .. I think that maybe we can match ls in the sense of longer help string. I'll do that: later on either today or tomorrow. I'm afraid it might be tomorrow. I do have another issue to bring up in the other repo but that will be today or tomorrow. I hope to make it today actually as it should be pretty easy to bring up .. once I have a bit more energy and remember the full details: I'm vague at this time which is why it's time to close the laptop and turn the lights off and try and rest.

Do you agree that the long usage string is the way to go? Should the mkiocccentry tool also have the full synopsis in the usage string? It now shows something like [options] but lists the options below in full: just because it's so long. The man page does have the full synopsis. Well we can talk about that later ... rest and hopefully sleep time.

UPDATE 1

Making progress on this but I have to try and rest again .. if I can't rest I'll do other things so that I am still on a break here. I do think and hope to finish this part today. Hopefully I manage.

Before I do go I want to make another comment I think is useful ...

xexyl commented 11 months ago

See the TODO list in the top comment.

xexyl commented 11 months ago

I'm kind of running into a problem with the usage string and the code. Not that the code is difficult but rather the formatting and the length of it. I think that because of how hard last night was I will delay working on the rest of the usage string for now. I expect that if I don't return to it today I will get to it tomorrow.

Perhaps I can get to other things here or the other repo today too but we'll see. I would like to believe this is likely but I am afraid it probably isn't. Hope you have a good day and I'll either get back to this today or if not then tomorrow.

xexyl commented 11 months ago

As for ...

-g  grep-like extended regular expressions are used to match (def: name args are not regex)
            To match from the name beginning, start name_arg with ^
            To match to the name end, end name_arg with $
            To match the entire name, enclose name_arg between ^ and $
            Use of -g conflicts with -S

Do we want to make use of the regular expression library ? I mean if we're going to do the line boundaries should we go a bit further?

I am afraid that yes I am just too tired to do more today but I hope tomorrow I can finish the usage string and possibly come up with any additional thoughts in regards to it. I did notice some things in it that might not be entirely clear but I'm not sure if it was unclear to me or unclear full stop. I also fixed a typo (well something like it) but that's not in yet - it's in the pending changes too.

Good day!

UPDATE 0

Quickly .. did you write about the -S option listed in -g and I missed it or is it one you forgot to bring up? (I mean in the usage string.)

Leaving now.

UPDATE 1, 03 June 2023

You can ignore the question about the -S: I did indeed just miss it.

Also as for the grep idea see comment https://github.com/ioccc-src/mkiocccentry/issues/523#issuecomment-1574876698 for more thoughts about this.

xexyl commented 11 months ago

With commit https://github.com/ioccc-src/mkiocccentry/pull/528/commits/d2a42b637779e10ea59021ac3fb5f045330eb78f the usage string is, for the time being, complete:

--

New jprint version - 'complete' help string

After many, many, many (many, many, many!) :-) new inodes[0], the usage string is at this time complete: for certain values of complete :-) There are some things that are unclear to me but these will be ironed out with discussion on GitHub (not that it's physically on or IN GitHub :-) ) and then the usage string can be fixed.

Why a new version? Because it is significant even if there is no functional addition (usage string was always printed).

Note that not even the new options are accepted and none of the options accepted are actually parsed. This can come later once details are ironed out though I possibly will add them to the getopt() call and switch() cases - but not right now.

Required number of args is 2.

[0] To those who do not know every time you save a file it creates a new inode for the same file. If you don't understand why this is I suggest you look it up (as I'm not about to explain since I've been awake since stupid o'clock and it's still stupid o'clock and I want to try and rest soon).

--

I will discuss the issues that are unclear to me and any other thoughts I had later on .. today or tomorrow if not today: or so I hope.

Hope you're sleeping well. I'll be trying to rest soon .. I hope. Been awake since 1.

xexyl commented 11 months ago

[0] To those who do not know every time you save a file it creates a new inode for the same file. If you don't understand why this is I suggest you look it up (as I'm not about to explain since I've been awake since stupid o'clock and it's still stupid o'clock and I want to try and rest soon).

Hope you're sleeping well. I'll be trying to rest soon .. I hope. Been awake since 1.

BTW if you don't know what stupid o'clock is: it's a funny Britishism for stupid (as in extremely :-) ) early or stupid late. Unfortunately in my case extremely early but at least I did go to sleep earlier (for this reason I feared). But it was a pretty awful night too (though not as awful as the previous night) so it'll probably be a difficult day though I hope I'm wrong there. Might be okay once I wake up though obviously it'll be long.

UPDATE 0

On the other hand it's been pretty productive even in the wee hours of the morning!

xexyl commented 11 months ago

Current usage string btw (with several typo fixes though being so tired it's possible I have missed some):

$ ./jprint -h
usage: ./jprint [-h] [-V] [-v level] [-J level] [-e] [-Q] [-t type] [-q] [-l lvl] [-i count]
        [-N num] [-p {n,v,b}] [-b {t,number}] [-L] [-c] [-C] [-B] [-I {t,number}] [-j] [-E]
        [-I] [-S] [-g] file.json [name_arg ...]

    -h      Print help and exit
    -V      Print version and exit
    -v level    Verbosity level (def: 0)
    -J level    JSON verbosity level (def: 0)
    -e      Print JSON strings as encoded strings (def: decode JSON strings)
    -Q      Print JSON strings surrounded by double quotes (def: do not)
    -t type     Print only if JSON value matches one of the comma-separated
            types (def: simple):

                int     integer values
                float       floating point values
                exp     exponential notation values
                num     alias for int,float,exp
                bool        boolean values
                str     string values
                null        null values
                simple      alias for 'num,bool,str,null' (the default)
                object      JSON objects
                array       JSON array
                compound    alias for object,array
                any     any type of value

    -q      Quiet mode (def: print stuff to stdout)

    -l lvl      Print values at specific JSON levels (def: any level, '0:')
            If lvl is a number (e.g. '-l 3'), level must == number
            If lvl is a number followed by : (e.g. '-l 3:'), level must be >= number
            If lvl is a : followed by a number (e.g. '-l :3'), level must be <= number
            If lvl is num:num (e.g. '-l 3:5'), level must be inclusively in the range

    -i count    Print up to count matches (def: print all matches)
            If count is a number (e.g. '-i 3'), the matches must == number
            If count is a number followed by : (e.g. '-i 3:'), matches must be >= number
            If count is a : followed by a number (e.g. '-i :3'), matches must be <= number
            If count is num:num (e.g. '-i 3:5'), matches must be inclusively in the range
            NOTE: when number < 0 it refers to the instance from last: -1 is last, -2 second to last ...

    -N num      Print only if there are only a given number of matches (def: do not limit)
            If num is only a number (e.g. '-N 1'), there must be only that many matches
            If num is a number followed by : (e.g. '-N 3:'), there must be >= num matches
            If num is a : followed by a number (e.g. '-N :3'), there must be <= num matches
            If num is num:num (e.g. '-N 3:5'), the number of matches must be inclusively in the range

    -p {n,v,b}  print JSON key, value or both (def: print JSON values)
            if the type of value does not match the -t type specification,
            then the key, value or both are not printed.
    -p name     Alias for '-p n'.
    -p value    Alias for '-p v'.
    -p both     Alias for '-p b'.

    -b {t,number}   print between name and value (def: 1)
            print a tab or spaces (i.e. '-b 4') between the name and value.
            Use of -b {t,number} without -j or -p b has no effect.
    -b tab      Alias for '-b t'.

    -L      Print JSON levels, followed by tab (def: do not print levels).
            The root (top) of the JSON document is defined as level 0.

    -c      When printing -p both, separate name/value by a : (colon) (def: do not)
            NOTE: When -c is used with -b {t,number}, the same number of spaces or tabs
            separates the name from the : (colon) and the : (colon) from the value.

    -C      When printing JSON syntax, always print a comma after final line (def: do not).
            Use of -C without -j has no effect.

    -B      When printing JSON syntax, start with a { line and end with a } line
            Use of -B without -j has no effect.

    -I {t,number}   When printing JSON syntax, indent levels (i.e. '-I 4') (def: do not indent i.e. '-I 0')
            Indent levels by tab or spaces (i.e. '-I 4').
            Use of -I {t,number} without -j has no effect.
    -I tab      Alias for '-I t'.

    -j      Print using JSON syntax (def: do not).
            Implies '-p b -b 1 -c -e -Q -I 4 -t any'.
            Subsequent use of -b {t,number} changes the printing between JSON tokens.
            Subsequent use of -I {t,number} changes how JSON is indented.
            Subsequent use of -t type will change which JSON values are printed.
            Use of -j conflicts with use of '-p {n,v}'.

    -E      Match the JSON encoded name (def: match the JSON decoded name).
    -I      Ignore case of name (def: case matters).
    -S      Substrings are used to match (def: the full name must match).
    -g      grep-like extended regular expressions are used to match (def: name args are not regexps).
            To match from the name beginning, start name_arg with '^'.
            To match to the name end, end name_arg with '$'.
            To match the entire name, enclose name_arg between '^' and '$'.
            Use of -g conflicts with -S.

    file.json   JSON file to parse
    name_arg    JSON element to print
jprint version: 0.0.1 2023-06-03

but as I said there are some things I'm unclear about. I'll get back to you later on that. I'm going to see about resting very soon or so that's my current plan. I don't know if I can do the getopt() update today or tomorrow but I hope if not today then indeed tomorrow. However it'll only be the getopt() and switch() update as the details of the options have to be ironed out for sure. Well maybe some don't and those might be added. I might also check that the file exists and is a regular file but that's another matter entirely. I won't yet open it: just check for it being a regular file. But that can be done at a later point today or in the coming days.
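The `-l lvl` forms in the usage string above (number, `number:`, `:number`, `num:num`) all reduce to an inclusive range, so one parser could serve `-l`, `-i` and `-N`. What follows is a rough sketch under that assumption, not code from the repository: `struct num_range`, `parse_range()` and `in_range()` are hypothetical names, rejecting a lone `:` is my own choice, and the "count from the end" meaning of negative `-i` values is not handled.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inclusive range; not the real jprint data structure. */
struct num_range {
    long min;
    long max;
};

/* Convert all of s to a long; return false if s is empty or has trailing junk. */
static bool
str_to_long(const char *s, long *out)
{
    char *end = NULL;

    if (*s == '\0')
        return false;
    errno = 0;
    *out = strtol(s, &end, 10);
    return errno == 0 && *end == '\0';
}

/*
 * Parse "N", "N:", ":N" or "N:M" into an inclusive range.
 * An omitted bound becomes LONG_MIN or LONG_MAX.
 * Returns false on a malformed spec (a lone ":" is rejected by assumption).
 */
bool
parse_range(const char *spec, struct num_range *r)
{
    char buf[64];
    char *colon;

    assert(spec != NULL && r != NULL);
    if (strlen(spec) >= sizeof(buf))
        return false;
    strcpy(buf, spec);

    colon = strchr(buf, ':');
    if (colon == NULL) {
        /* "N": value must == N */
        if (!str_to_long(buf, &r->min))
            return false;
        r->max = r->min;
        return true;
    }
    *colon = '\0';  /* split into the text before and after the ':' */
    if (buf[0] == '\0' && colon[1] == '\0')
        return false;
    r->min = LONG_MIN;
    r->max = LONG_MAX;
    if (buf[0] != '\0' && !str_to_long(buf, &r->min))
        return false;
    if (colon[1] != '\0' && !str_to_long(colon + 1, &r->max))
        return false;
    return r->min <= r->max;
}

/* True if value lies within the inclusive range. */
bool
in_range(const struct num_range *r, long value)
{
    return value >= r->min && value <= r->max;
}
```

With this shape, `-l 3:5` would parse once into a range and each JSON node's level would be tested with `in_range()` while walking the tree.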

xexyl commented 11 months ago

With commit https://github.com/ioccc-src/mkiocccentry/pull/531/commits/79734f163c6e4a3ebb565c0235dcbdcec48fabba in pull request https://github.com/ioccc-src/mkiocccentry/pull/531 I have made what I feel to be tremendous progress and what will allow us to work on the parsing of options and then immediately add code (once the details are discussed) to search for matches! I forgot in the log that I updated the jparse/README.md and the CHANGES.md file but the commit log is:

--

New jprint version - test and report valid JSON

New number of required args is 1: the file to check. It was an error to require a pattern to match according to the spec and usage string as well.

The jprint tool now checks that the first arg exists as a file, is a regular file and that it can be opened for reading. If any of these are not true it is an error. Otherwise the file is checked for valid JSON.

Once file is parsed (or we attempt to parse) as JSON the file is closed and the file pointer is set to NULL. If invalid JSON print invalid and exit. If valid it depends on whether another arg is specified or not. If another arg is specified we will show the pattern requested (this will not be done later on) and exit 0; otherwise we will print that no pattern is requested (this will also not be done later) and exit 1.

Besides the initial parsing of options (to the extent that we have details) we can do no more here until more details are ironed out.

TODO discuss option details and parsing, add code to parse and then refine anything necessary. After this we can write the rest of the code, the pattern matching. We also need to decide if more than one name_arg can be specified. I'm mixed on this one: with grep -E one can do this so I think it would be wise to do this but this will be determined at a later date.

--

We can add more code once the details are discussed but now the only thing to be done is parsing of options and then processing the options and the tree if valid!
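For illustration, the three checks described in the commit log (exists, is a regular file, can be opened for reading) could look roughly like the sketch below. This is not the code from the commit: `open_json_file()` is a hypothetical name and the diagnostic wording is made up; only stat(2), `S_ISREG()` and fopen(3) are real.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Open path for reading after checking that it exists and is a regular file.
 * Returns an open FILE * on success, or NULL after printing a diagnostic.
 */
FILE *
open_json_file(const char *path)
{
    struct stat sb;
    FILE *fp;

    assert(path != NULL);
    if (stat(path, &sb) != 0) {
        fprintf(stderr, "%s: cannot stat: %s\n", path, strerror(errno));
        return NULL;
    }
    if (!S_ISREG(sb.st_mode)) {
        fprintf(stderr, "%s: not a regular file\n", path);
        return NULL;
    }
    fp = fopen(path, "r");
    if (fp == NULL) {
        fprintf(stderr, "%s: cannot open for reading: %s\n", path, strerror(errno));
        return NULL;
    }
    return fp;
}
```

The returned stream would be handed to the JSON parser and then closed, with the pointer set to NULL afterwards as the commit log describes.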

xexyl commented 11 months ago

QUESTION about name_arg: do we want to allow more than one ?

I personally am in favour of this as it is kind of like grep and with grep -E this is possible. We could have a loop (for example) that does the same checks on each pattern.

Do we want a way to specify that ALL patterns have to be matched in the requested restrictions ?

Although the latter would complicate the tool it would make it more useful still I think.

What do you think ?

I'm going to do other things now and in particular I'm going to try and rest. I hope to later today discuss the usage string more but if not I feel great about what I did! Thus if I don't do anything else I think that's more than fine.

UPDATE 0

I see that it is supposed to allow more than one pattern but what about the additional feature I mentioned, an AND operator?
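To make the OR versus AND question concrete, here is one way the loop over several name_args might be sketched. It is only an illustration of the idea: `pattern_found()` and `patterns_match()` are hypothetical names, the names are a flat array rather than a parsed JSON tree, and matching is plain strcmp(3) with no options applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if pattern equals one of the n names (plain string comparison). */
static bool
pattern_found(const char *pattern, const char *const *names, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(pattern, names[i]) == 0)
            return true;
    }
    return false;
}

/*
 * Check each of the npat patterns against the names.
 * With require_all true, every pattern must be found (the AND idea);
 * otherwise one found pattern is enough.
 */
bool
patterns_match(const char *const *patterns, size_t npat,
               const char *const *names, size_t n, bool require_all)
{
    assert(patterns != NULL && names != NULL);
    for (size_t i = 0; i < npat; ++i) {
        bool found = pattern_found(patterns[i], names, n);

        if (found && !require_all)
            return true;    /* OR: first hit decides */
        if (!found && require_all)
            return false;   /* AND: first miss decides */
    }
    /* AND with no misses is true; OR with no hits is false */
    return require_all;
}
```

The same per-pattern check runs in both modes; only the point at which the loop can stop early differs, which suggests an AND operator would add little complexity to the matching loop itself.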

xexyl commented 11 months ago

As for ...

-g    grep-like extended regular expressions are used to match (def: name args are not regex)
          To match from the name beginning, start name_arg with ^
          To match to the name end, end name_arg with $
          To match the entire name, enclose name_arg between ^ and $
          Use of -g conflicts with -S

Do we want to make use of the regular expression library ? I mean if we're going to do the line boundaries should we go a bit further?

An argument I thought for doing this is that it won't require (I think) any external libs and it might simplify the pattern matching code as well as allowing for the word boundaries that we are going to support and without any special code.

We also could get rid of both -g and -S.

Thoughts ?
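As a rough illustration of the point about not needing external libraries: POSIX provides regcomp(3) and regexec(3) in `<regex.h>` as part of the C library. The sketch below is not jprint code; `name_matches()` is a hypothetical helper showing how an extended regular expression could cover anchoring, substring matching and case-insensitive matching in one call.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Match name against a POSIX extended regular expression.
 * Returns 1 on match, 0 on no match, -1 if the pattern does not compile.
 */
int
name_matches(const char *pattern, const char *name, bool ignore_case)
{
    regex_t re;
    int flags = REG_EXTENDED | REG_NOSUB;   /* ERE syntax, no capture groups needed */
    int ret;

    assert(pattern != NULL && name != NULL);
    if (ignore_case)
        flags |= REG_ICASE;
    if (regcomp(&re, pattern, flags) != 0)
        return -1;
    ret = regexec(&re, name, 0, NULL, 0);
    regfree(&re);
    return ret == 0 ? 1 : 0;
}
```

An unanchored pattern already behaves like a substring match, and `^name$` gives a full match, which is why a regex-only design might make separate -g and -S handling unnecessary. One caveat: a name_arg containing regex metacharacters would then need escaping to be matched literally.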

xexyl commented 11 months ago

the --report all option of bison flags - do we need it?

Do we want to still have this? It was useful when we were developing the parser but is it now? If nothing else perhaps we should disable it prior to the public review? Thoughts ?

xexyl commented 11 months ago

man page to be done

For the following reasons I am going to wait on this until more is done with discussion.

  1. I believe it's better to wait as things might change.
  2. I also am quite dreading it: it's a lot to document!

I did add it briefly to the jparse README.md however but only VERY briefly.

lcn2 commented 11 months ago

Regarding comment 1563367357: The top level comment has an updated command line as per a number of your suggestions. Comments, questions and corrections are, of course, welcome.

Looks great! My suggestions (which I started doing but am too tired to finish .. been awake since half past 1 and I hope to sleep after this or at least rest):

  • the list of options and explanations are so long so maybe it should just give the summary of each and refer to the to-be-written man page? I'm not sure: some commands do have longer output of help. ls in linux is a great example. Not sure: what do you think here? Certainly examples will be needed in the man page though...

Cute jokes about the historic ls(1) man page aside, the top level comment is NOT a help message, nor is it a jprint(1) man page.

The top level comment was written to help explain to the coder 🤓, use cases and requested function for the jprint(1) command.

The actual help message can be more compact than the top level comment.

  • Since the list will already be so long I suggest the exit codes are not in the help string but rather in the man page once it's been written.

We disagree, as the exit codes are critical to the -q option. BTW: The exit codes mentioned in the top level comment can be shortened by combining exit codes 4 thru 6 that were mentioned into a single exit code.

lcn2 commented 11 months ago

TODO list

I think we could do with a todo list here. I'll start it but you can reword or expand or shrink it as you see fit.

discuss thoroughly all options and other details

  • [ ] discuss thoroughly all options
  • [ ] write usage string (working on this but this does not necessarily mean that the first task is done! - this usage might help modify the above some)
  • [ ] discuss details of the tool itself.
  • [ ] once synopsis is figured out and details we can START the man page: this will be done in steps I think. For the beginning just have up to the description (what we have) and maybe exit codes as well. We can add examples, notes, bugs later.
  • [ ] Write code to parse options. I can at first put in empty functions that take the arg string(s) but do nothing until more is decided upon.
  • [ ] Write the rest of the code
  • [ ] Finish the man page.
  • [ ] Test - by writing a test suite and adding it to the jparse test script.
  • [ ] Run bug_report.sh (making sure it runs the new tool test suite) on a variety of systems.
  • [ ] Update version to 1.0.0 ? Not sure on this ...

Somewhere in there we might want to update the CHANGES.md file too.

And now for a rest or break of some kind: probably not a rest I'm afraid .. can feel my body won't let me but I'll have to have a nap later on.

Good ideas.

xexyl commented 11 months ago

Regarding comment 1563367357: The top level comment has an updated command line as per a number of your suggestions. Comments, questions and corrections are, of course, welcome.

Looks great! My suggestions (which I started doing but am too tired to finish .. been awake since half past 1 and I hope to sleep after this or at least rest):

  • the list of options and explanations are so long so maybe it should just give the summary of each and refer to the to-be-written man page? I'm not sure: some commands do have longer output of help. ls in linux is a great example. Not sure: what do you think here? Certainly examples will be needed in the man page though...

Cute jokes about the historic ls(1) man page aside, the top level comment is NOT a help message, nor is it a jprint(1) man page.

The top level comment was written to help explain to the coder 🤓, use cases and requested function for the jprint(1) command.

Sure. The actual help message can be more compact than the top level comment.

Good - see below.

  • Since the list will already be so long I suggest the exit codes are not in the help string but rather in the man page once it's been written.

We disagree, as the exit codes are critical to the -q option. BTW: The exit codes mentioned in the top level comment can be shortened by combining exit codes 4 thru 6 that were mentioned into a single exit code.

Sure. I can add that when I take care of the rest.

I agree with shortening it but it can come another time - hopefully tomorrow. When I do that I'll add the exit codes.

lcn2 commented 11 months ago

New jprint version - test and report valid JSON

The "and report valid JSON" is not a function of jprint, it is a side effect of having to parse the JSON in order to print JSON stuff.

No more than cat(1) is a "concatenate and print files and test if files exist". :-)

xexyl commented 11 months ago

New jprint version - test and report valid JSON

The "and report valid JSON" is not a function of jprint, it is a side effect of having to parse the JSON in order to print JSON stuff.

No more than cat(1) is a "concatenate and print files and test if files exist". :-)

Sure but the message there is temporary.

xexyl commented 11 months ago

Of course I had only the top level comment to go by and was unsure, so I felt like it would be fine to put the long string in. It will be nicer once it's shorter. I also think that the top level comment can be used to make the man page in part even though it's of course not the man page.

By this I mean that alias descriptions would be useful to have in the man page but might not be in the usage string.

xexyl commented 11 months ago

New jprint version - test and report valid JSON

The "and report valid JSON" is not a function of jprint, it is a side effect of having to parse the JSON in order to print JSON stuff. No more than cat(1) is a "concatenate and print files and test if files exist". :-)

Sure but the message there is temporary.

Well as I put in an XXX maybe it will be a debug call if desired. But it was only put there now for informative purposes as there's nothing else to do until more discussion.

It's a useful thing for now but of course it won't be necessary later on.

UPDATE 0

Have to leave for a bit .. though I might try napping too at some point. I'll have the laptop out a bit longer today but I won't be doing more here - at least not commit wise. Maybe I can reply to comments here and there but that's not certain.

lcn2 commented 11 months ago

The jprint tool now checks that the first arg exists as a file, is a regular file and that it can be opened for reading. If any of these are not true it is an error. Otherwise the file is checked for valid JSON.

The "is a regular file and that it can be opened for reading" check is a side effect of having to open the JSON file in order to parse it.

Of course, the - (read from stdin) does none of that.

That jprint(1) should call access(2) (unless reading from stdin) before calling the jparse(3) function is a nice thing to do .. it provides a more controlled / useful diagnostic on the file.json argument.

xexyl commented 11 months ago

The jprint tool now checks that the first arg exists as a file, is a regular file and that it can be opened for reading. If any of these are not true it is an error. Otherwise the file is checked for valid JSON.

The "is a regular file and that it can be opened for reading" check is a side effect of having to open the JSON file in order to parse it.

Of course, the - (read from stdin) does none of that.

That jprint(1) should call access(2) (unless reading from stdin) before calling the jparse(3) function is a nice thing to do .. it provides a more controlled / useful diagnostic on the file.json argument.

I'll do that in a future commit then. Do you have a message you'd like it to show?

xexyl commented 11 months ago

Commit https://github.com/ioccc-src/mkiocccentry/pull/533/commits/c30cfab99dca5bacedae22ccde55697ce6b54952 is an attempt to clarify the use of the tool for the time being - until it can be further fleshed out.

For now it does print the valid / invalid and pattern requested / not requested but these will be removed once more is done. I believe comments and the text in the README.md should help clarify the use of the tool at least a little bit.

lcn2 commented 11 months ago

QUESTION about name_arg: do we want to allow more than one ?

I personally am in favour of this as it is kind of like grep and with grep -E this is possible. We could have a loop (for example) that does the same checks on each pattern.

Do we want a way to specify that ALL patterns have to be matched in the requested restrictions ?

Although the latter would complicate the tool it would make it more useful still I think.

What do you think ?

Interesting question. We tried to figure out how to do the "foo" above "bar" above "baz" syntax.

Complications in the name_arg parsing to do something like "foo:bar:baz" became awkward when a JSON name can have a : inside it:

{ "name:stuff" : true }

The question might be: what if you wanted to print values of both "total_guests" and "guest_number"? Is allowing that use case in the command line worth the complication? Perhaps not. Perhaps just running the command twice would be good enough?

jprint jparse/test_jparse/test_JSON/good/party.json total_guests
jprint jparse/test_jparse/test_JSON/good/party.json guest_number

If you have a better way of asking for birthdays -> hobbit -> age on the command line, feel free to suggest it.

Given the "run the command twice" idea mentioned above, this seemed reasonable without trying to invent some ":" separator:

jprint jparse/test_jparse/test_JSON/good/party.json birthdays hobbit age
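
To make the ":" problem concrete, here is a minimal sketch in C. The helper name count_components() is hypothetical and is not part of jprint or jparse; it only counts how many path components a naive split on a separator would produce.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of jprint): count how many components
 * a name_arg would be split into if `sep` were used as a path separator.
 */
size_t count_components(const char *name_arg, char sep)
{
    size_t n = 1;   /* a string with no separator is one component */

    for (const char *p = strchr(name_arg, sep); p != NULL; p = strchr(p + 1, sep)) {
        ++n;
    }
    return n;
}
```

With a ":" separator, "birthdays:hobbit:age" splits into 3 components as intended, but the single JSON name "name:stuff" is wrongly split into 2. Passing each level as its own command line argument sidesteps this, because argv already keeps the names apart.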

xexyl commented 11 months ago

QUESTION about name_arg: do we want to allow more than one ?

I personally am in favour of this as it is kind of like grep and with grep -E this is possible. We could have a loop (for example) that does the same checks on each pattern. Do we want a way to specify that ALL patterns have to be matched in the requested restrictions ? Although the latter would complicate the tool it would make it more useful still I think. What do you think ?

Interesting question. We tried to figure out how to do the "foo" above "bar" above "baz" syntax.

Complications in the name_arg parsing to do something like "foo:bar:baz" became awkward when a JSON name can have a : inside it:

{ "name:stuff" : true }

The question might be: what if you wanted to print values of both "total_guests" and "guest_number"? Is allowing that use case in the command line worth the complication? Perhaps not. Perhaps just running the command twice would be good enough?

jprint jparse/test_jparse/test_JSON/good/party.json total_guests
jprint jparse/test_jparse/test_JSON/good/party.json guest_number

If you have a better way of asking for birthdays -> hobbit -> age on the command line, feel free to suggest it.

Given the "run the command twice" idea mentioned above, this seemed reasonable without trying to invent some ":" separator:

jprint jparse/test_jparse/test_JSON/good/party.json birthdays hobbit age

Nothing comes to mind, no. It's an interesting and maybe useful idea but only if some syntax can be devised. Nothing comes to my mind at this time at least.

Should there be a count option similar to grep -c ? Maybe there is and I missed it.

Well really going afk a bit ... can reply more later or tomorrow.

lcn2 commented 11 months ago

As for ...

-g  grep-like extended regular expressions are used to match (def: name args are not regex)
            To match from the name beginning, start name_arg with ^
            To match to the name end, end name_arg with $
            To match the entire name, enclose name_arg between ^ and $
            Use of -g conflicts with -S

Do we want to make use of the regular expression library ? I mean if we're going to do the line boundaries should we go a bit further?

An argument I thought of for doing this is that it won't require (I think) any external libs and it might simplify the pattern matching code as well as allowing for the word boundaries that we are going to support and without any special code.

We also could get rid of both -g and -S.

Thoughts ?

The regex(3) API is a standard part of libc, so -g should be doable.

The strstr(3) and strcasestr(3) functions, needed to support -S and -I, are also in libc.

And the use cases for both -g and -S are important.
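
As a rough illustration of how the three matching styles could sit side by side, here is a minimal sketch. The names match_mode, substr_icase() and name_matches() are hypothetical and not taken from the jprint source. It assumes -g means a POSIX extended regex via regex(3), -S means substring matching, and -I means ignore case. Because strcasestr(3) is an extension rather than standard C, the sketch uses a small portable loop in its place.

```c
#include <ctype.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical matching modes: default exact match, -S substring, -g regex. */
enum match_mode { MATCH_EXACT, MATCH_SUBSTR, MATCH_REGEX };

/* Portable stand-in for strcasestr(3): true if needle occurs in hay, ignoring case. */
static bool substr_icase(const char *hay, const char *needle)
{
    size_t nlen = strlen(needle);

    if (nlen == 0) {
        return true;
    }
    for (; *hay != '\0'; ++hay) {
        size_t i = 0;

        while (i < nlen && hay[i] != '\0' &&
               tolower((unsigned char)hay[i]) == tolower((unsigned char)needle[i])) {
            ++i;
        }
        if (i == nlen) {
            return true;
        }
    }
    return false;
}

/* Return true if the JSON member name matches name_arg under the given mode. */
bool name_matches(const char *name, const char *name_arg, enum match_mode mode, bool ignore_case)
{
    if (mode == MATCH_REGEX) {
        regex_t re;
        int flags = REG_EXTENDED | REG_NOSUB | (ignore_case ? REG_ICASE : 0);
        bool hit;

        if (regcomp(&re, name_arg, flags) != 0) {
            return false;   /* a real tool would report the bad pattern */
        }
        hit = (regexec(&re, name, 0, NULL, 0) == 0);
        regfree(&re);
        return hit;
    }
    if (mode == MATCH_SUBSTR) {
        return ignore_case ? substr_icase(name, name_arg) : strstr(name, name_arg) != NULL;
    }
    /* MATCH_EXACT: the whole name must match */
    if (ignore_case) {
        return strlen(name) == strlen(name_arg) && substr_icase(name, name_arg);
    }
    return strcmp(name, name_arg) == 0;
}
```

With this shape, the ^ and $ anchors described for -g come for free from regex(3), while the -S path stays a plain strstr(3) call.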

lcn2 commented 11 months ago

the --report all option of bison flags - do we need it?

Do we want to still have this? It was useful when we were developing the parser but is it now? If nothing else perhaps we should disable it prior to the public review? Thoughts ?

These are not important at this point.

xexyl commented 11 months ago

As for ...

-g    grep-like extended regular expressions are used to match (def: name args are not regex)
          To match from the name beginning, start name_arg with ^
          To match to the name end, end name_arg with $
          To match the entire name, enclose name_arg between ^ and $
          Use of -g conflicts with -S

Do we want to make use of the regular expression library ? I mean if we're going to do the line boundaries should we go a bit further?

An argument I thought of for doing this is that it won't require (I think) any external libs and it might simplify the pattern matching code as well as allowing for the word boundaries that we are going to support and without any special code. We also could get rid of both -g and -S. Thoughts ?

The regex(3) API is a standard part of libc, so -g should be doable.

The strstr(3) and strcasestr(3) functions, needed to support -S and -I, are also in libc.

And the use cases for both -g and -S are important.

Right. So the question is should we use regex(3)? That was what I was getting at.

Certainly the other functions are useful of course too.

xexyl commented 11 months ago

the --report all option of bison flags - do we need it?

Do we want to still have this? It was useful when we were developing the parser but is it now? If nothing else perhaps we should disable it prior to the public review? Thoughts ?

These are not important at this point.

Thanks.

lcn2 commented 11 months ago

The jprint tool now checks that the first arg exists as a file, is a regular file and that it can be opened for reading. If any of these are not true it is an error. Otherwise the file is checked for valid JSON.

The "is a regular file and that it can be opened for reading" check is a side effect of having to open the JSON file in order to parse it. Of course, the - (read from stdin) does none of that. That jprint(1) should call access(2) (unless reading from stdin) before calling the jparse(3) function is a nice thing to do .. it provides a more controlled / useful diagnostic on the file.json argument.

I'll do that in a future commit then. Do you have a message you'd like it to show?

If one were to reduce the exit codes to:

       0      all is OK, file is valid JSON, match(s) found or no name_arg given
       1      file is valid JSON, name_arg given but no matches found
       2      -h and help string printed or -V and version string printed
       3      invalid command line, invalid option or option missing an argument
       4      file does not exist, not a file, or unable to read the file
       5      file contents is not valid JSON
       >= 10  internal error

Then one could reuse the same exit code for pre-checks on trying to open a file before calling parse_json(3).

Thus one could make calls such as:

errp(4, "failed to open %s", filename);

The exact format of the stderr message isn't that critical: just use the err(3) or errp(3) interface as needed.

You will need to decide if you want to pre-load the file into memory and call the parse_json(3) function .. or use the parse_json_stream(3) and parse_json_file(3) interfaces.

Regardless, we do recommend you call stat(2) and access(2) as pre-checks prior to calling the parser. This will allow the jprint(1) tool to better control the exit codes and to provide the user with informative error messages.
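
A minimal sketch of such a pre-check follows, assuming the reduced exit codes listed above. The enum names and the precheck_json_file() function are hypothetical. The sketch prints with fprintf(3) only to stay self-contained; the real tool would use the err(3) or errp(3) interface instead.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical names for the reduced exit codes proposed in this thread. */
enum jprint_exit {
    JPRINT_OK = 0,          /* valid JSON, match(s) found or no name_arg given */
    JPRINT_NO_MATCH = 1,    /* valid JSON, name_arg given but no matches found */
    JPRINT_HELP = 2,        /* -h or -V */
    JPRINT_USAGE = 3,       /* invalid command line */
    JPRINT_FILE_ERR = 4,    /* file missing, not a regular file, or unreadable */
    JPRINT_BAD_JSON = 5     /* file contents is not valid JSON */
};

/*
 * Pre-check the file.json argument before handing it to the parser.
 * Returns JPRINT_OK if parsing may proceed, JPRINT_FILE_ERR otherwise.
 */
int precheck_json_file(const char *path)
{
    struct stat sb;

    if (strcmp(path, "-") == 0) {
        return JPRINT_OK;   /* stdin: none of the checks apply */
    }
    if (stat(path, &sb) != 0) {
        fprintf(stderr, "jprint: cannot stat %s: %s\n", path, strerror(errno));
        return JPRINT_FILE_ERR;
    }
    if (!S_ISREG(sb.st_mode)) {
        fprintf(stderr, "jprint: %s is not a regular file\n", path);
        return JPRINT_FILE_ERR;
    }
    if (access(path, R_OK) != 0) {
        fprintf(stderr, "jprint: cannot read %s: %s\n", path, strerror(errno));
        return JPRINT_FILE_ERR;
    }
    return JPRINT_OK;
}
```

All three failures map to exit code 4, so a caller can simply exit with the returned value when it is not JPRINT_OK.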

lcn2 commented 11 months ago

Should there be a count option similar to grep -c ? Maybe there is and I missed it.

That is a good idea. Please add it.

xexyl commented 11 months ago

Should there be a count option similar to grep -c ? Maybe there is and I missed it.

That is a good idea. Please add it.

Recommended option letter ?

Perhaps -0 (zero) ?

xexyl commented 11 months ago

The jprint tool now checks that the first arg exists as a file, is a regular file and that it can be opened for reading. If any of these are not true it is an error. Otherwise the file is checked for valid JSON.

The "is a regular file and that it can be opened for reading" check is a side effect of having to open the JSON file in order to parse it. Of course, the - (read from stdin) does none of that. That jprint(1) should call access(2) (unless reading from stdin) before calling the jparse(3) function is a nice thing to do .. it provides a more controlled / useful diagnostic on the file.json argument.

I'll do that in a future commit then. Do you have a message you'd like it to show?

If one were to reduce the exit codes to:

       0      all is OK, file is valid JSON, match(s) found or no name_arg given
       1      file is valid JSON, name_arg given but no matches found
       2      -h and help string printed or -V and version string printed
       3      invalid command line, invalid option or option missing an argument
       4      file does not exist, not a file, or unable to read the file
       5      file contents is not valid JSON
       >= 10  internal error

Then one could reuse the same exit code for pre-checks on trying to open a file before calling parse_json(3).

Thus one could make calls such as:

errp(4, "failed to open %s", filename);

The exact format of the stderr message isn't that critical: just use the err(3) or errp(3) interface as needed.

You will need to decide if you want to pre-load the file into memory and call the parse_json(3) function .. or use the parse_json_stream(3) and parse_json_file(3) interfaces.

Regardless, we do recommend you call stat(2) and access(2) as pre-checks prior to calling the parser. This will allow the jprint(1) tool to better control the exit codes and to provide the user with informative error messages.

I agree with this. I'll do it. If I have any other thoughts or questions with regards to any of this I'll ask at that point.

xexyl commented 11 months ago

The jprint tool now checks that the first arg exists as a file, is a regular file and that it can be opened for reading. If any of these are not true it is an error. Otherwise the file is checked for valid JSON.

The "is a regular file and that it can be opened for reading" check is a side effect of having to open the JSON file in order to parse it. Of course, the - (read from stdin) does none of that. That jprint(1) should call access(2) (unless reading from stdin) before calling the jparse(3) function is a nice thing to do .. it provides a more controlled / useful diagnostic on the file.json argument.

I'll do that in a future commit then. Do you have a message you'd like it to show?

If one were to reduce the exit codes to:

       0      all is OK, file is valid JSON, match(s) found or no name_arg given
       1      file is valid JSON, name_arg given but no matches found
       2      -h and help string printed or -V and version string printed
       3      invalid command line, invalid option or option missing an argument
       4      file does not exist, not a file, or unable to read the file
       5      file contents is not valid JSON
       >= 10  internal error

Then one could reuse the same exit code for pre-checks on trying to open a file before calling parse_json(3). Thus one could make calls such as:

errp(4, "failed to open %s", filename);

The exact format of the stderr message isn't that critical: just use the err(3) or errp(3) interface as needed. You will need to decide if you want to pre-load the file into memory and call the parse_json(3) function .. or use the parse_json_stream(3) and parse_json_file(3) interfaces. Regardless, we do recommend you call stat(2) and access(2) as pre-checks prior to calling the parser. This will allow the jprint(1) tool to better control the exit codes and to provide the user with informative error messages.

I agree with this. I'll do it. If I have any other thoughts or questions with regards to any of this I'll ask at that point.

Though I wonder: is there a purpose in calling stat() when the exists() function already uses stat() ?

Well I'll look at it hopefully tomorrow and it might be clearer then.

lcn2 commented 11 months ago

Should there be a count option similar to grep -c ? Maybe there is and I missed it.

That is a good idea. Please add it.

Recommended option letter ?

Perhaps -0 (zero) ?

One could rename the proposed -c and -C to different letters and use -c for count as well.