jeffreykegler / yahc

Yet Another Hoon Compiler
MIT License
0 stars 1 forks source link

Wuthep suppression candidates #4

Closed jeffreykegler closed 5 years ago

jeffreykegler commented 5 years ago

@ohAitch I have a new list of the WUTHEP (?-) anomalies, with improved diagnostics.
wuthep.lint.txt

All of these are, I think, candidates for suppression.

Where it is ambiguous (and these are often anomalies because of sidedness inconsistences) the chess-sidedness is decided based on the jogging twig. The sidedness of the jogging twig is that of the majority of its jogs. A jog is kingside if it is a single-line and has kingside indentation, or is underindented for a kingside job. Otherwise a jog is queenside.

This notion of sidedess is more intuitive than the previous attempt, but short of mind-reading, nothing will always match the intuition of what was intended by erroneous indentation.

For the 2nd child of flat jogs, the intended indentation is taken to be the most common one. If there is a tie, the first to occur in the jogging is used. For very long jogging's, it is sometimes hard to see by eye what the most common indentation is, which is why I suppose quite a number of anomalies appear in the indentation of the 2nd twigs of flat jogs.

This replaces issues #2 and #3 , which I will close.

ohAitch commented 5 years ago

The "misindented child 1" lints look great!

Notes on others:

Some of the specific file styles:

jeffreykegler commented 5 years ago

I have spun the matter of combining "nearby" lint contexts off into its own issue, issue #5

jeffreykegler commented 5 years ago

I've changed hoonlint to allow multiple suppression files, for the next phase of attack on these. I propose to take three approaches, depending on the issue:

  1. Fix hoonlint to accept the behavior. It looks like the 2nd and 3rd bulleted items should be handled this way.
  2. Suppress them, for tracking purposes, until we file PR's against the arvo code. I'll track these suppressions in a "mistakes.suppressions" file. Looks like zuse.hoon is an example of this.
  3. Suppress them either permanently or until "later". It looks like the app/static issues are examples of these. The "arvo.suppressions" file will be used for these.

Someday we will want lint options, severity levels, etc., but until it's clearer what these should be the suppression files will be my way of sorting "sheep from goats".

jeffreykegler commented 5 years ago

@ohAitch I just tested -- if a single queenside jog causes the whole jogging to be assumed to be intended as queenside, it causes 1000s of additional lint complaints more than it eliminates -- in fact it more than double the total of lint complaints in the arvo/ corpus. I looked at this by census, and that's why I went for counting kingside vs. queenside jobs, and going with the majority.

ohAitch commented 5 years ago
  1. sorry I should have numbered those; the non-file-specific ones?
  2. mistakes.suppressions is one of the high-value outputs of the project, indeed!
  3. app/static should just be "mistakes" wrt being underindented, but e.g. "seaside" style qualifies yeah.

Wrt "intended as queenside", I'm having some trouble following and/or communicating this so I'm going to split up some terminology:

As a more concrete test, does hoonlint recognize the example I provided as valid, but lint

the following variations: (not necessarily on the same grounds as commented; ideally correcting to the example, at least in non-`?` cases) ``` :: underindented preamble ?- a %b /b %cd /cd %e /e/le/ments == :: overindented preamble ?- a %b /b %cd /cd %e /e/le/ments == :: underindented preamble, overindented kingside jog head (>2) ?- a %b /b %cd /cd %e /e/le/ments == :: overindented preamble, overindented kingside jog head (>2) ?- a %b /b %cd /cd %e /e/le/ments == :: underindented kingside jog head (<1) ?- a %b /b %cd /cd %e /e/le/ments == :: underindented kingside jog head (inconsistent) ?- a %b /b %cd /cd %e /e/le/ments == :: underindented kingside jog head (inconsistent) ?- a %b /b %cd /cd %e /e/le/ments == :: 2x underindented kingside jog head (inconsistent) ?- a %b /b %cd /cd %e /e/le/ments == :: overindented kingside jog head ?- a %b /b %cd /cd %e /e/le/ments == :: underindented ???side jog head (inconsistent) ?- a %b /b %cd /cd %e /e/le/ments == :: overindented queenside jog head, "underindented" queenside jog body ?- a %b /b %cd /cd %e /e/le/ments == :: overindented queenside jog head ?- a %b /b %cd /cd %e /e/le/ments == :: under?indented queen?side jog head ?- a %b /b %cd /cd %e /e/le/ments == :: underindented queenside jog head (<2) ?- a %b /b %cd /cd %e /e/le/ments == :: underindented king/sea side jog body ?- a %b /b %cd /cd %e /e/le/ments == :: overindented kingside jogs ?- a %b /b %cd /cd %e /e/le/ments == ```


More pessimistically, all of the above may be redundant, your previous comment being "I tried a variant of yahc that implemented such, unfortunately while the proposed standard is nice it bears little resemblance to how mixed jogging is actually used in the codebase". In this case, abstracting out the 1000s of lints, could I get a summary of which ?- runes are treated as containing misindented jogs under the "majority" scheme but not the "indent to deepest" one, vice versa, which are agreed to be bad by both? The same sets computed if you ignore a single(presumably queenside) non-majority indented jog?

jeffreykegler commented 5 years ago

@ohAitch This last is helpful. Over the next day or 2, I'll make sure that hoonlint produces messages consistent with what you've described above.

One problem we've been having is that I have been treating kingside/queenside as a binary property of the whole jogging and its parent hoon, and have been treating mixed joggings as necessarily misformed, and requiring conversion to one or the other. This may have a lot to do with the 1000s of new lint messages.

Btw, is "twig" a better word for one of the two jog components than "child"?

ohAitch commented 5 years ago

Yeah, the expansion here was because I realized I might have been unclear earlier with "mixed queenside", with which I had meant to refer to the pattern of "heads indented by 2 as in queenside, but most of the individual jogs are kingside" present in a bunch of places :)

(also I linked the 2019 style guide, which does treat them as a binary property; perhaps useful for grade-A "space-ready" code, but see my "doesn't acknowledge" comment)

Twig is probably better than child, "head and body" of jogs is even more descriptive; I've been struggling for a less awkward phrasing than "1st child" for the part of a jogging rune that's not in a jog, I settled on "preamble" in the test cases but that's still a bit unwieldy.

ohAitch commented 5 years ago

(like, "target" probably makes sense for all of ?- ?+ %= %_ %*, which I think are the only jogging runes, but it's less inherently positional which is useful when discussing indentation)

ohAitch commented 5 years ago

Actually, for the % ones "target" better describes the jog heads, "source" maybe? "Referent" might be suitably generic, "thing we're pattern-matching or replacing parts of"

jeffreykegler commented 5 years ago

@ohAitch I have worked up an approach to jogging indentation. It correctly spots all but one of the errors in your test file and, as I shall describe, I believe the one omission is reasonable. It handles the hoon examples and the arvo/ corpus well, and I have created a formal document describing it. In the document, I departed from your terminology in some respects, for reasons I will describe in a followup comment.

jeffreykegler commented 5 years ago

Re terminology: I used "head" instead of preamble for the 1st child of WUTHEP. "Preamble" is a big word and big words are OK, but in this case it didn't really apply well -- the 1st child isn't a "preamble" in any of the usual senses. Downside is we need to be careful to distinguish between the head of a jogging hoon, and a jog head.

I needed a definition of chess-sidedness for joggings and jogging, otherwise there is no way to describe why the jogging hoon head is indented 1 stop in one case, but 2 cases in others. Intuitively, this is because its parent jogging hoon is kingside/queenside.

To do this, I pulled apart the various ideas behind kingside/queenside and put them back together so that chess-sidedness depends on the "side of the board" toward which the jogs gravitate, and has nothing to do with "flat" (one-line) or "split" (multi-line). I hope you will find that this reanalysis of the idea works well.

jeffreykegler commented 5 years ago

A formal document describing the various definitions is here. It may be easier to read it after the discussion in the next few comments of 3 of your examples. A lint output for your file of jogging examples is issue4.lint.txt

jeffreykegler commented 5 years ago

Your first jogging example is easy:

::  underindented preamble
?-  a
    %b  /b
    %cd  /cd
    %e
  /e/le/ments
==

All three jogs are indented 2 stops more than the rune, so the jogging is queenside and the jogging hoon's head must also be queenside. So, yes it is underindented. hoonlint: issue4.hoon 2:1 indent tallWuthep 1-jogging queenside head @2:5; underindented by 2

jeffreykegler commented 5 years ago

In the jogging example starting on line 16, I diagnose the same errors, but use different terminology to describe them:

::  underindented preamble, overindented kingside jog head (>2)
?-  a
      %b  /b
    %cd  /cd
    %e
  /e/le/ments
==

All 3 jogs are indented at least 2 stops, so the jogging is considered queenside. Hoonlint finds the same 2 errors:

== issue4.hoon 16:1 indent tallWuthep 1-jogging queenside head @16:5; underindented by 2
== issue4.hoon 17:7 indent tallWuthep Jog queenside head @17:7; overindented by 2

but its messages differ because hoonlint follows the actual indentation or physical sidedness in determining king vs. queenside.

jeffreykegler commented 5 years ago

My new approach finds all the errors and (modulo my terminology) diagnoses them the same, with one exception, your last example:

:: overindented kingside jogs
?-    a
    %b  /b
    %cd  /cd
    %e  /e/le/ments
==

According to hoonlint, this is a queenside jogging hoon with 3 flat queenside jogs. Hoonlint considers this queenside because all the jogs are indented 2 stops beyond the rune. Hoonlint finds no problems to warn about in this code.

You might want to call out a queenside jogging hoon with only flat jogs with a lint warning, but here's a reason not to do so. You may be laying out a very long jogging, one which you have decided will be queenside, and which you indent according. But perhaps the 1st dozen or two of the 100 or so jogs are flat. It would be a nuisance if you have to fight hoonlint and hoonfmt to do this.

ohAitch commented 5 years ago

Ah okay, this all looks a lot closer to what the original "chess-side" terminology was trying to invoke than what I've described yup!

Re: jogging head vs jog head, that's the ambiguity I was trying to avoid yeah, you're probably right that brevity is worth it here.

Adopting to their terminology change: having the "queenside jogging with no slit bodies" lint be optional makes sense; probably still want to run it on any code that hasn't been touched in, idk, a week.

Re: "correct split kingside is to be one stop more indented than the head", presumably this better describes the codebase than "split aligned"(gen/musk) indentation? (If "side" refers to just the heads, let's definitely nix "seaside" terminology, since it fundamentally describes only the bodies)

Relatedly, is "flat" seems like not ideal terminology for jog bodies which are after their respective jog heads but span multiple lines; normal hoon "flat" form ofc also has a variant of this problem.

Re: 0-jogging, don't ?& and ?| take a list of twigs, and not pairs of twigs? Is jogging as opposed to running style common, as a space-saving measure?

On Tuesday, 5 February 2019, Jeffrey Kegler notifications@github.com wrote:

My new approach finds all the errors and (modulo my terminology) diagnoses them the same, with one exception, your last example:

:: overindented kingside jogs ?- a %b /b %cd /cd %e /e/le/ments

According to hoonlint, this is a queenside jogging hoon with 3 flat queenside jogs. Hoonlint considers this queenside because all the jogs are indented 2 stops beyond the rune. Hoonlint finds no problems to warn about in this code.

You might want to call out a queenside jogging hoon with only flat jogs with a lint warning, but here's a reason not to do so. You may be laying out a very long jogging, one which you have decided will be queenside, and which you indent according. But perhaps the 1st dozen or two of the 100 or so jogs are flat. It would be a nuisance if you have to fight hoonlint and hoonfmt to do this.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jeffreykegler/yahc/issues/4#issuecomment-460873644, or mute the thread https://github.com/notifications/unsubscribe-auth/ABxXhizL25pfUyQVT8obAlXoi3iHVE4Nks5vKjRygaJpZM4ae6hd .

jeffreykegler commented 5 years ago

I changed the jogging indentation document to use "split" vs. "joined" instead of "split" vs. "flat". Also, on checking, it's looks like my so-called 0-jogging hoons are not jogging hoons. I've removed the references to 0-jogging from the document.

jeffreykegler commented 5 years ago

@ohAitch Looking back over the conversation, maybe your suggestion of "referent " is a good more specific term for "jogging hoon head"? And the "jogging hoon subhead" would be the "subreferent"?

jeffreykegler commented 5 years ago

Here's one problem from gen/musk that you mentioned:

== hoons/arvo/gen/musk.hoon 317:7 indent tallWuthep Jog kingside body @319:11; overindented by 2
      =+  [now=(cap axe) lat=(mas axe)]
      ?-  -.mask.bus
      ::  subject is fully blocked or complete
      ::
>       $&  ::  if fully blocked, produce self
            ::
!           ?^  blocks.mask.bus  bus
            ::  descending into atom, stop
            ::
            ?@  data.bus  ~
            ::  descend into complete cell

The line with the zap (!) is regarded as the body as a split kingside jog, and is overindented by 2. There are comments, which suggests a different indentation, but currently hoonlint do not honor these.

ohAitch commented 5 years ago

Right, this is the problem with saying that split kingside should be aligned differently from joined kingside(i.e. either ragged or aligned). The liberal thing to do is probably to say "proper split can be either outdented by 1 stop if queenside, or ragged/aligned regardless; improper split can be indented by 1 stop if kingside" - not sure how many false negatives this lets through.

Re: subreferent, that definitely makes sense for =: and % runes, for ? runes it's a pattern/type however.

On Wednesday, 6 February 2019, Jeffrey Kegler notifications@github.com wrote:

Here's one problem from gen/musk that you mentioned:

== hoons/arvo/gen/musk.hoon 317:7 indent tallWuthep Jog kingside body @319:11; overindented by 2 =+ [now=(cap axe) lat=(mas axe)] ?- -.mask.bus :: subject is fully blocked or complete ::

  $&  ::  if fully blocked, produce self

:: ! ?^ blocks.mask.bus bus :: descending into atom, stop :: ?@ data.bus ~ :: descend into complete cell

The line with the zap (!) is regarded as the body as a split kingside jog, and is overindented by 2. There are comments, which suggests a different indentation, but currently hoonlint do not honor these.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jeffreykegler/yahc/issues/4#issuecomment-461247854, or mute the thread https://github.com/notifications/unsubscribe-auth/ABxXhtY8DcBlV7oJzaTB8K1huLjoXQfhks5vK3fggaJpZM4ae6hd .

ohAitch commented 5 years ago

(Or rather, c-style kingside is something between "fully proper" and "lint error", at the same lint level as having joined queenside jogs or column-57 comments)

On Wednesday, 6 February 2019, Anton Dyudin antechno777@gmail.com wrote:

Right, this is the problem with saying that split kingside should be aligned differently from joined kingside(i.e. either ragged or aligned). The liberal thing to do is probably to say "proper split can be either outdented by 1 stop if queenside, or ragged/aligned regardless; improper split can be indented by 1 stop if kingside" - not sure how many false negatives this lets through.

Re: subreferent, that definitely makes sense for =: and % runes, for ? runes it's a pattern/type however.

On Wednesday, 6 February 2019, Jeffrey Kegler notifications@github.com wrote:

Here's one problem from gen/musk that you mentioned:

== hoons/arvo/gen/musk.hoon 317:7 indent tallWuthep Jog kingside body @319:11; overindented by 2 =+ [now=(cap axe) lat=(mas axe)] ?- -.mask.bus :: subject is fully blocked or complete ::

  $&  ::  if fully blocked, produce self

:: ! ?^ blocks.mask.bus bus :: descending into atom, stop :: ?@ data.bus ~ :: descend into complete cell

The line with the zap (!) is regarded as the body as a split kingside jog, and is overindented by 2. There are comments, which suggests a different indentation, but currently hoonlint do not honor these.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jeffreykegler/yahc/issues/4#issuecomment-461247854, or mute the thread https://github.com/notifications/unsubscribe-auth/ABxXhtY8DcBlV7oJzaTB8K1huLjoXQfhks5vK3fggaJpZM4ae6hd .

ohAitch commented 5 years ago

Perhaps "sea-grade" hoon would be a good way to describe "the style of hoon in the arvo codebase as currently deployed on urbit ships, and as I will default to writing indefinitely", as opposed to "space-grade" highly bureaucratic style being slowly established by Tlon Inc. "Leaks happen, try to make them few and small if you don't want to sink but there's a time(=money) tradeoff" vs "if you are missing a tile of heat-shielding you will die in a ball of fire"

Disclaimer: I am not a cosmonaut

On Wednesday, 6 February 2019, Anton Dyudin antechno777@gmail.com wrote:

(Or rather, c-style kingside is something between "fully proper" and "lint error", at the same lint level as having joined queenside jogs or column-57 comments)

On Wednesday, 6 February 2019, Anton Dyudin antechno777@gmail.com wrote:

Right, this is the problem with saying that split kingside should be aligned differently from joined kingside(i.e. either ragged or aligned). The liberal thing to do is probably to say "proper split can be either outdented by 1 stop if queenside, or ragged/aligned regardless; improper split can be indented by 1 stop if kingside" - not sure how many false negatives this lets through.

Re: subreferent, that definitely makes sense for =: and % runes, for ? runes it's a pattern/type however.

On Wednesday, 6 February 2019, Jeffrey Kegler notifications@github.com wrote:

Here's one problem from gen/musk that you mentioned:

== hoons/arvo/gen/musk.hoon 317:7 indent tallWuthep Jog kingside body @319:11; overindented by 2 =+ [now=(cap axe) lat=(mas axe)] ?- -.mask.bus :: subject is fully blocked or complete ::

  $&  ::  if fully blocked, produce self

:: ! ?^ blocks.mask.bus bus :: descending into atom, stop :: ?@ data.bus ~ :: descend into complete cell

The line with the zap (!) is regarded as the body as a split kingside jog, and is overindented by 2. There are comments, which suggests a different indentation, but currently hoonlint do not honor these.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jeffreykegler/yahc/issues/4#issuecomment-461247854, or mute the thread https://github.com/notifications/unsubscribe-auth/ABxXhtY8DcBlV7oJzaTB8K1huLjoXQfhks5vK3fggaJpZM4ae6hd .

jeffreykegler commented 5 years ago

Summary: This issue is getting long enough that it will be a challenge to go back over it to research individual issues. By my tally, the only remaining issue is the gen/musk, split kingside indentation issue we are currently discussing.

My inclination is, once that issue is settled, to close this ticket, without prejudice to issues raised in it being re-raised. I'll re-run the WUTHEP analysis, and open a new ticket with the remaining issues. Hopefully most of these can simply be added to anomaly.suppression. [ I renamed the mistakes.suppression file because "anomaly" is more polite. :-) ]

Getting away from the procedural stuff, I'll think about the split kingside issue and get back, maybe tomorrow.

ohAitch commented 5 years ago

I think zuse is clearly an actual mistake, lib/hood/drum could perhaps be an "anomaly"

At any rate, +1 on closing the issue given we've figured out what "kingside"/"queenside" mean at least

On Wednesday, 6 February 2019, Jeffrey Kegler notifications@github.com wrote:

Summary: This issue is getting long enough that it will be a challenge to go back over it to research individual issues. By my tally, the only remaining issue is the gen/musk, split kingside indentation issue we are currently discussing.

My inclination is, once that issue is settled, to close this ticket, without prejudice to issues raised in it being re-raised. I'll re-run the WUTHEP analysis, and open a new ticket with the remaining issues. Hopefully most of these can simply be added to anomaly.suppression. [ I renamed the mistakes.suppression file because "anomaly" is more polite. :-) ]

Getting away from the procedural stuff, I'll think about the split kingside issue and get back, maybe tomorrow.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jeffreykegler/yahc/issues/4#issuecomment-461264937, or mute the thread https://github.com/notifications/unsubscribe-auth/ABxXhotykSVIo3ArLDVdFSr2mQnmKghBks5vK4zygaJpZM4ae6hd .

jeffreykegler commented 5 years ago

As prep for the next (hopefully) last phase of the WUTHEP analysis, I am improving the diagnostics wrt context lines. Right now I highlight the jog head when there is a jog body problem, but you really want to see the rune line as well, but currently only one "background" line is allowed. I'm changing this to allow an arbitrary number of them. This makes it harder to orient within the file, so I will prefix lines with the line # as well.

Once that is done I am thinking of opening a new issue with the WUTHEP kingside lint output, and closing this issue in favor of the new one.

jeffreykegler commented 5 years ago

Replaced with issue #6