haskell-nix / hnix

A Haskell re-implementation of the Nix expression language
https://hackage.haskell.org/package/hnix
BSD 3-Clause "New" or "Revised" License

Parser: work on NAppDef & OperatorInfo related code #1047

Closed Anton-Latukha closed 2 years ago

Anton-Latukha commented 2 years ago

Thoughts during this PR:

Because the grammar is now formulated better, its shape has become more obvious.

OperatorInfo has several ugly sides: function application (which is always left-associative) and unary operators (which have no associativity property at all) still need to fake the associativity field. In Binary, NAssocNone is still "associativity" :rofl: (bidirectional). This mistake drags in at least from parser-combinators, which calls an associative infix operator "non-associative", while a binary operator genuinely lacking associativity is almost impossible to encounter, since an operation without associativity does not compose in the general algebraic sense.
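
To illustrate, a rough sketch of the shape being criticized (field and constructor names here are illustrative, not copied from the hnix source):

```haskell
-- Sketch of the problem: one record shape forced onto all operator kinds,
-- so unary operators must fake an associativity they do not have.
data NAssoc = NAssocNone | NAssocLeft | NAssocRight
  deriving (Eq, Show)

data OperatorInfo = OperatorInfo
  { precedence    :: Int
  , associativity :: NAssoc  -- meaningless for unary operators, yet required
  , operatorName  :: String
  }
  deriving (Eq, Show)

-- Unary negation has no associativity property, but the field must be filled:
negOp :: OperatorInfo
negOp = OperatorInfo 3 NAssocNone "-"
```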

I would like a better mapping from NAssoc to parser-combinators, and for Infix* to not be present at all, implied instead through a free abstraction in the source code.

Also, OperatorInfo is used in Pretty to change the operator information on the fly and pretend that operator associativity changes; I would like pretty-printing of function application, unary and binary operations to be done in an honest, straightforward way.

I would like there to be only one datatype and data structure for the language operator grammar (NOperatorDef), which would allow fixing the things above.

(I also do not know the current state of GADT performance in GHC.)

I think the use of patterns (or functions) on NOperatorDef may allow doing this.

Anton-Latukha commented 2 years ago

The proper solution is probably to have one value that represents the language grammar.

And then have a set of functions that, given an NOperatorDef entry, return its properties.

And then, on top of that, build N{Unary,Binary,Special}Op -> <property> retrieval functions.
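
A rough sketch of that direction (field and constructor names are illustrative, not the actual hnix definitions):

```haskell
-- One value represents the operator grammar; properties are retrieved
-- through functions instead of being duplicated per call site.
data NAssoc = NAssocLeft | NAssocRight
  deriving (Eq, Show)

data NOp = NNeg | NPlus | NMult
  deriving (Eq, Show)

data NOperatorDef = NOperatorDef
  { opSymbol :: String
  , opPrec   :: Int
  , opAssoc  :: Maybe NAssoc  -- Nothing for unary operators: no fake field
  }
  deriving (Eq, Show)

-- The single grammar value (as a total function over the operator tags):
grammar :: NOp -> NOperatorDef
grammar NNeg  = NOperatorDef "-" 3 Nothing
grammar NPlus = NOperatorDef "+" 6 (Just NAssocLeft)
grammar NMult = NOperatorDef "*" 7 (Just NAssocLeft)

-- Property-retrieval functions built on top of it:
getPrec :: NOp -> Int
getPrec = opPrec . grammar

getAssoc :: NOp -> Maybe NAssoc
getAssoc = opAssoc . grammar
```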

Anton-Latukha commented 2 years ago

Well.

The work is mostly done.

The main side-effect of working on the OperatorInfo removal was that the hacks & dirtiness around it in the {Parser, Pretty} code got cleaned up along the way, and the results were refactored.

The funny thing is that I have now arrived basically at the OperatorInfo design again, but with more straightforward code. It seems OperatorInfo may have existed to keep the Pretty module and its information sharing decoupled from Parser.NOperatorDef: it allowed information transmission without reliance on the NOperatorDef type structure. The NOperatorDef structure was also underdeveloped at the time (it lacked precedence information). But that was not a good design overall either, because OperatorInfo is not attached to the object it represents, so it always needed to be matched externally; and because that was not done, the hacks in Pretty appeared, which puzzled the reader (me). And while being named OperatorInfo, it was used to represent not only operators but all kinds of things.

The next question is: Nix.Pretty is either way specialized code, created to pretty-print this particular language, so it depends on the language and hence on the language's NOperatorDef. Even thinking further about going multipackage, the Pretty module would probably be difficult to reuse under big third-party changes to the language - those would change the pretty-printing anyway.

soulomoon commented 2 years ago

Why stop using unless and for_? It seems pretty reasonable to use them in the original code.

Anton-Latukha commented 2 years ago

So that people learn to read & work with the default Haskell functions traverse{,_} & sequenceA{,_} better.

Traversable is defined by traverse & sequenceA. All the others are essentially aliases of them: map{,_,M}, for{,_,M}, sequence{,_,M} (that is 9 functions) are the old naming for them & are now indeed aliases. And the old names create a combinatorial explosion.
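
The aliasing in base can be checked directly (these equivalences are standard Data.Traversable/Data.Foldable facts; halveEven is just an illustrative function):

```haskell
import Data.Traversable (for)

-- In base:
--   mapM     = traverse   (Monad-restricted)     for  = flip traverse
--   sequence = sequenceA  (Monad-restricted)     forM = flip mapM
--   mapM_    = traverse_  (over Foldable)        for_ = flip traverse_

halveEven :: Int -> Maybe Int
halveEven x = if even x then Just (x `div` 2) else Nothing

viaTraverse :: Maybe [Int]
viaTraverse = traverse halveEven [2, 4, 6]  -- Just [1, 2, 3]

viaFor :: Maybe [Int]
viaFor = for [2, 4, 6] halveEven            -- the same, arguments flipped
```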

That combinatorial explosion of names creates a combinatorial explosion in the required knowledge, in the mapping between those names, & in the cognitive effort to process it.

Even writing rewrite rules for them requires either growing the combinatorial explosion exponentially through the composition lattice, or just rewriting everything by unification to the default names & having a standard, concise rewriting lattice.

HLint has no intuition of equality between aliases, so in the long term HLint helps only if the code matches the default lattice rules.

For example, return is now entering deprecation in GHC, and I have an HLint PR that unifies return -> pure; that change alone allowed the lattice to behave better (it started to discover new patterns).

And such rewriting is mostly what I have done up to this point here.

I myself forget that the default name is sequenceA and not sequence. It is helpful to have HLint give suggestions.

Otherwise one must learn all the old & new intuitions & lattices. For new functional programmers & Haskellers it is easier to learn the default Traversable traverse & sequenceA; they have no intuition of an imperative for loop, which reverses control flow in Haskell (function application there gets reversed: for x f instead of traverse f x).

The rationalizations above I apply to map{,_,M}, for{,_,M}, sequence{,_,M}.

And I do such renames to form and stabilize a style in the project.

As a result, style does not need to be managed in the project at all. It allows me to accept, and be happy to take in, work done in any style; I do not require other people to adopt the style in reviews - for me the improvements are what matters. I can relax the review constraints & concentrate on improvements, because the project code itself passively and politely nudges contributors towards a particular style if they can be nudged; and if they decide to contribute in their own style, I know they decided so, and for me that is enough to justify them contributing in their own style. I gain a bus factor myself: the code's form would teach people this style, show that it is possible & works great, and without me it would provide style maintenance passively and continuously, making people consider this design style for this project - and for their own style in other projects - for a number of years.

But it may also be true that I went a bit far with essentially redefining the prelude for the project.

But most of these changes are probably sane to expect from the next default Prelude, or to contribute as suggestions into it. I also promised to move to the new default Prelude when it is released.


On the unless rename:

Because I constantly need to remember how to read unless. And I have the perception that a lot of other people do as well.

I have also met & read other people noting that they have difficulty reading unless. But when (not exists) needs no explanation or mental internalization of the name's semantic implication.

It is also why I aliased not . null as isPresent: programmers are interested in the inhabitants of a structure, and the fact that a structure can be empty is an unfortunate consequence of structures.

The additional question is why unless/when are bool (pure ()) <expr> <condition>, when they could just as well be based on pure mempty. () is always a Monoid, pure behavior is always transparent and by definition does not depend on the value in a (because of the Applicative properties), so there is no reason not to use pure mempty for them.
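
A sketch of that point (when'/unless'/whenMonoid/isPresent are hypothetical names; base defines when/unless with if-then-else, but they behave as written here):

```haskell
import Data.Bool (bool)

-- 'when'/'unless' expressed through the 'bool' eliminator
-- (argument order: default case, general case, condition):
when' :: Applicative f => Bool -> f () -> f ()
when' c act = bool (pure ()) act c

unless' :: Applicative f => Bool -> f () -> f ()
unless' c act = bool act (pure ()) c

-- The Monoid-based generalization suggested above: since () is a Monoid,
-- 'pure mempty' subsumes 'pure ()' and also covers richer result types.
whenMonoid :: (Applicative f, Monoid a) => Bool -> f a -> f a
whenMonoid c act = bool (pure mempty) act c

-- The 'isPresent' alias mentioned above:
isPresent :: Foldable t => t a -> Bool
isPresent = not . null
```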


All the style changes I have done so far seem to align the code representation with the internal partial-application sharing process (first the default case, then the general case), which in turn seems to align with the use of eliminators & control flow in Haskell.

Of course, control flow in do blocks works differently, & inside them for & map may land & read better. But I am aware I do not use do blocks enough; currently I have mostly internalized the default lambda & functional design patterns in Haskell and am still developing the style. do blocks are sometimes an effective way to create automatically optimized code, but I often prefer to compose the results afterward, apply optimizations myself, gain a new semantic understanding and a simplified representation, & denote the functions in the composition.


I am happy to have a dialog on any point or get additional reasoning points into a dialog.

Anton-Latukha commented 2 years ago

I also currently have thoughts about when{,True,False,Nothing,Just,Left,Right,Pure,Free, ... etc}. Besides, I can be attacked for having unflipped the arguments into a more lambda-style order - that is indeed a schism; people would expect the order as in base. I am more and more aware that such single-hand functions imply too much & their names tell too little, and overall they seem not to fit Haskell. From the name when alone it is not obvious whether the default case is pure (), Monoid a => pure mempty, mempty, or id - and the cases {mempty, pure mempty, id} all indeed get used. So at the least, distinct naming prefixes should be defined for those 3 cases. But that sits right at the Fairbairn threshold, & we start to ask whether the when{,True,False,Nothing,Just,Left,Right,Pure,Free, ... etc} matrix even needs to exist, or whether to just use binary eliminators with {mempty, stub, id} - and so when becomes bool stub ... etc. That is just my consideration and is not related to the current PR.
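
A sketch of the eliminator alternative (stub is assumed here as a relude-style name for pure (); logIfJust and label are hypothetical examples):

```haskell
import Data.Bool (bool)

-- 'stub' assumed as an alias for 'pure ()':
stub :: Applicative f => f ()
stub = pure ()

-- Instead of a when{Just,Nothing,True,...} naming matrix, the binary
-- eliminators with an explicit default from {mempty, stub, id}:

-- "whenJust mx act"  ~  maybe stub act mx
logIfJust :: Show a => Maybe a -> IO ()
logIfJust = maybe stub print

-- "whenTrue c v" with a Monoid default  ~  bool mempty v c
label :: Bool -> String
label = bool mempty "enabled"
```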

soulomoon commented 2 years ago

Thank you @Anton-Latukha for the detailed explanation. Haskell does suffer from the fact that a lot of function names point to the same or similar functionality. For me, I do not have a strong opinion on these changes. My concern is purely about how the code looks.

But overall, I suppose it is indeed a good idea to stick to a style.

Anton-Latukha commented 2 years ago

Yes. But the fact that the source character size of the arguments differs is not an argument to suddenly introduce a 2nd function doing the same thing.

traverse does put the larger argument in the middle. A lot of functions with the canonical Haskell argument order, & the eliminators, also put the larger argument in the middle. But it is also logical why it is larger: it is a function, and it is almost always larger than the argument given to it.

For me this Haskell design requires either being OK with reading the large section, or - if not OK with having that section - formalizing the semantics of the section with a denotation abstraction (a fancy way of saying: it should become a nicely named local function).

For me, the fact that sectioning a function in the middle makes the code feel awkward is a feature: it indicates that the section is above the Fairbairn threshold, & so the design nudges towards naming that code part properly, & as a result towards more readable & understandable code through composition.

The for pattern, by contrast, encourages hanging huge lambdas onto the tail without thinking about how to name/abstract them - because why abstract the tail (while in reality it is a function in the middle of the control-flow conveyor)?

In both cases - traverse and for - the solution is to find a proper denotation for the function, but the g f a form directs one towards doing it.

Indeed, sometimes I section things and go: "ugh, how do I name this" - that is the intended property of the design.
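
The nudge described above, sketched (doublePositive is a hypothetical example, not project code):

```haskell
-- Awkward: a large lambda sectioned into the middle of 'traverse':
--   traverse (\x -> if x > 0 then Just (x * 2) else Nothing) xs
-- The design nudge: give the middle section a proper denotation.

doublePositive :: Int -> Maybe Int
doublePositive x = if x > 0 then Just (x * 2) else Nothing

result :: Maybe [Int]
result = traverse doublePositive [1, 2, 3]  -- Just [2, 4, 6]
```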


At the same time, the traverse / g f a form keeps the function composable, as composition of the function (generally) happens over the main path of argument passing.

With the ability to be composed, and being the default name, HLint would pick up the rewriting simplifications for it when they occur.


While initially learning Haskell, I found the map{,_,M}, for{,_,M}, traverse{,_,M}, sequence{,_,M} superset deeply puzzling. I additionally found very puzzling the functions & code that reverse & jump in the control flow/function application/composition - it took time to be able to read them properly.

When I arrived at HNix, there were a number of z >>= d <&> a ?? h chains, as if someone had tried hard to reverse the control flow in Haskell. They also spanned multiple lines & were hard to read; even how to move one's eyes through them was hard to determine. It is an example of why control flow is better kept with <- - it is easier for newcomers to read & understand.
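
A small contrast in that spirit (the names and values are hypothetical; the original chains also used ?? from lens, omitted here to keep the sketch base-only):

```haskell
import Data.Functor ((<&>))

-- A chain where the reader's eye jumps back and forth:
chained :: Maybe Int
chained = Just 3 >>= (\x -> Just (x + 1)) <&> (* 2)

-- The same computation with '<-', read top to bottom:
stepwise :: Maybe Int
stepwise = do
  x <- Just 3
  y <- Just (x + 1)
  pure (y * 2)
-- both equal Just 8
```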