haskell-nix / hnix

A Haskell re-implementation of the Nix expression language
https://hackage.haskell.org/package/hnix
BSD 3-Clause "New" or "Revised" License

Suggested improvements to NExpr #377

Open infinisil opened 5 years ago

infinisil commented 5 years ago

Maybe other improvements are possible too. I really dislike how some things are representable in the current NExpr but never occur when parsing Nix.

Synthetica9 commented 5 years ago
jwiegley commented 5 years ago

Should NAttrPath take a dynamic boolean argument, rather than two constructors?

jwiegley commented 5 years ago

@Infinisil I'm rather open to PRs for these suggestions. Perhaps doing them one at a time would be simplest?

ryantrinkle commented 5 years ago

@Synthetica9 Nix has a primitive that retrieves source location for those things internal to the language, so they are semantically significant. The other source locations are used just for error reporting and such. It may be possible to unify them to some extent, but it's not quite as easy as one would hope.

sjakobi commented 4 years ago

https://github.com/haskell-nix/hnix/issues/666#issuecomment-650643370 is somewhat related.

Anton-Latukha commented 3 years ago

Just wanted to mention:

some things are representable ... but never occur in parsing.

Due to Nix paradigms we would never be able to parse everything just by reading the Expr level.

Some things can be made unrepresentable in Expr level, but not all.

To be more concrete: we need to gather examples of what is unrepresentable in Nix in order to make it unrepresentable in Expr.

Anton-Latukha commented 3 years ago

Just wanted to mention:

some things are representable ... but never occur in parsing.

To formulate tersely: Expr may ("if we want") represent only literate Nix grammar. By what it represents, Expr should allow any complete, grammatically correct Nix schizophasia, because schizophasia still consists of grammatically correct expressions.

Most (if not all) smart-constructor logic checking, even something as simple as uniqueness of keys in an attrset, is Eval, which of course comes before any corporeal Exec.

So the thread talks strictly about grammar.

Anton-Latukha commented 2 years ago

@Infinisil

So, 2.5 years in the making, I arrived at it.

Several major points of your proposal depend on binding key names differentiating between static & dynamic, and a lot follows from that. Recently I made VarName a proper abstraction, so what was asked is now easier.


I looked into the suggestions & so far have not understood their vector. Well, I understand the direction; I do not understand how the abstractions align in the suggestion. There may be a typo around the use of NKeyName (since the proposal, iiac, is about splitting it into two separate types for Static & Dynamic).


Inherit should be Inherit !(Maybe r) ![VarName] !SourcePos

VarName is an abstraction on Text (formerly was alias):

🅸 ~/s/n/pkgs:master◦>nrg 'inherit.*\$'
pkgs/applications/kde/default.nix
    inherit (srcs.${pname}) src version;

pkgs/stdenv/darwin/default.nix
    inherit (pkgs."${finalLlvmPackages}") compiler-rt;

So inherit accepts dynamic keys too; they cannot be put into the VarName (Text) abstraction, unless someone has some secret insight into why it should be so. That one seems false.


I would agree with this:

type StaticAttributes r = Map VarName (StaticAttributeValue r)
type DynamicAttributes r = [(NDynamicAttrPath r, r)]

The current is: https://github.com/haskell-nix/hnix/blob/f92a2838c02beb7e8563a97091af04cfa9ce1134/src/Nix/Expr/Types.hs#L289-L297

Separating statics from dynamics is a good idea & is a guide for further improvements.

infinisil commented 2 years ago

Inherit should be Inherit !(Maybe r) ![VarName] !SourcePos

VarName is an abstraction on Text (formerly was alias):

🅸 ~/s/n/pkgs:master◦>nrg 'inherit.*\$'
pkgs/applications/kde/default.nix
    inherit (srcs.${pname}) src version;

pkgs/stdenv/darwin/default.nix
    inherit (pkgs."${finalLlvmPackages}") compiler-rt;

So inherit accepts dynamic keys too; they cannot be put into the VarName (Text) abstraction, unless someone has some secret insight into why it should be so. That one seems false.

inherit (foo) bar baz;

is

Inherit (Just (NSym "foo")) ["bar", "baz"]

The Maybe r corresponds to the (...) part of inherit, not the VarName's.
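The correction above can be sketched as a toy model. This is only an illustration, not hnix's actual definitions: `VarName` wraps `String` instead of `Text`, `SourcePos` is left out, and `Expr`, `Sym`, `Select`, `inheritScope` and `inheritedNames` are hypothetical names chosen for the example.

```haskell
-- Toy model only; not the real hnix types.
newtype VarName = VarName String
  deriving (Eq, Show)

-- Minimal stand-in for the expression type `r`.
data Expr
  = Sym VarName          -- a variable reference such as `foo`
  | Select Expr VarName  -- an attribute selection such as `srcs.pname`
  deriving (Eq, Show)

-- The optional scope is an arbitrary expression;
-- the inherited names themselves are plain static names.
data Binding r = Inherit (Maybe r) [VarName]
  deriving (Eq, Show)

-- inherit (foo) bar baz;
inheritFromFoo :: Binding Expr
inheritFromFoo =
  Inherit (Just (Sym (VarName "foo"))) [VarName "bar", VarName "baz"]

-- inherit bar baz;   (no scope expression)
inheritPlain :: Binding Expr
inheritPlain = Inherit Nothing [VarName "bar", VarName "baz"]

-- The expression inside the parentheses, if any.
inheritScope :: Binding r -> Maybe r
inheritScope (Inherit scope _) = scope

-- The static names being inherited.
inheritedNames :: Binding r -> [VarName]
inheritedNames (Inherit _ names) = names
```

In this model the dynamic part of `inherit (srcs.${pname}) src version;` would live entirely in the scope expression, while `src` and `version` stay static names.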

Anton-Latukha commented 2 years ago

Thank you for correcting. That is true.

Anton-Latukha commented 2 years ago

Our discussion on (inherit): Checked, (almost) completed the migration:

The earlier authors did this in a low-orbit-ion-cannon way, because the language allows:

> a = 3       
> {inherit "a";}  # with double quotes
{ a = 3; }

there is a single case of its use inside Nixpkgs:

namespaces = map (type: { inherit type; }) [ "pid" "network" "mount" "ipc" "uts" ];

So there is indeed no inheritance of dynamic keys, but there is this allowance to enclose a static key in "", which complicates treating it as plain Text.

Removing the dynamic key here is the proper action, but now VarName processing needs to be expanded, or a new abstraction is needed simply for this quirk. Otherwise that quirk requires splitting the NString (DoubleQuoted | Indented) data type, and if we go down that path, quirks would require all data types to be split 8)

It is not a feature. It is a quirk, an unneeded semantic complication of the language that makes the type system harder for the user to understand (my experience). It is easier to refactor that line in Nixpkgs & remove a bug.

We must remember our main audience, it is Haskellers, they would appreciate the well-formed type system in the language.

For HNix it is wiser & easier to "forget" these particular cases (to act as a forgetful functor on quirks that are vacuous, rare & unused) & just lint them out of existence (otherwise we get a feature report on it). Suggest cleaner code to users & give them the proper type, to guide them (including those who are learning) through the language's type system, as I know what that was like.


I moved {inherit "a";} into the syntax mistakes, with a note that it was put there deliberately, so if something comes up there is a live traceback path.

Anton-Latukha commented 2 years ago

regarding

type DynamicVarName   r = (Antiquoted (NString r) r)
NAttrPath -> newtype NStaticAttrPath (NonEmpty (VarName)) | newtype NDynamicAttrPath NonEmpty (DynamicVarName r)
data StaticAttributeValue r
  = NamedVar [NKeyName r] r SourcePos
  | Inherit (Maybe r) SourcePos
newtype StaticAttributes r = StaticAttributes (HashMap VarName (StaticAttributeValue r))
newtype DynamicAttributes r = DynamicAttributes [(NDynamicAttrPath r, r)]
data Bindings r = Bindings !(StaticAttributes r) (DynamicAttributes r)

I started, but will finish a bit later, for obvious reasons. The data type change is massive and needs to land properly: the types need to be chosen so that instance inference keeps working, & the whole code base needs to be refactored.

If I could get a hint on which data types to use, I would appreciate it.
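One possible shape for the static/dynamic split can be sketched in compilable Haskell. This is a simplified illustration under stated assumptions, not the proposal verbatim and not hnix code: `VarName` wraps `String`, source positions and `Inherit` are omitted, static attributes are keyed by their whole path, and `DynamicKey`, `staticName`, `emptyBindings` and `addBinding` are hypothetical names.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.Map.Strict as Map

-- Stand-in for hnix's VarName (assumption: String instead of Text).
newtype VarName = VarName String
  deriving (Eq, Ord, Show)

-- One path component: a plain name or an antiquoted expression of type r.
data DynamicKey r
  = PlainKey VarName
  | AntiquotedKey r
  deriving (Eq, Show)

-- Statics can be looked up by name; dynamics keep their source order.
data Bindings r = Bindings
  { staticAttrs  :: Map.Map (NonEmpty VarName) r
  , dynamicAttrs :: [(NonEmpty (DynamicKey r), r)]
  }
  deriving (Eq, Show)

emptyBindings :: Bindings r
emptyBindings = Bindings Map.empty []

-- A component is static only if it is a plain name.
staticName :: DynamicKey r -> Maybe VarName
staticName (PlainKey name)   = Just name
staticName (AntiquotedKey _) = Nothing

-- File a binding under statics when every path component is static,
-- otherwise append it to the dynamic list.
addBinding :: NonEmpty (DynamicKey r) -> r -> Bindings r -> Bindings r
addBinding path value bindings =
  case traverse staticName path of
    Just names ->
      bindings { staticAttrs = Map.insert names value (staticAttrs bindings) }
    Nothing ->
      bindings { dynamicAttrs = dynamicAttrs bindings ++ [(path, value)] }
```

The point of the sketch is that once a path is known to be fully static, later code never has to pattern match on antiquotation again; the type already rules it out.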

infinisil commented 2 years ago

It looks like Nix treats "str" and ${"str"} when possible as just str at the parser level:

$ nix-instantiate --parse -E 'a: { inherit ${"a"}; }'
(a: { inherit a ; })
$ nix-instantiate --parse -E 'a: { inherit "a"; }'
(a: { inherit a ; })
$ nix-instantiate --parse -E 'a: { inherit a; }'
(a: { inherit a ; })

layus commented 2 years ago

Yes, it has some simplification logic in the parser.

This has an impact on semantics, because it makes the parser tolerate one level of dynamic attributes but fail on nested ones. 'a: { inherit "a"; }' is weird.

$ nix-instantiate --parse -E 'a: { inherit ${"${"a"}"}; }'
error: dynamic attributes not allowed in inherit at (string):1:14

See https://github.com/NixOS/nix/blob/e6150de90d8db101209fc6363f5f7696ee8192c4/src/libexpr/parser.y#L501-L522 and string_parts_interpolated.
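The behaviour observed in the commands above can be modelled with a small toy. This is a sketch of the observable behaviour only, under the assumption that one level of string wrapping is collapsed when the string has no antiquotation; it is not a transcription of Nix's `parser.y`, and `Part`, `Expr`, `AttrName`, `simplifyInheritName` and `literalText` are hypothetical names.

```haskell
-- One piece of a string literal: literal text or an antiquotation ${...}.
data Part = Plain String | Antiquote Expr
  deriving (Eq, Show)

-- Just enough of an expression type for the example.
data Expr = Str [Part] | Var String
  deriving (Eq, Show)

-- An attribute name as written after `inherit`.
data AttrName
  = Ident String   -- a
  | Quoted [Part]  -- "a"
  | Dynamic Expr   -- ${...}
  deriving (Eq, Show)

-- Collapse "str" and ${"str"} to a plain name; reject anything else.
simplifyInheritName :: AttrName -> Either String String
simplifyInheritName (Ident name)          = Right name
simplifyInheritName (Quoted parts)        = literalText parts
simplifyInheritName (Dynamic (Str parts)) = literalText parts
simplifyInheritName (Dynamic _)           = notAllowed

-- A string is usable as a static name only if it has no antiquotation.
literalText :: [Part] -> Either String String
literalText parts = maybe notAllowed (Right . concat) (traverse plain parts)
  where
    plain (Plain text)  = Just text
    plain (Antiquote _) = Nothing

notAllowed :: Either String a
notAllowed = Left "dynamic attributes not allowed in inherit"
```

In this model `${"a"}` and `"a"` both reduce to the name `a`, while `${"${"a"}"}` is rejected because the outer string still contains an antiquotation after one level of simplification.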

Anton-Latukha commented 2 years ago

We can do it through additional abstraction & the very same coerce, coercing ever on these.

But why?

My current position

My current way is refactoring (clean-up & simplification) of the code. I am interested in how clean & how far HNix code can go in implementing a similar, preferentially transparent language. This thread's topic is "improvements to NExpr", not "how many quirks we can duplicate". The quirks that need to be duplicated will show themselves during the work.

*These syntax quirks are what created deep puzzlement in me & strongly slowed my learning of the language, as I wondered at every corner about these kinds of strange semantic inflections. So `inherit` takes what? Strings, quasiquoted variables, and what for? "I seem to not know something", "I seem to not know why", "I seem to not know something about it" haunted & still haunt me about Nix. "Aha, you did not expect this! It is your fault." No, it is not my fault or the users' fault.

So since `inherit` in 99% of cases takes a key, I would recommend keeping it that way: expect a key & ask for a key to be supplied, until someone comes around & gives a genuine argument, one that is not "it was done under a blanket in this way years ago, so you must follow", but a genuine example of why that semantic is needed.

Keeping the language semantics simple & clean is a core piece of having a great, productive language & tooling: for example, less code, simpler code, precise type inference. Keeping language semantics in check is especially important when we have very limited resources. In other words, I am happy with the decision to support the proper, clear use, & where 99% of use is this way, we need not imagine the `otherwise` case; we can leave that check to reality.

I do not decide to reduce every time. For example, I put a lot of effort into carefully preserving the `replaceStringsNix` quirks during refactoring. But even there, sometimes I think maybe I needed to bite the bullet & tell people that:

```nix
builtins.replaceStrings ["" "e"] [" " "i"] "Hello world"
" H e l l o w o r l d "
```

is definitely not "a feature".

If we were to reduce it, `replaceStrings` simplifies drastically & from some "zygohistomorphic prepromorphisms" (with complex corecursion) it turns into a simple catamorphism.

In short, those are obviously not features; those are bugs that (in an ideal situation) would better not be there. "A feature" that is not used at all is generally a misfeature and can be reduced (not to the GNOME degree of feature removal). We may do users a service by improving the language a bit here and there where possible, where the language & its use give way. So we need to think about whether cargo-culting a particular bug is worth it, or whether it is better to look at it realistically & abstain from replicating it.

Good software design is to postpone this type of (especially this particular) decision until it becomes necessary to make it (if ever; reality would ask for it to be implemented). Haskell very much allows that path.

Anton-Latukha commented 2 years ago

@Infinisil @layus @jwiegley

One of the points of the head post suggests using interleaved strings.

While I remember & understand something on that, I need to read more about it.

The topic of interleaved strings is impossible to get into through web search, since there is a classic programming assignment about deinterleaving strings which overshadows & by volume completely dwarfs the topic of the practical use of an interleaved string data type in language design theory (for quoting etc.).

Since we have knowledgeable people here on the topic: can someone link to some good materials on it, maybe books also?

sjakobi commented 2 years ago

Topic of interleaved strings

I'm not aware of any write-ups, but dhall uses a representation that is equivalent to @Infinisil's suggestion:

If you grep for TextLit or Chunks you'll find the various functions that handle text literals in dhall.
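For readers without the Dhall source at hand, the idea of such a representation can be sketched as follows. The sketch is loosely modelled on the shape of Dhall's `Chunks` type but is not copied from it: it uses `String` instead of `Text`, a bare type parameter `e` for the embedded expression, and the `render` helper is a hypothetical name introduced here.

```haskell
-- Literal text strictly alternating with embedded expressions,
-- always ending in a (possibly empty) piece of literal text.
-- "a${x}b${y}c" becomes Chunks [("a", x), ("b", y)] "c".
data Chunks e = Chunks [(String, e)] String
  deriving (Eq, Show)

-- Appending merges the trailing text of the left operand
-- with the leading text of the right operand.
instance Semigroup (Chunks e) where
  Chunks xs a <> Chunks [] b            = Chunks xs (a ++ b)
  Chunks xs a <> Chunks ((t, e) : ys) b = Chunks (xs ++ (a ++ t, e) : ys) b

instance Monoid (Chunks e) where
  mempty = Chunks [] ""

-- Evaluate every embedded expression to text and splice it in.
render :: (e -> String) -> Chunks e -> String
render eval (Chunks pairs final) =
  concatMap (\(text, expr) -> text ++ eval expr) pairs ++ final
```

Compared with a flat list of "either text or expression" parts, this shape cannot contain two adjacent text pieces, which is the kind of unrepresentable state the head post asks for.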

Anton-Latukha commented 2 years ago

Thank you for the pointer to the code.

...

It is "string interpolation"; I am saying that I remember it being a general language design concept.

sjakobi commented 2 years ago

  • Because they are not Interleaved.
  • They are Interpolated. Quite a difference.

Can you clarify this point? My understanding is that these concepts are simply related in the way that a list of strings, interleaved with expressions, is a reasonable way to represent interpolated strings.

sjakobi commented 2 years ago

  • Because they are not Interleaved.

Also, what does "They" refer to?

Anton-Latukha commented 2 years ago

Also, what does "They" refer to?

To what is said/suggested (strings) in the head post.

It was said that strings are interleaved, but the proper word is interpolated. The wrong term tripped me up & I spent some time unsuccessfully trying to refresh my knowledge of it; that time, plus writing this erratum, complicated the process. Otherwise the feature would already be implemented.

But that is OK. It is understandable to mix up such close terms when one knows that much, & it would be understandable to overlook the difference between them.

wrapper for "difference" topic

There is an etymological & semantic difference between them that I did not want to draw attention to or explain. The terms are close but not interchangeable in CS & math: a discrete function does not get "interleaved" but gets "interpolated", because the terms differ. To "interleave" a discrete function would be to somehow make its values product values, or to lay two discrete functions with the same discretization on top of one another, with a 1/2 shift between them on the abscissa.

"Interleaved" seems to be generally used for having one source material, for twining (paralleling) something together, or for embellishing. The origin is putting blank pages into a book (pages bound at the root, for annotating). Book material can be interleaved with blank note pages, but it is not interpolated with note pages: the semantics of the book are not altered by adding blank pages to it. For example, HNix in `Annotated` interleaves the core data type with an annotation to it, & the module is currently completely abstract as to what the annotation is or how it is used; it can be anything. Embellishment is isomorphic to the initial universal construction. Interleaving is generally considered/implied to be effectless.

Haskell example: `foldr` interleaves the phases of traversing & evaluating, so it short-circuits, but in mathematics, where time does not exist, the result of a `foldr` that interleaves phases is equal to one that would not (or, say, to `foldl` if the operation is both left & right associative). The latter would not halt on infinite structures, but its result is decidable & is equal to that of the interleaved version.

Interpolation is a synonym for altering in order to get a different end result. Example: a string type that alters its own text part based on the instructions of the language in the part that interpolates. Interpolation is not required to be information/structure-preserving.

Recursion generally does not interleave itself; mutual recursion interleaves two functions (*zygomorphism*). Recursion can be said to interpolate on itself, since every computation pass determines the result of the next pass, or, if it actively changed its own code/behavior through interpolation on itself, that would be a complex recursion case (*histomorphism*).

The term "interpolated string" is a strong one in CS; it is hard to argue about it, thankfully it is essentially a well-put & settled term. In short: interleaving is to augment, embellish, or run two processes in parallel; interpolation is essentially metaprogramming. String interpolation in languages is a tiny DSL that alters the string to arrive at the end result through metaprogramming. Adding interleaving (giving the ability to regularly reflect) to metaprogramming on the code would be giving dynamic programming to the code. A minimal example would be the Typeable type class: GHC would interleave compiled code with a runtime typecheck promise & type inference dynamically at run time, which, through picking an instance, would allow dynamically altering the behavior of the code at runtime.

At least, that is how I understood the difference between the terms, looking at their etymology & at the topics in which they are used.

Anton-Latukha commented 2 years ago

I agree with the headpost suggestion

Instead of function application being NBinary NApp f arg, it should be NApp f arg

I was evaluating this for a while. Now, after looking at the code, I definitely agree.

Currently every function application goes: NExprF -> NBinaryOp -> NApp, where NApp is the very last constructor of NBinaryOp. So the main operation in FP, function application, is the last one to be arrived at. It definitely looks like it would be efficient to make it first in NBinaryOp, and probably to move it into NExprF itself.
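What a dedicated application constructor buys can be illustrated with a reduced expression type. This is a toy sketch, not hnix's `NExprF`: the constructors `Var`, `Lit`, `Binary`, `App`, the `BinaryOp` type and the `unfoldApp` helper are hypothetical names chosen for the example.

```haskell
-- Binary operators no longer include function application.
data BinaryOp = Plus | Concat
  deriving (Eq, Show)

data Expr
  = Var String
  | Lit Int
  | Binary BinaryOp Expr Expr
  | App Expr Expr              -- dedicated constructor: f arg
  deriving (Eq, Show)

-- Split a curried application into its head and argument list.
-- With a dedicated constructor this is a single direct pattern match,
-- rather than matching a binary node and then inspecting its operator.
unfoldApp :: Expr -> (Expr, [Expr])
unfoldApp = go []
  where
    go args (App f x) = go (x : args) f
    go args headExpr  = (headExpr, args)

-- f a b
example :: Expr
example = App (App (Var "f") (Var "a")) (Var "b")
```

Here `unfoldApp example` yields the head `Var "f"` together with the arguments `a` and `b` in source order.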

jwiegley commented 2 years ago

I agree with creating a new constructor in NExprF that represents function application.

Anton-Latukha commented 2 years ago

I agree with creating a new constructor in NExprF that represents function application.

Done.