haskell-nix / hnix

A Haskell re-implementation of the Nix expression language
https://hackage.haskell.org/package/hnix
BSD 3-Clause "New" or "Revised" License
752 stars 115 forks source link

Split parser & types out into different packages #328

Closed Profpatsch closed 3 years ago

Profpatsch commented 6 years ago

With the implementation of an evaluator and various utilities picking up steam, the dependency closure has increased substantially, which means the possibility of the package building e.g. on nixpkgs master has decreased considerably.

Growth of direct dependencies:

0.2.0: ansi-wl-pprint, base (>=4.6 && <5), containers, data-fix, hnix, parsec, parsers (>=0.10), text, transformers, trifecta, unordered-containers

0.3.2: ansi-wl-pprint, base (>=4.6 && <5), containers, data-fix, deepseq, hnix, parsec, parsers (>=0.10), semigroups (==0.18.*), text, transformers, trifecta, unordered-containers

0.4.0: ansi-wl-pprint, base (>=4.9 && <5), containers, data-fix, deepseq, deriving-compat (==0.3.), hnix, parsec, parsers (>=0.10), regex-tdfa, regex-tdfa-text, semigroups (==0.18.), text, transformers, trifecta, unordered-containers

0.5.0: aeson, ansi-wl-pprint, array (>=0.4 && <0.6), base (>=4.9 && <5), base16-bytestring, binary, bytestring, containers, cryptohash-md5, cryptohash-sha1, cryptohash-sha256, cryptohash-sha512, data-fix, deepseq (>=1.4.2 && <1.5), deriving-compat (>=0.3 && <0.5), directory, exceptions, filepath, hashable (>=1.2.4 && <1.3), hashing, haskeline, hnix, http-client, http-client-tls, http-types, lens-family, lens-family-core, lens-family-th, logict, megaparsec (>=6.0 && <6.6), monadlist, mtl, optparse-applicative, pretty-show, process, regex-tdfa, regex-tdfa-text, repline, scientific, semigroups (==0.18.*), serialise, split, syb, template-haskell, text, these, time, transformers, unix, unordered-containers (>=0.2.9 && <0.3), vector, xml

I suggest splitting the parser and maybe the types out into their own packages, while keeping all three here in one git repository. This way tools that depend only on the parser and don’t use the evaluator still work if an exclusive evaluator dependency breaks.

jwiegley commented 6 years ago

We will very likely perform such a split in the near future, after nixpkgs evaluates. If we do it too early, before there will be any real "customers", then we'd just be slowing down development for no real gain.

jwiegley commented 6 years ago

Marking this as blocked until we've completed initial development (i.e., we have a full, functional replacement for nix-instantiate). Then development can slow and we can look at architectural restructuring.

Ericson2314 commented 5 years ago

CC @taktoa who also wrote a Nix parser using megaparsec. If we can fold that and @peti's stuff used by cabal2nix into one surface syntax library, I think would be a good demonstration of code reuse.

Ericson2314 commented 5 years ago

Nevermind, I was confusing https://github.com/awakesecurity/language-ninja used in https://github.com/awakesecurity/ninja2nix .

Ericson2314 commented 5 years ago

https://github.com/peti/language-nix is @peti's surface syntax stuff, for the record.

jwiegley commented 5 years ago

We have a complete parser now, with a round-tripping pretty printer. Why would we need to fold in another parser?

Ericson2314 commented 5 years ago

There isn't. I'm thinking more language-nix deprecated, becomes a shim, and/or has a major version bump and becomes the name under which hnix's surface syntax stuff is published.

(I suppose I am contuously bugged but the amount of duplication on hackage and would like to reverse the trend anywhere there is no design beef. And cabal2nix would be a fine initial "client" of hnix's work given it's wide use and status.)

jmackie commented 5 years ago

Not sure what the status of this is, but I would really appreciate the expression types and parser living in a separate (lightweight) package. There have been a few occasions when I've needed only these things, but couldn't justify pulling in the entirety of hnix in order to get them. :+1:

jwiegley commented 5 years ago

At this point, I think things are mature enough to consider this split.

@jmackie When you say the expression types, do you mean just the types and not the evaluator or the implementation of builtins?

Ericson2314 commented 4 years ago

I would split things up, and also combine repos with hnix-store. Things are more mature, and there are more ways to fake loading multiple packages into ghci than before. (Beside stack's tricks, Obelisk, ghcide also do it, there's also progress on proper support in GHC.)

sorki commented 4 years ago

I'm now using https://github.com/sorki/hnix-overlay for integration work between hnix-store and hnix - it's pretty convenient so far and I'm not sure if it's actually worth merging these two as there are two more repositories needed - nix-derivation and nix-narinfo.

Splitting hnix would be nice thought, similar to hnix-store multiproject repository.

I might take a shot at this after the integration as I'm slowly becoming familiar with hnix codebase.

Ericson2314 commented 4 years ago

I would just merge in those other repos too? To me the critical thing is I can easily imagine making changes that effect many or all the libraries; it's much easier to synchronously commit to all of them with a monorepo.

We can re-split later if things become super-stabilized.

Anton-Latukha commented 4 years ago

Yes, the current evolve process comparison to super-stabilized is the key. Indeed it is easier to evolve, migrate & coordinate closely intertwined projects under one roof and then split them back once the APIs indeed stabilize into proper final form.

Ericson2314 commented 4 years ago

The GSOC I'm mentoring this summer is putting us on the path to true multi-package ghci/ghcide, and going great (props to @fendor!) so I am quite optimistic about splitting things into small packages without harming developer efficiency too.

jwiegley commented 4 years ago

I'm happy with breaking up hnix into multiple repositories at this point.

sjakobi commented 4 years ago

A split could also be useful for dhall-nix, which only uses the modules Nix.Atoms, Nix.Expr and Nix.Pretty (for prettyNix). So if Nix.Pretty could end up in the core package, that would be nice. I see that Nix.Pretty currently depends on Nix.Normal for dethunk – maybe that could be untangled?!

For now, here's a module graph, produced with

$ find src/ -name '*.hs' | xargs graphmod -q -r Repl -r Main | dot -Tpng -Gdpi=600 > hnix.png

hnix

It could probably be tweaked further, but I'm a graphmod and dot noob. :/


EDIT: BTW, here's somewhat similar discussion about the dhall package: https://github.com/dhall-lang/dhall-haskell/issues/1102

sjakobi commented 4 years ago

I'm happy with breaking up hnix into multiple repositories at this point.

Just a question for clarification: We're discussing splitting hnix into multiple packages that would still live in the same repo, right?

Ericson2314 commented 4 years ago

Yes, and furthermore I hope the hnix-store packages in the same repos.

Anton-Latukha commented 4 years ago

The solution to the closure size - is to do clean-up and refactoring.

For example, the CI of our project should check for unused dependencies and imports - that is all, really.

The closure size argument is contrary to the Haskell language, to the Haskell ecosystem cornerstones of modularity, and to the UNIX philosophy.


Main argument oh the header post:

" the dependency closure has increased substantially, which means the possibility of the package building e.g. on nixpkgs master has decreased considerably."

Is not true currently.

HNix in Nixpkgs, it builds fine.


Nixpkgs needs to have better package options management - that would decrease all closures.


I would be happy to have consolidated effort and merge-in the HNix-store for maintenance and refactoring of both projects at the same time. We need refactoring before splitting.

After John wrote the package there were no real refactoring of it.

Anton-Latukha commented 4 years ago

https://github.com/haskell-nix/hnix/issues/328#issuecomment-646250251

Yes. I am for refactoring/modularization of source files first.

Source files sometimes are >1000 lines long, modules can be splitted-up into smaller modules hierarchy, and reexports if we wish so.

I prefer to make module hierarchy more granular and have refactoring of the basic data types before any splitting into separate packages, it is easier.

sjakobi commented 4 years ago

Yes. I am for refactoring/modularization of source files first.

Source files sometimes are >1000 lines long, modules can be splitted-up into smaller modules hierarchy, and reexports if we wish so.

I prefer to make module hierarchy more granular and have refactoring of the basic data types before any splitting into separate packages, it is easier.

What's the motivation for splitting the modules into even more modules? I think that could also make it harder to navigate the project.

My understanding of the package-split idea is that it would be easier for users to use the parts of hnix they're interested in, without pulling in dependencies for other parts. That seems like a nice value proposition to me.

Ericson2314 commented 4 years ago

@Anton-Latukha I don't think closure size is bad per se.

The real thing is people should only pay for what they use. You can thinking of a packages of horn clauses: dep-interface-0 -> dep-interface-1 .... -> dep-interface-n -> new-interface. they are composed into the whole dependency graph. A good package is one where an item in new-interface "typically" depends on as much of dep-interface-0...dep-interface-n---i.e. i.e. whatever sub-interface of the package you actually use, you don't end up paying for dependencies you don't need. If you think of dependencies as capabilities or privlages, this relates also relates to the "principle of least privilege".

This is a nice design principle because it can be applied locally and concurrently to individual packages with the result being a a nice overall-ecosystem---following it constitutes distributed algorithm for ecosystem curation.

sjakobi commented 4 years ago

FWIW, some module splitting might be necessary to help untangle the module graph a bit. It's just not necessarily a good idea per se, IMHO.

Anton-Latukha commented 4 years ago

From my point of view the refactoring needed, since:

  1. I very agree with majority of #377
  2. Also it is too obvious NAtom is not fully formed.
  3. We did not look into the current Nix type system to draw experience for us from it.
  4. We not updated the Nix submodule for a long time, and Nix internals changed quite drastically, we can not afford to rely on old Nix verison.
  5. And we should break free from Nix submodule in the first place.

Guys, like, basic data model should be cleaned-up, refactored and consolidated first, we should not depend on the Nix. There refactoring and some code needed. When data model would form - yes, lets split project into submodules and then into packages.

Lets do the documentation and structure of it through the Haddock, then it would be way easier to split in into more modules using literate structural editing, we would clean-up imports in that process, the project would become documented and more understandable to everyone in that one movement.

Then when data types stabilize - module structure can be split into as many packages as needed, even every module can be a package if people want so.

sjakobi commented 4 years ago

@Anton-Latukha Apparently neither #377 nor this issue have seen much progress in recent months. I also don't see why one or the other ought to wait for the other one.

So if anyone wants to start implementing this one, I don't see a reason to wait.

Anton-Latukha commented 4 years ago

reduced

Anton-Latukha commented 4 years ago

reduced

Anton-Latukha commented 4 years ago

Ok.

TL;RD

  1. REPL (executable): 1.1. is not working, 1.2. needs MVP features.
  2. From after the creation - there was no refactoring of data type system. And for recursive schemes programming, it is logical that it needs post bottleneck refactoring to settle things down to leverage the paradigm fully. And it was already proven that refactoring is possible and needed. Refactoring of datatypes means interface breakages.
  3. There is a lot to clean-up and maintain in the project.
  4. For me for the project community growth is important, so it naturality leverages free evolution, that would exponentially increase quality and impact in the long run. For community growth - so simple onboarding, especially since the project already.

I am all hands up for splitting, but it is not logical at the moment, until types and executable would be ready to be independent packages.

Of course if John would move the agenda - I would not stand in the way. I just see it as impractical at the current moment, strategically.

Ericson2314 commented 4 years ago

I am busy with C++ Nix at the moment due to https://discourse.nixos.org/t/obsidian-systems-is-excited-to-bring-ipfs-support-to-nix/7375, so I won't be pushing this at the moment before it is convenient to you. (I do hope that will prepare me to return to hnix and contribute better than before after!)

jwiegley commented 4 years ago

@Anton-Latukha I have no agenda at the moment, so feel free to postpone this.

Anton-Latukha commented 4 years ago

Well, REPL could handle only one line expression.

With last work being done,

AFAIK Sorki made REPL accept multi-line expression. We would make a release soon.

When I test HNix REPL and the basics would work for people - we indeed can form an executable separately, my main argument from above is that executable and REPL should be MVP for basic needs before we ship it as a package, then overhead of additional package and interface can be reasonable.

But this month Peti is on vacation and he did not stated who to address, so because our default.nix ≠ HNix Nixpkgs - I prefer to keep package/Nixpkgs moves to a minimum this month or we can get into a situation when HNix broken in Nixpkgs until he returns, which would be a shameful state for me at least.

sorki commented 4 years ago

Regarding hnix-store merge neither me nor @shlevy thinks it is necessary and advantageous - we would prefer to keep the current state for now.

Anton-Latukha commented 3 years ago

I'd reformulated the question.

Since splitting the Types from the Parser seems on the face a good idea, but in reality - splitting out the data typeslways craves to drag/duplicate a bunch of logic code to take with it, and trying to splitting the types from logic all the time creates cyclic dependencies in Haskell.

And the type system and parser are very interdependent.

I'd reformulated the question as: "Split the executable from the library". People who want the library are not interested in executable, executable is pretty small and has haskeline and repline specific deps. BTW:

head post

... This way tools that depend only on the parser and don’t use the evaluator still work if an exclusive evaluator dependency breaks.

So seems that Profpatsch also meant that.


I worked/working on the CLI, and after derivationStrict roundtrip quest resolves - would settle on working on the CLI.

Looking at what HNix-Store-{Core,Remote}. Reality showed that Remote already struggled/struggles much with the HNix-Store infrastructure and had problems working with dev Store, making working releases.

What HNix-Store-{Core,Remote} has, and then adding HNix-{Core,Exec} infrastructure/coupling to it - the system is interdependent in development and in released code, which means that the projects need to have and distinguish the development line and release line of work&tests, CI for patches should run the integration test of the dev branches, and also test patches using published dep releases. And people tend to not like the partially green/red CI responses. If there was a solid way of railing the prerelease procedure and a pure way to separate the dev testing from the release testing.

But overall, splitting the executable while it has pretty minimal code, still needs a lot of work, and has minimal dep difference (only haskeline & repline) - that would not solve the dependencies very much, but would complicate the internal infrastructure maintenance and dev, currently, for me.

Anton-Latukha commented 3 years ago

All of the deps of the HNix seem logical to use.

Anton-Latukha commented 3 years ago

Well, looking at all logic in HNix - a lot of it really goes to the execution.

And from the map of imports, it looks like Expr is abstracted enough to be formed into a separate package after all.

Anton-Latukha commented 3 years ago

Used useful stuff, like module map - to the https://github.com/haskell-nix/hnix/wiki/Design-of-the-HNix-code-base,

And since splitting is very much an irreversible process, we must do all that is need to be done to prepare the spliting:

Since this thread become epic and went all over the place.

Maybe splitting would happen this year, realistically - first the private/public API as it is transparent, then module hierarchy change, it is already a big migration both here and downstream, which I would try to soften and automate. To simplify the discussion, reopening the thread anew.