haskell / cabal

Official upstream development repository for Cabal and cabal-install
https://haskell.org/cabal

[Initiative] Improve Cabal documentation structure to become more beginner-friendly #9214

Open malteneuss opened 1 year ago

malteneuss commented 1 year ago

Additional context Haskell's development tooling has matured a lot in recent years. To me, one of the important areas for improvement is documentation.

What is wrong with the docs? Having recently switched from Stack back to Cabal, I struggled to find examples and explanations for parts of a .cabal file, for typical use cases, and for the terminology. I'm convinced that this has to do with the overall documentation structure. I propose to introduce a clearer division between tutorials, guides, reference and explanations, as described in https://documentation.divio.com/, and to follow a structure similar to the documentation for Rust's package manager Cargo: https://doc.rust-lang.org/cargo/index.html.

The issues I see

If you also see the need to improve the documentation, feedback is welcome, including on what to do first and how. I started with a small improvement in #9212.

lsmor commented 1 year ago

Recently, as part of Summer of Haskell, I wrote a little cabal overview which I think (partially) covers many points of this issue. Consider the information below as part of an informal conversation, so many things aren't official wrt cabal documentation. We may use it as a reference and build from here.

A brief overview on cabal

A cabal project has many items ordered in a hierarchy.

Top level: Project

Level Two: Package

Level Three: Component

Level Four: Module

Summary

This is a visual of the hierarchy

Project
|- package-one
   |- library
      |- ModuleOne
      |- ModuleTwo
   |- internal-library
      |- ...
   |- test
      |- ...
   |- ...
|- package-two
   |- executable-one
      |- Main
      |- ...
   |- executable-two
      |- ...
   |- benchmark
      |-...

Example

This is an example of a complex project. Imagine a file manager which can work remotely. In this project you'd have

The file structure could be

hs-filesystem
 |- app
    |- FakeFS.hs
 |- src
    |- FileTree.hs
    |- ...
 |- test
    |- FileTreeSpec.hs
    |- Main.hs
 |- hs-filesystem.cabal

hs-filesystem-server
 |- bench
    |- Main.hs
 |- app
    |- server.hs
 |- server.cabal

clients
 |- src
    |- Common.hs
 |- cli
    |- Main.hs
    |- ...
 |- gui
    |- Main.hs
    |- ...
 |- tui
    |- Main.hs
    |- ...
 |- filesystem-client.cabal

cabal.project

Below is what the files look like

The cabal.project file

packages: ./hs-filesystem
          ./hs-filesystem-server
          ./clients
.
.
.

The hs-filesystem.cabal

-- This is the name of the public library. It is a top-level value because a .cabal file has at most one public library.
name: hs-filesystem

-- You must indicate the folder where the code is.
-- List all dependencies and exposed modules.
library
  hs-source-dirs: src/
  build-depends: ...
  exposed-modules: ...

-- Notice, you depend on the component hs-filesystem!!
-- The source files are in the folder test.
-- Because it is a runnable component, you need to specify the entrypoint.
-- Runnable components may have other modules aside from the entrypoint.
-- Modules are listed by module name, not by file name.
test-suite hs-fs-tests
  type: exitcode-stdio-1.0
  build-depends: hs-filesystem
  hs-source-dirs: test/
  main-is: Main.hs
  other-modules:
    FileTreeSpec

-- Notice, you depend on the component hs-filesystem, but you do not depend on the tests.
-- Of course, you don't need to build the tests to build the executable.
-- Also, the entrypoint file is named FakeFS, but inside the file the header must be `module Main where`.
-- When building with cabal, it will create a binary executable fake-fs that you can run from the console.
executable fake-fs
  build-depends: hs-filesystem
  hs-source-dirs: app/
  main-is: FakeFS.hs

The filesystem-client.cabal

-- Notice you are creating a library component which depends on a different package's library (hs-filesystem), all within
-- the same project. This library is internal, hence it has a name tag.
-- Compare it with the public library in hs-filesystem.cabal, in which the name is top level.
library filesystem-client
  hs-source-dirs: src/
  build-depends: hs-filesystem
  exposed-modules: Common

-- The executable component depends on both:
--    hs-filesystem (public library defined in another .cabal file)
--    filesystem-client (internal library defined in this very .cabal file)
-- Notice that transitive dependencies do not apply. If you want to use a function from hs-filesystem,
-- you must make it an explicit dependency.
executable filesystem-cli
  build-depends: hs-filesystem
               , filesystem-client
  hs-source-dirs: cli/
  main-is: Main.hs

executable filesystem-tui
  build-depends: hs-filesystem
               , filesystem-client
  hs-source-dirs: tui/
  main-is: Main.hs

executable filesystem-gui
  build-depends: hs-filesystem
               , filesystem-client
  hs-source-dirs: gui/
  main-is: Main.hs

When building filesystem-client.cabal, it will create three binary executables.

malteneuss commented 1 year ago

@lsmor Nice. This looks like a good structure for a chapter about disambiguating typical packaging terms.

andreabedini commented 1 year ago

This is great! Thank you for taking the lead <3. A few hot comments while reading the posts above. Some of them might not be relevant at the pedagogical level, but it is better to be on the same page about some technical details. Feel free to AMA if something is not clear and/or correct me if I have made mistakes.

Define an official name for multi-package setup

This is definitely called a cabal project.

beginners and average Haskellers probably don't know or care about NixOS

100% this. As much as I care deeply about making cabal and nix get along, cabal documentation should be about cabal, not nix. I understand nix integration is also being deprecated, as it has not been working with v2 commands (introduced quite a while ago now).

Move Setup.hs to separate (legacy?) chapter.

Setup.hs should be relegated to a separate section for niche features (i.e. custom-setup) and not even mentioned otherwise. It's not even necessary to specify build-type: Simple these days. It's the default.
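To make that concrete, here is a minimal sketch of a package description that has no build-type field and no Setup.hs next to it. The package name hello, the version and the bounds on base are made up for illustration. With cabal-version 2.2 or later and no custom-setup stanza, the build type should default to Simple.

```cabal
cabal-version: 3.0
-- hypothetical package, used only for illustration
name:          hello
version:       0.1.0.0

-- no build-type field: it defaults to Simple
-- no Setup.hs in the package directory either

executable hello
  main-is:          Main.hs
  build-depends:    base >=4 && <5
  default-language: Haskell2010
```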

Any flag you pass to cabal, can be written in this file

Not all of them but yeah. The reference specifies, for each cabal.project option, the corresponding cli flag (if there is one).
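As a small sketch of that correspondence, a cabal.project along these lines would persist a few common flags. The values are purely illustrative, and the reference remains the authority on which options have a CLI counterpart.

```cabal
-- cabal.project (illustrative values)
packages: .

-- counterpart of --enable-tests
tests: True

-- counterpart of -O2
optimization: 2

-- counterpart of -j4 / --jobs=4
jobs: 4
```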

Also, in this file you specify all packages you want to build. A project may have one or more packages.

Yes. Here are some excessively-detailed notes:

Dependencies

  1. In first approximation, all dependencies have to be built as well.
  2. Some dependencies will be cached in the cabal store, meaning: we built the exact same package once, so we don't need to build it again. "Exact same" here is determined by hashing all dependencies (through their own hashes), flags, compiler version, and some build parameters. See cabal-hash.txt in the cabal store, and Distribution.Client.PackageHash. This requires having already decided all the dependencies, and indeed happens after generating a build plan (see below).
  3. Some dependencies can be chosen among pre-installed packages. tl;dr: GHC and cabal communicate through package databases (packagedb). Two are "well-known" and others are custom. The global packagedb comes with GHC pre-populated with a set of "boot" packages. The user packagedb, if ever used, lives somewhere in your home. Other packagedbs can be listed in cabal.project. If you use nix-style builds (using v2 commands, which has been the default for a while) you don't need to think about this but cabal-install's solver does try to reuse pre-installed packages when it can. This is very different from the above mechanism, since it is part of generating an install plan and (currently) only the package name and version are taken into consideration.

Local packages

There's a distinction between local packages and non-local packages. See findProjectPackages. Local packages are all the packages directly mentioned in cabal.project: packages:, optional-packages:, extra-packages, source-repository-packages. I believe this is decently documented in the reference.

Targets

When you do cabal build xyz, cabal-install jumps through a bunch of hoops to figure out what exactly you mean by xyz. Target forms can point to any package (or any component of a package) in the build plan, not only local packages. E.g. you can add options in cabal.project for a package abc in your dependency tree and rebuild just that package with cabal build abc. Also, you can list multiple targets.
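A sketch of that last scenario, where abc stands for some hypothetical package in the dependency tree and the option chosen is only an example:

```cabal
-- cabal.project (sketch)
packages: .

-- options that apply only to the non-local package abc
package abc
  optimization: False

-- Then, from the shell, rebuild just that package:
--   cabal build abc
```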

This file is not mandatory. But it is if you have more than one package

Yes. It works like this: if cabal.project is not present, cabal uses the default cabal.project, which is `packages: .`. This means that in the single-package scenario, you need to add project configuration options to `cabal.project.local` (it is totally fine to have a `cabal.project.local` without a `cabal.project`). This is indeed what `cabal configure` does: it turns CLI flags into cabal.project options. If you were to create a `cabal.project` with some options, you'd need to remember to put `packages: .` in it as well.
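The two situations could be sketched like this, with tests: True standing in for any project option:

```cabal
-- Case 1: only cabal.project.local exists (no cabal.project).
-- The implicit default project `packages: .` still applies.
--
--   cabal.project.local:
--     tests: True

-- Case 2: a hand-written cabal.project.
-- The implicit default no longer applies, so the package in the
-- current directory has to be listed explicitly.
packages: .
tests: True
```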

FWIW some people wish they could pass packages: from the cli, which would make cabal.project not mandatory also with multiple packages. I am not opposed to the idea TBH.

A package is the minimum item buildable with cabal

You can actually build only selected components (see targets above). Cabal will take into account the component dependencies too (e.g. exe depends on some libs).

Nevertheless a "cabal package" is the unit of distribution.

A cabal package is described by a package description file, commonly known as a "cabal file".

A package may have one or multiple components but only one public library (this isn't true anymore, I think)

Correct, as of Cabal 3.0 (2019!) you can specify visibility: public in a sub-library. The default is always private for sublibs and implicitly public for the main library. Note that the solver doesn't understand public sublibraries very well and will never choose a pre-installed one (see no. 3 in "Dependencies" above).
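A minimal sketch of a public sub-library, reusing the hs-filesystem package name from the example above. The sub-library name testing-utils, the directory and the module names are hypothetical.

```cabal
cabal-version: 3.0
name:          hs-filesystem
version:       0.1.0.0

-- main library: unnamed and implicitly public
library
  hs-source-dirs:   src
  exposed-modules:  FileTree
  build-depends:    base
  default-language: Haskell2010

-- sub-library: private unless visibility says otherwise
library testing-utils
  visibility:       public
  hs-source-dirs:   testing
  exposed-modules:  FileTree.Testing
  build-depends:    base, hs-filesystem
  default-language: Haskell2010

-- A component in another package would then depend on it with:
--   build-depends: hs-filesystem:testing-utils
```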

Also, it's worth noticing that the solver operates at the package level, in the sense that the version bounds on all (~ sort of, see below re: tests and benchmarks) component dependencies are grouped together, and there cannot be cycles between components in separate packages (e.g. pkg-b:lib depends on pkg-a:lib, pkg-a:exe depends on pkg-b:lib). You can build this manually with Cabal, but cabal-install's solver will reject it as if it were "pkg-b depends on pkg-a, pkg-a depends on pkg-b".
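Spelled out as two package description fragments (the executable name pkg-a-exe is made up), the rejected situation looks roughly like this. There is no cycle at the component level, but there is one once the dependencies are grouped per package.

```cabal
-- pkg-a.cabal (fragment)
library
  build-depends: base

-- pkg-a:exe depends on pkg-b:lib
executable pkg-a-exe
  main-is:       Main.hs
  build-depends: base, pkg-b

-- pkg-b.cabal (fragment)
-- pkg-b:lib depends on pkg-a:lib
library
  build-depends: base, pkg-a
```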

There are four kinds of components divided in two groups

There's a fifth, foreign-libraries. You can build a haskell library to be linked into non-haskell code.
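As a rough sketch, a foreign-library stanza might look like the following. The component name, module name and source directory are made up for illustration. type: native-shared asks for a shared library that non-Haskell code can link against.

```cabal
-- fragment of a .cabal file; all names are hypothetical
foreign-library myforeignlib
  type:             native-shared
  other-modules:    MyForeignLib.SomeModule
  build-depends:    base
  hs-source-dirs:   src
  default-language: Haskell2010
```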

Runnables / non-runnables

I never heard this terminology. Maybe executables and libraries could be a simpler option? tests and benchmarks are executables just like exes. You can cabal run them like an executable (in addition to cabal test/cabal bench).

Also, the user guide makes a bit of a mess with the terminology "internal/private/sub". There's even a reference to (quotes) "private internal sub-library" :joy: Someone is proposing the pov that they are all libraries all the same, just one has the same name as the package name and you don't need to write it. TBH I don't have strong opinions here, as long as the terminology is consistent.

internal libraries are used to share code between components in the same package but not with other packages

Unless they are made public.

Other notes (for what they are worth):

This is an example of a complex project.

I suggest we frame this as "something something ... cabal for projects". There are separate considerations to make if you want to publish a package. The current user guide tends to lean toward package development (roughly, writing libraries to publish so that other people can build on them, rather than writing projects to build so that other people can run them).

In this project you'd have

Other things that might be worth adding (perhaps one at a time, with some narrative?)

The elephant in the room of course is backpack.

Ok, I accidentally a book. Happy to chat if you like.

malteneuss commented 1 year ago

@andreabedini Thanks for your hints. A few things became a lot clearer to me, e.g. why the term "cabal project" makes sense and why there is a "Package description" chapter (I didn't see the 1-1 correspondence to a .cabal file before xD). You mention a lot of important topics for specialized guides. For now, I would like to focus on the top-level structure, the intro and a few sections that a lot of users will read or look up. As soon as #9212 is merged, I can start with the top-level structure and with reorganizing the content that's already there.

BinderDavid commented 1 year ago

In the past week I have also been looking at ways to improve the cabal user guide. Here are some of my assorted thoughts:

I propose to introduce a clear(er) division between tutorials, guides and reference and explanations as described in https://documentation.divio.com/

I think this is a problem of the current state of the user guide, which can and should be fixed early. I think a clear first step would be to use the Sphinx feature of "parts", which allows splitting the table of contents into several separate parts. (I.e. this is just typography.) A simple suggestion would be to use the two parts "User Guides" and "Reference", and to triage the existing documentation into those two parts. (Edit: I see you suggested exactly that :+1: )

Move Setup.hs to separate (legacy?) chapter.

I think the Setup.hs chapter is only one example of information that is displayed too prominently. There are currently 12 toplevel sections, and since these are always visible in the table of contents, they should correspond to the 12 most important anchors for navigating the user guide. In my opinion this is currently not the case. For example, the last 5 toplevel sections are for niche use cases only, and the user guide shouldn't spend the most important toplevel sections on them.

By contrast, some subsections contain the most relevant information and are difficult to find because they are hidden several subsections deep.

liamzee commented 1 year ago

@malteneuss, if I can be your runner boy, I'd be happy. I've taken on the task of trying to work on the Wiki documentation myself, as well as aiming (and I hope I can be successful here) to help document the existing codebase.

The "Complaints and Grievances community" brought up issues with tooling, and your initiative seems the lowest hanging fruit.

malteneuss commented 1 year ago

@liamzee Great to have your support and thanks for improving the Wiki. I'll come back to you when the top-level structure is settled.

BinderDavid commented 1 year ago

I am currently attempting a rewrite of parts of the introduction which introduce general packaging concepts. (My attempts are on this branch here: https://github.com/BinderDavid/cabal/tree/rewrite-user-guide-introduction I haven't opened a PR yet and am still trying to figure out how to structure things).

Concretely, I am looking at ways to improve section 2 (Introduction) and section 3.2 (Package concepts and development -> Package concepts). I think the main issue that can be improved is that they introduce the cabal packaging system by comparing it to distribution package managers (like rpm/debian) and to GNU-style building with autoconf/configure/make. This can be explained by the fact that Haskell, with cabal, was one of the first programming languages to introduce this style of packaging and dependency handling. But I think for a new programmer coming to Haskell today it would be more useful to compare it to other similar systems, like Rust with cargo+crates.io or JavaScript with npm.

@malteneuss You are currently working on #9212. After that is merged, I think it would be useful to look at section 2.1: Package concepts and Development -> Quickstart. As far as I can see, that section has more or less the same content as the Getting Started section: how to initialize a new app, how to add a dependency, how to run the program. We could compare what information is contained in section 2.1 that is not in the Getting Started section, and move this information to the Getting Started section (I don't think it is much). Afterwards, section 2.1 could be removed as redundant.

ulysses4ever commented 1 year ago

Hope some prior work could be used as a source of inspiration: https://github.com/haskell/cabal-userguide

andreabedini commented 1 year ago

:exploding_head: why is that in a separate repo!?

ulysses4ever commented 1 year ago

@andreabedini it's abandoned now so I don't think it matters much, but the reason, I believe, is that the current manual was deemed unsalvageable by the authors of that initiative.

BinderDavid commented 1 year ago

@andreabedini it's abandoned now so I don't think it matters much, but the reason, I believe, is that the current manual was deemed unsalvageable by the authors of that initiative.

I have taken a look, and they put a lot of work into developing a nice global structure for how the documentation of cabal should be organized. But, as far as I can see, only one chapter of this new structure was finished (unless I am missing some work on branches that I haven't checked out).

My impression is that the cabal documentation is not unsalvageable, but what it does need is aggressive editing. It is always simpler to edit or add just a single paragraph or subsection of the docs than touching the overall organization, deleting material and merging and moving sections. Also, editing can be done piecemeal, and it is less likely to run out of steam than a complete rewrite. But I think there was a hesitancy to edit older material, and instead only new information was added. This led to the current state which is a bit lacking in focus and structure.

ulysses4ever commented 1 year ago

If you feel like doing piecemeal, go for it: everyone (myself included) will thank you. But I personally think until the global structure is improved along the lines described in that repo, people will hardly notice your effort. The reason I think that is that it seems to me that it's completely impossible to navigate it without knowing a lot about Cabal already. For experts it kinda works fine already I'd say, but to make it novice-digestible, structural changes are necessary. Just my 2c.