quasicomputational commented 6 years ago

57 will help a lot with cutting down cabal-to-dhall's verbosity, but I've realised that it probably won't be enough for larger projects.

Consider lens as an example of a moderately large project that's likely to be typical: the library section has nine conditionals, about fifty exposed modules, and about twenty dependencies; each conditional only affects one or two fields. If we naively duplicate the library section for each conditional, that's a lot of bloat: it'll be about 2^9 * 70 ~ 35k lines!

Better would be to factor out the common parts between branches and then push the conditionals into the fields themselves. That is, rather than:

  library = \ (config : Config) ->
    if minGHC config "8.0"
    then defaults.Library
     // { exposed-modules = ...
        , build-depends = ...
        }
    else defaults.Library
    // { exposed-modules = ...
       , build-depends = [ deps.generics-deriving ] # ...
       }

we'd generate code like

  library = \ (config : Config) ->
        defaults.Library
     // { exposed-modules = ...
        , build-depends =
            ( if minGHC config "8.0"
              then []
              else [ deps.generics-deriving ]
            ) # ...
        }

I think this might need a heuristic for when it's not helpful (little in common between the branches), but maybe that'll be so rare that it wouldn't be worth the bother.

quasicomputational commented 6 years ago

I had a further look at this and I concluded that we probably need an intermediate type; something like

data IfThen a b = IfThen Condition a b

data ConditionList a = ConditionList [a] (IfThen (ConditionList a) (ConditionList a)) [a]

data LibraryIntermediate =
  LibraryIntermediate
    { otherModules :: ConditionList Expr
    ...

This has the big advantage that we won't need to diff lists and extract common prefices/suffices from 2^n possibilities; instead, we can build this intermediate representation directly from the Cabal file's condition tree. This also means we won't need to generate all 2^n possibilities in common cases, which is possibly significant for larger Cabal files.

The problem with this is that we'll need to re-implement Cabal's field-combining logic, which is its own brand of weird (see haskell/cabal#3313), and, for things that aren't lists or records, we'll wind up back with 2^n combinations to consider. On the gripping hand, I expect that to be very exceptional, and I'm not sure it's possible to do better.

ocharles commented 6 years ago

I think this one can be done purely in Dhall Exprs. Basically if x then { record1 } else { record2 } can be changed to pushing the conditional down into the fields of both record1 and record2. Dhall knows how to normalise if x then a else a into just a, so unnecessary conditionals will vanish simply through normalisation.

quasicomputational commented 6 years ago

A big motivation is getting rid of the 2^n truth tabulation of conditionals, even as an internal step. I've had a quick look for the most conditionals in a component and the 'winner' was Agda, with 15 in its library section, which is actually manageable providing that that's the worst out there.

dhall-lang / dhall-to-cabal

Factor cabal-to-dhall's conditionals #59

57 will help a lot with cutting down cabal-to-dhall's verbosity, but I've realised that it probably won't be enough for larger projects.