Open quasicomputational opened 6 years ago
I had a further look at this and I concluded that we probably need an intermediate type; something like
data IfThen a b = IfThen Condition a b
data ConditionList a = ConditionList [a] (IfThen (ConditionList a) (ConditionList a)) [a]
data LibraryIntermediate =
LibraryIntermediate
{ otherModules :: ConditionList Expr
...
This has the big advantage that we won't need to diff lists and extract common prefices/suffices from 2^n possibilities; instead, we can build this intermediate representation directly from the Cabal file's condition tree. This also means we won't need to generate all 2^n possibilities in common cases, which is possibly significant for larger Cabal files.
The problem with this is that we'll need to re-implement Cabal's field-combining logic, which is its own brand of weird (see haskell/cabal#3313), and, for things that aren't lists or records, we'll wind up back with 2^n combinations to consider. On the gripping hand, I expect that to be very exceptional, and I'm not sure it's possible to do better.
I think this one can be done purely in Dhall Expr
s. Basically if x then { record1 } else { record2 }
can be changed to pushing the conditional down into the fields of both record1 and record2. Dhall knows how to normalise if x then a else a
into just a
, so unnecessary conditionals will vanish simply through normalisation.
A big motivation is getting rid of the 2^n truth tabulation of conditionals, even as an internal step. I've had a quick look for the most conditionals in a component and the 'winner' was Agda, with 15 in its library
section, which is actually manageable providing that that's the worst out there.
57 will help a lot with cutting down cabal-to-dhall's verbosity, but I've realised that it probably won't be enough for larger projects.
Consider lens as an example of a moderately large project that's likely to be typical: the library section has nine conditionals, about fifty exposed modules, and about twenty dependencies; each conditional only affects one or two fields. If we naively duplicate the library section for each conditional, that's a lot of bloat: it'll be about 2^9 * 70 ~ 35k lines!
Better would be to factor out the common parts between branches and then push the conditionals into the fields themselves. That is, rather than:
we'd generate code like
I think this might need a heuristic for when it's not helpful (little in common between the branches), but maybe that'll be so rare that it wouldn't be worth the bother.