haskell / cabal

Official upstream development repository for Cabal and cabal-install
https://haskell.org/cabal
Other
1.62k stars 697 forks source link

Apply codensity to Progress #3640

Open ezyang opened 8 years ago

ezyang commented 8 years ago

CC @dcoutts, @grayjay, @kosmikus,

The Progress type in Distribution.Client.Dependency.Types is a great idea and more code in Cabal should use it. However, I am a bit worried about the asymptotic complexity with left associated binds.

Progress is an instance of the free monad over:

data ProgressF step fail a = Step step a
                                                | Fail fail

As a reminder, the free monad construction is:

data Free f done = Free (f (Free f a)) | Done done

Inlining ProgressF step fail into f, we have:

Free (ProgressF step fail) done = Step step (Free (ProgressF step fail) done) | Fail fail | Done done

which is the current definition of Progress.

It is known that free monads have trouble with left associativity, see http://comonad.com/reader/2011/free-monads-for-less/ and http://blog.ezyang.com/2012/01/problem-set-the-codensity-transformation/ so instead we want to apply Codensity to avoid this problem.

So, I would like to apply the Codensity transformation to ProgressF. However, there are some questions:

  1. Does Codensity retain the incrementality that the original Progress had?
  2. Should we define ProgressF, Free and Codensity, bring in a dependency to get them, or inline all the data type declarations (just as Progress is the inlined version of Free ProgressF)
grayjay commented 8 years ago

That sounds like a good idea, though I'm not very familiar with Codensity. IIUC, it would preserve the incremental behavior. I'm not sure how it would work with the existing uses of Progress, though. We could split it into two types if necessary. How would you use it in Cabal?

I took a look at how cabal currently uses Progress, and there is actually only one use of bind, in D.C.Install. However, the solver combines Progress logs in another way that could probably be improved. It repeatedly replaces one log's failure value with the next log in the backjump function. backjump is called when the solver converts the whole search tree into a log. It combines the logs from a node's children using foldr, but I think that there is still left associativity between, say, the logs under the first node at level n and the rest of the log produced at level n. Since the search tree can be very deep, that left associativity could impact performance. It would be nice to be able to update the failure in constant time. I'm not sure how to design a log where both success and failure values can be updated efficiently, though. Do you think that we could handle both use cases with the same progress type?

ezyang commented 8 years ago

IIUC, it would preserve the incremental behavior.

Looking carefully at a paper @ekmett pointed me to http://okmij.org/ftp/Haskell/zseq.pdf it seems that Codensity will work poorly only if you need to make some observations, and then continue building the progress log. Just briefly glancing at the Cabal code, it does seem that we are pattern matching on Step so it may be that we are destructing and building.

As for your second question, I honestly have no idea how to implement backjumping better. It feels like the academic literature should have an answer for this, but I haven't grokked it sufficiently.

How would you use it in Cabal?

The proximal case was there was some code in Distribution.Simple.Configure which is effectively pure, but lives in the IO monad so that we can log and error. It doesn't take too long to run, but Progress is arguably a nice and convenient way to implement a monad that logs and errors.

grayjay commented 8 years ago

Thanks for the link; that looks useful.

Just briefly glancing at the Cabal code, it does seem that we are pattern matching on Step so it may be that we are destructing and building.

Yes, Log.hs and Message.hs both pattern match on the messages and modify them, so it would be hard to avoid traversing the whole log there.

I think it makes sense to move Progress to Cabal. If the solver needs something different, we can add another log type to cabal-install.