Refactoring for safe entanglement

plaidfinch commented 6 years ago

I think once we clarify some of the code in entangleShape we can merge with master.

I think it might be worth continuing to expose the unsafe versions of entangle and entangleShape, which would make this change non-breaking. What do you think?

plaidfinch commented 6 years ago

To ensure everything's in one place: the context of this discussion is that @janpath commented:

Sure, I actually had an explanation in the comments at some point, but that comment was later rewritten to the above.

Since we are going recursively through the whole shape of a and build it back up into the resulting IO action, without the unsafeInterleaveIO we would force the entire shape of a when we execute that action. This can be easily seen by removing unsafeInterleaveIO from the above and trying entangleShape [1..]. You will find that it diverges. By using unsafeInterleaveIO we can build up the value as it is forced. One important thing to note is that the demand action does not use unsafeInterleaveIO and therefore always executes entirely when it is executed, and thus the result of it does not depend on when its value is forced. But by making sure, we do not match on a unless, it was already forced (the IO (Thunk a) action returns Eval a), we make sure that it does not force a any further than it already has been forced. It caused my quite some pain to get this right.

plaidfinch commented 6 years ago

This technique is clever. I have two goals before re-merging:

Convince myself that this is really referentially transparent (I think I'm mostly convinced).
Ensure that the code is as readable as possible. In key functions like this, I want others to be able to understand what's happening, so I value clarity over conciseness here. Part of what I'm going to be thinking about is how to pull this apart so it's a tad bit more clear what's going on.

All that being said, this is quite neat. I'm grateful you've taken the time to figure this out!

juliapath commented 6 years ago

You can always retrieve the unsafe versions from the safe versions using (second unsafePerformIO . unsafePerformIO) but I do not see a reason to do that for the user. As for breaking changes, the library is still very young, your presentation at ICFP was last week, so I think at this point getting the API right is more important than not making any breaking changes. If you wanted to you could deprecate them with a message on how to retrieve the unsafe versions from the safe ones.

juliapath commented 6 years ago

Some of the code looks a bit weird and unnecessarily confuse like someone was trying really hard to write something point-free, because we have to avoid pattern matching on the result of match for long enough. Previous versions that were to strict were more readable. Though I am sure that can be written in a more explicit, readable and documented way. I could look into it tomorrow.

As for referential transparency I will try to write something convincing tomorrow (It's pretty late in Germany).

juliapath commented 6 years ago

Regarding keeping the unsafe versions of entangle. I think in Haskell we usually try to use the type-system to make our APIs as safe as possible. If at some point a user needs to or wants to throw away some of the guarantees they can always use functions like unsafePerformIO, unsafeCoerce, etc.. I don't think you usually see libraries exporting unsafe versions along with their safe versions if the unsafe versions can simply be obtained by throwing some unsafe* function on the safe versions. If libraries do provide unsafe versions it is, because they are useful and they can not be obtained from the safe versions.

plaidfinch commented 6 years ago

No worries about timing, this is not urgent. The too-strict but readable variants could be made lazy enough using irrefutable pattern matches, I would think? I'll play around with it too.

juliapath commented 6 years ago

I tried irrefutable pattern matches at some point and it didn't work the way I did it. It always forced one constructor when the demand action was executed. Here is an example of what happened:

>>> (x , d ) <- second (fmap prettyDemand) <$> entangleShape [1..]
>>> (x', d') <- second (fmap prettyDemand) <$> entangleShape x
>>> d
"_"
>>> d'
"_"
>>> d
"_ : _"

You will find this in the tests I have written. There probably is a way to do it though. I think the important trick is to call entangle directly on the result of match my earlier versions called entangle on the actual a part of the result of match, but the demand action always ended up too strict in the above way.

plaidfinch commented 6 years ago

In 7d067ed6aad0fd22cd541fb77427c0c7b3b7147b, I added applicative / monadic versions of some Shaped recursion combinators. I think it might be possible to make a monadic unzipWith using these, and (with suitable use of unsafeInterleaveIO) get back a relatively simple definition of the entangleShape function.

plaidfinch commented 6 years ago

In https://github.com/kwf/StrictCheck/pull/4/commits/d511df36da7db4d594ba97d1cea2f1364a71c693, I added the monadic version of unzipWith.

juliapath commented 6 years ago

We could probably make zipWithM work for entangleShape if we used a newtype around IO, that calls unsafeInterleaveIO before (>>=).

juliapath commented 6 years ago

That would look like this:

entangleShape :: Shaped a => a -> IO (a, IO (Demand a))
entangleShape = runLazyIO
  . fmap (bimap (fuse unI) (unfoldM $ getCompose . unwrap))
  . unzipWithM (LazyIO . fmap (bimap I Compose) . entangle . unI)
  . interleave I

newtype LazyIO a = LazyIO { runLazyIO :: IO a }
  deriving Functor
instance Applicative LazyIO where
  pure = LazyIO . pure
  (<*>) = ap
instance Monad LazyIO where
  LazyIO mx >>= ((runLazyIO .) -> f) = LazyIO $ unsafeInterleaveIO mx >>= f

juliapath commented 6 years ago

That Monad instance is actually unlawful as (pure =<<) ≡ unsafeInterleaveIO ≢ id. It's not a big problem, but this would be one way to fix it.

newtype LazyIO a = LazyIO { runLazyIO :: IO a } deriving Functor
-- By using a smart constructor we make sure that any computation inside LazyIO
-- is already lazy. So that unsafeInterleaveIO ≡ id. deriving Functor
lazyIO :: IO a -> LazyIO a
lazyIO = LazyIO . unsafeInterleaveIO
instance Applicative LazyIO where
  pure = LazyIO . pure
  -- Probably more performant implementation of (<*>) than ap.
  LazyIO f <*> LazyIO x = lazyIO $ f <*> x
instance Monad LazyIO where
  -- We actually have to call unsafeInterleaveIO on the resulting action after
  -- the (>>=). We already know that the argument computations are lazy.
  LazyIO mx >>= f = lazyIO $ mx >>= runLazyIO . f

plaidfinch commented 6 years ago

I worked out the types about how to phrase entangleShape in terms of unzipWith, but I've convinced myself that you're right—anything based on foldM can't be used to do this, because you can't splice the unsafeInterleaveIO into the right place in the recursion. For the record, here's what I attempted (this failed the tests):

entangleShape' :: forall a. Shaped a => a -> IO (a, IO (Demand a))
entangleShape' =
  fmap (bimap (fuse unI) sequenceObservation)
  . unzipWithM entangle'
  . interleave I
  where
    entangle' :: I x -> IO (I x, Compose IO Thunk x)
    entangle' =
      fmap (bimap I Compose) . unsafeInterleaveIO . entangle . unI

    sequenceObservation
      :: forall x. Shaped x => Compose IO Thunk % x -> IO (Demand x)
    sequenceObservation =
      fmap Wrap . traverse (translateA sequenceObservation) <=< getCompose . unwrap

plaidfinch commented 6 years ago

Note to self: Before merging into master, I want to make some small wording changes to the documentation.

plaidfinch commented 6 years ago

Note to self: Before merging into master, I want to make some small wording changes to the documentation.

This is now done.

plaidfinch commented 6 years ago

Thank you, @janpath! It's been a pleasure working with you on this.

plaidfinch / StrictCheck

Refactoring for safe entanglement #4