Add Traversing instance for Cell

miguel-negrao commented 2 years ago

Hopefully this solves #56 and #57

With this instance should resampleList and resampleMaybe be removed ?

> fst <$> steps (traverse' sumC :: Cell IO [Int] [Int]) [[1,2], [1],[3,4]] 
[[0,1],[3],[4,7]]
> fst <$> steps (traverse' sumC :: Cell IO (Maybe Int) (Maybe Int)) [Just 1, Nothing, Nothing, Just 2, Just 3, Nothing] 
[Just 0,Nothing,Nothing,Just 1,Just 3,Nothing]

miguel-negrao commented 2 years ago

Before the last commit my version of traverse' was not as strict as resampleList:

> take 4 $ head $ fst $ runIdentity $ steps (traverse' (arr id) :: Cell Identity [Int] [Int]) [[1..],[2],[3]]
[1,2,3,4]
> take 4 $ head $ fst $ runIdentity $ steps (resampleList (arr id) :: Cell Identity [Int] [Int]) [[1..],[2],[3
...

After the last commit both of the expressions above result in bottom.

traverse' for Cell uses strict StateT which seems to be as strict as resampleList. On the other hand traverse' for ArrM is not as strict as resampleList, as traverse is not strict. To correct this a strict version of traverse is used. The code from that is taken from code from this post.

miguel-negrao commented 2 years ago

Refactored to its own module LiveCoding.Cell.Profunctor and added tests.

turion commented 2 years ago

I'm unsure about the strictness. I agree that there should be some strictness, but I think the examples shouldn't be so strict. In fact, I'm more and more convinced that I implemented resampleList incorrectly.

Some code examples:


testSetup x = take 4 $ head $ fst $ runIdentity $ steps (traverse' (arr id) :: Cell Identity [Int] [Int]) x

shouldNotBeBottom = testSetup [[1..],[2],[3]]

shouldBeBottom = testSetup [1 : error "boom", [2], [3]]

I think it should only diverge if there is a a problem in the list, but not if there is a problem in the data. That way we can be sure the list is processed completely, but leave it to the cell itself to decide whether it needs to evaluate all the data inside.

Can you add a few tests that capture the expected behaviour? If you use error or undefined instead of an infinite list, you can use assertThrows or exception handling to make sure it explodes in the right cases.

turion commented 2 years ago

Hopefully this solves #56 and #57

Yes, it's looking that way!

With this instance should resampleList and resampleMaybe be removed ?

No, I think they should be replaced with type-specific aliases resampleMaybe = traverse'. The same with resample, which works over a Vector.

miguel-negrao commented 2 years ago

I'm more and more convinced that I implemented resampleList incorrectly.

I had implemented it this way to match the strictness in resampleList, so that changing to an implementation of resampleList based on traverse' would keep the same behaviour.

I think it should only diverge if there is a a problem in the list, but not if there is a problem in the data. That way we can be sure the list is processed completely, but leave it to the cell itself to decide whether it needs to evaluate all the data inside.

Ok, in that case I will change to using traverse and lazy StateT.

turion commented 2 years ago

Ok, in that case I will change to using traverse and lazy StateT.

Will that evaluate the list structure strictly? I'd expect it doesn't.

turion commented 2 years ago

Some background: I introduced the additional strictness in https://github.com/turion/essence-of-live-coding/pull/19. In fact, we have strictness in the data passing through cells:

  Cell state2 step2 . Cell state1 step1 = Cell { .. }
    where
      cellState = Composition state1 state2
      cellStep (Composition state1 state2) a = do
        (!b, state1') <- step1 state1 a
        (!c, state2') <- step2 state2 b
        return (c, Composition state1' state2')

This surprises me a lot. I think this should be rethought first.

miguel-negrao commented 2 years ago

Will that evaluate the list structure strictly? I'd expect it doesn't.

The traversal of the traversable structure is lazy when using State.Lazy:

Prelude Control.Monad.Trans.State.Lazy> take 2 $ fst $ flip runState 0 $ traverse (\_ -> modify (+2) >> get) [1..]
[2,4]

On the other hand from the code of the composition of Cell above it is probably going to evaluate the whole list before passing it on to the next cell. So the behavior might be different if we just test a single cell or if we test the composition of two cells (i.e. cell vs cell >> (arr id) ). I will create some tests for this, so we can check on the behavior.

But I suspect it won't matter much if we use Strict or Lazy State when the cell is composition of two cells...

miguel-negrao commented 2 years ago

I've refactored the PR:

Renamed Choice to CellChoice so it doesn't clash with profunctors' Choice.
Moved code to Cell.lhs
traverse' is using Control.Monad.Trans.State.Lazy.
Reimplemented resampleList, etc as just traverse'.
Added tests for traverse''s lazyness.

This implementation keeps the list created in resampleList lazy. This is the case even when composing with another Cell. I suppose that is the case because the bang pattern used in cell1 >>> cell2 on a list will evaluate the list into either Nil or _ : _, which doesn't force the entire list, right ?

Check if this addresses all the issues.

turion commented 2 years ago

* Renamed `Choice` to `CellChoice` so it doesn't clash with profunctors' `Choice`.

I'd prefer it if Choice from profunctor were aliased. Does that work as well?

* `traverse'` is using `Control.Monad.Trans.State.Lazy`.
* Added tests for `traverse'`'s lazyness.
This implementation keeps the list created in resampleList lazy. This is the case even when composing with another Cell. I suppose that is the case because the bang pattern used in cell1 >>> cell2 on a list will evaluate the list into either Nil or _ : _, which doesn't force the entire list, right ?

So an input of "ok" : error "oh noez" would not crash? I guess you're right, we don't seq the whole list structure, but only the first constructor when we do the bang pattern on the value passing through the cell.

Can you check whether the SpeedTest in essence-of-live-coding-yampa-speedtest still doesn't have a space leak? This used to be one of the reasons for the strictness change. (I tried to check with memory profiling, but I couldn't do it.)

miguel-negrao commented 2 years ago

* Renamed `Choice` to `CellChoice` so it doesn't clash with profunctors' `Choice`.
I'd prefer it if Choice from profunctor were aliased. Does that work as well?

Sure, no problem, will do that :). If the user needs to use profunctor's Choice then they must import qualified by themselves, and profunctor's Choice probably is not going to be used.

This implementation keeps the list created in resampleList lazy. This is the case even when composing with another Cell. I suppose that is the case because the bang pattern used in cell1 >>> cell2 on a list will evaluate the list into either Nil or _ : _, which doesn't force the entire list, right ?

So an input of "ok" : error "oh noez" would not crash? I guess you're right, we don't seq the whole list structure, but only the first constructor when we do the bang pattern on the value passing through the cell.

Yes, it doesn't crash. The tests that I added are testing this.

  , testCase "traverse' by itself does not force the entire list (Cell)" $ 
    testError "Traverse' is too strict" id $ 
    head $ head $ fst $ runIdentity $ steps (traverse' cellId :: Cell Identity [Int] [Int]) [[1, error "Bang !"]]

Can you check whether the SpeedTest in essence-of-live-coding-yampa-speedtest still doesn't have a space leak? This used to be one of the reasons for the strictness change. (I tried to check with memory profiling, but I couldn't do it.)

I profiled SpeedTest and I didn't see any space leak. The output has a NaN, is that normal ? Looking at the code of SpeedTest, I think that it doesn't use any resample function, so there's no reason why it should change behavior right ?

essence-of-live-coding$ stack exec --profile -- SpeedTest +RTS -p -hc
(NaN,NaN,-5.56184561945215e306)
essence-of-live-coding$ stack exec -- hp2ps -e8in -c SpeedTest.hp
essence-of-live-coding$ xdg-open SpeedTest.ps

Captura de ecrã de 2022-03-05 18-22-57

turion commented 2 years ago

This implementation keeps the list created in resampleList lazy. This is the case even when composing with another Cell. I suppose that is the case because the bang pattern used in cell1 >>> cell2 on a list will evaluate the list into either Nil or _ : _, which doesn't force the entire list, right ?

So an input of "ok" : error "oh noez" would not crash? I guess you're right, we don't seq the whole list structure, but only the first constructor when we do the bang pattern on the value passing through the cell.

Yes, it doesn't crash. The tests that I added are testing this.
  , testCase "traverse' by itself does not force the entire list (Cell)" $ 
    testError "Traverse' is too strict" id $ 
    head $ head $ fst $ runIdentity $ steps (traverse' cellId :: Cell Identity [Int] [Int]) [[1, error "Bang !"]]

I think that's not the same thing what I meant. The test I meant would have as input not [[1, error "Bang !"]], but [1 : error "Bang !"]. This is subtly different.

Can you check whether the SpeedTest in essence-of-live-coding-yampa-speedtest still doesn't have a space leak? This used to be one of the reasons for the strictness change. (I tried to check with memory profiling, but I couldn't do it.)

I profiled SpeedTest and I didn't see any space leak.

Awesome, thanks :)

The output has a NaN, is that normal ?

Yes, I think at some point it exceeds the limits of a Double.

Looking at the code of SpeedTest, I think that it doesn't use any resample function, so there's no reason why it should change behavior right ?

True, I'm puzzled a bit. I thought it used it somewhere, but maybe that was an older temporary version of the test.

Or maybe I simply changed the definition of >>>, and thought that I'd have to change the definition of resampleList analogously for consistency.

Anyways. Your choice of lazyness seems to be a good one then :) let's keep it like it is now.

miguel-negrao commented 2 years ago

I think that's not the same thing what I meant. The test I meant would have as input not [[1, error "Bang !"]], but [1 : error "Bang !"]. This is subtly different.

Ah, that is quite subtly different indeed. So one is e 1 : _|_ : Nil and the other 1 : _|_ , right ?

In any case, it also doesn't cause an error. I will change the tests to use 1 : _|_ instead of 1 : _|_ : Nil.

miguel-negrao commented 2 years ago

Ok, I squashed the PR as it was too complicated solving conflicts with master otherwise.

Added CellIdentitySimulation to test whether one cell behaves the same as another cell. Added tests for law traverse' (arr f) = arr (f <$>) Added tests for traverse' with arbitrary Cells created with random state functions (s -> a -> (b,s)) which don't operate on the monad. It's pretty awesome that quickcheck can create random functions !

Ok, I think this is ready for review again.

miguel-negrao commented 2 years ago

I've added a mechanism to succinctly generate tests for all Traversable instances of interest. It uses induction via type-classes to go through a type-level list and generate a test for each type. If there are any other Traversables that we are interested in, I can add them to the list. Also, I rebased on top of master.

issues:

I'm getting a warning from haskell language server which I think is bogus:
```
Pattern match(es) are non-exhaustive
In an equation for ‘traverseCellTest’:
Patterns not matched:
    Proxy _ (Fun _ _) []
    Proxy _ (Fun _ _) (_:_)compile(-Wincomplete-patterns)
```
related to traverseCellTest _ s (Fn2 f) input = CellSimulation ... where I'm using the view pattern Fn2 (same for Fn).
I tried using less Proxies and more type application but I don't know how to tell GHC that a type a inside has free type variables, i.e., a instead of Proxy t -> a. Anyway, it works fine with Proxys.

miguel-negrao commented 2 years ago

Since more PRs area appearing, it would be nice to be able to finish this one :-). Our last discussion was about the tests.

turion commented 2 years ago

Since more PRs area appearing, it would be nice to be able to finish this one :-). Our last discussion was about the tests.

Yes :) I'm waiting for your changes.

miguel-negrao commented 2 years ago

Ah, sorry, I think I missed something :-). which changes are you referring to ?

turion commented 2 years ago

If you scroll up a bit, you should see a lot of pending discussions :wink: Mostly of stylistic nature.

miguel-negrao commented 2 years ago

Hi @turion :-). I went through all the discussions above again, there were one or two which were still marked unresolved, and I changed them now to resolved, but in any case I believe my last push from March had hopefully already addressed all the issues (at least that was my intention :-) ). In any case, I might have missed something, off course.

miguel-negrao commented 1 year ago

Sorry for leaving this PR abandoned for so long :-). I've tried to address all the points of the previous review. There are still 2 unresolved conversations above, but I've pushed my current version of the code.

miguel-negrao commented 1 year ago

Ok, I think I have addressed all the issues of the last review.

miguel-negrao commented 1 year ago

Thanks for the review ! Will do the changes and report back.

miguel-negrao commented 1 year ago

Hopefully I have addressed the suggestions in the last review. As always, note that when you think this is ready to merge I can first squash que commits, but for now I'm leaving the individual commits to make it easier to see my latest changes.

miguel-negrao commented 1 year ago

Hi, perhaps this is ready to merge ?

miguel-negrao commented 1 year ago

Great ! It took a while, but is was worth discussing things and polishing the code ! Thanks for your reviews ! :-)

turion / essence-of-live-coding

Add Traversing instance for Cell #95