advancedtelematic / quickcheck-state-machine

Test monadic programs using state machine based models
Other
203 stars 25 forks source link

Use only Concrete abstraction (remove Symbolic abstraction) ? #246

Closed akegalj closed 5 years ago

akegalj commented 5 years ago

I guess this is chapter 2 of our discussion about mock function :)

tldr; What would we lose if framework wouldn't use mock, there was no Symbolic model and everything is default to Concrete:

transitions :: Model Concrete -> Action Concrete -> Response Concrete -> Model Concrete
...

where Action Concrete is actually action generated by our model (which should probably be called Action Symbolic - but in this case we could remove Concrete type and default to it)

?

Would we lose some properties? (maybe race condition tests depend on mock thing?) If we constrain framework this way, and that way losing potential experiments with mock function though.

This would enable to approach writing tests from a black bock perspective, without going too deep into details of how things are implemented.

Hm, maybe we could have both modes - and leave to the user to opt for mock (default) or without mock.

Long story

This is my rubber duck debugging of the problem that I run into. It can be useful to you to see my thought process. You will notice how I had wrong assumptions how framework is working (mostly confusion with Symbolic/Concrete/mock thing). On the second run I have left EDIT messages where I explain my faulty conclusions (I left all in purpose as I believe there might be more people running into same trap as I did)

My proposed "fix" is: a) having optional mock/Symbolic thing in the framework b) extending documentation to better capture idea behind mock/Symbolic/Concrete so that people avoid getting into same trap as me (I hope I am not the only one - as that would make me feel dumb)

I am volunteering for both, but would like to hear your thoughts before.

Bellow is my thought process

Model

This https://github.com/input-output-hk/cardano-sl/blob/akegalj/co-412/symbolic-model/wallet-new/test/state-machine/Wallet.hs iz a stripped version of my current wip in adding state machine tests for testing our wallet api.

This example contains only three actions:

Resetting the state is required to be a first action in every test item (so first action in generated list of actions). Its function is, as name suggests, to reset the state (as we are reusing the same state across the runs. Underneath it acts as resetting persisted db)

Creating a wallet is a basic function in our api. Its an IO operation that will do some crypto magic (to derive some ids) and it will persist this wallet to db.

Dummy action is any other action.

Action, response and model are defined as:


data Action (r :: * -> *)
    = ResetWalletA
    | CreateWalletA WL.CreateWallet
    | DummyAction

data Response (r :: * -> *)
    = ResetWalletR
    | CreateWalletR (Either WL.CreateWalletError V1.Wallet)
    | DummyResponse

data Model (r :: * -> *) = Model
    { mWallets     :: [(V1.Wallet, Maybe V1.SpendingPassword)]
    , mUnhappyPath :: Int
    , mReset       :: Bool
    }

Generator and transitions

Generator, and precondition functions are defined as:

generator :: Model Symbolic -> Gen (Action Symbolic)

preconditions :: Model Symbolic -> Action Symbolic -> Logic

My understanding is that Symbolic type variable defines our model: generated actions, expected response, and expected model. Concrete type variable defines real system that we are testing: generated action (this will be same as symbolic), system response (response from the system), system model.

EDIT:

After this whole post had been written I understand now that what I said above ^ is not really correct. Symbolic and Concrete are both our test models, where symbolic is strongly related to mock function. There are other other differences explained at the bottom of this post. Above I thought that Model Concrete is some kind of abstract way marking real model within IO from a real system that we are testing. I know now that was false (I will probably make PR soonish to extend documentation to explain these bits in more details - maybe with an example)

To me it seems preconditions function is a real subset of generator function and we should be able to move all the logic from preconditions to generator. I see preconditions function just as a convenience to separate the logic a bit (hope I am right). So I will leave out preconditions and focus only on generator in this analysis.

In generator, we would like to reset the wallet as a first action in list of actions. We accomplish this by using mReset flag in our model and defining generator:

generator (Model _ _ False) = pure ResetWalletA

and defining transition (simplified - I am going to ignore increaseUnhappyPath and shouldNotBeReachedError in this example):

    (ResetWalletA, ResetWalletR) -> Model mempty 0 True

Now we are sure ResetWalletA will be run as a first action in a list of actions.

Wallet should be generated before other actions (which are represented by DummyAction in this example) on the wallet. To do some meaningful actions on the wallet we should first create a wallet:

generator (Model [] _ True) = CreateWalletA . WL.CreateWallet <$> genNewWalletRq

transition is defined as (simplified):

    (CreateWalletA (WL.CreateWallet V1.NewWallet{..}), CreateWalletR (Right wallet)) ->
        model { mWallets = (wallet, newwalSpendingPassword) : mWallets }

Otherwise generator will generate dummy action:

generator _ = pure DummyAction

I was expecting to see list of actions generated [ResetWalletA, CreateWalletA, DummyAction, DummyAction, ..., DummyAction] but what I got was [ResetWalletA, CreateWalletA, CreateWalletA, ..., CreateWalletA].

This was 'aha' moment when I figured out that mock function is relevant and that my current understanding https://github.com/input-output-hk/cardano-sl/blob/akegalj/co-412/symbolic-model/wallet-new/test/state-machine/Wallet.hs#L203 :

mock is not used in a current quickcheck-state-machine-0.4.2

is wrong.

Mock

Mock is defined as:

mock :: Model Symbolic -> Action Symbolic -> GenSym (Response Symbolic)

and transition function is defined as:

transitions :: Model r -> Action r -> Response r -> Model r

where in practice r will be either Symbolic or Concrete. What this tells me is that framework will use:

mock :: Model Symbolic -> Action Symbolic -> GenSym (Response Symbolic)
transitions :: Model Symbolic -> Action Symbolic -> Response Symbolic -> Model Symbolic

to make a progress in our test model. As our generator depends on Model Symbolic this means that our generator depends on mock function. This is also not ideal as our mock function is a pure function and it is hard to really mock up expected response without reaching for IO. In our example, create wallet is using some crypto primitives to derive wallet ids (and other stuff). If we would like to make this pure, we would have to reimplement much of crypto primitives and make them pure (to take seed as a parameter). This seems impractical in real world scenario as we sometimes use third party libs that don't expose this functionality. I will show example later.

EDIT:

Later in text I explain that implementing mock function is not catastrophic as I thought above ^^. I still don't see benefit of it though.

Framework will use semantics for real model:

semantics :: Action Concrete -> IO (Response Concrete)
transitions :: Model Concrete -> Action Concrete -> Response Concrete -> Model Concrete

My understanding is that Action Concrete will in fact be action generated by our generator. We already said that this generator depends on mock function so semantics and transitions also depend on this generator. In fact, everything then depends on the mock function which wasn't what I was expecting (as said, its hard to purely mock more complex stuff). Also there is, non ideal, situation in real world software engineering team where person writing a test is not really the same person that implemented the feature - going really deep and low level like this would be really time consuming (without, at first sight, noticed benefits).

Looks like transitions :: Model Concrete -> Action Concrete -> Response Concrete -> Model Concrete will not in fact execute our defined function. semantics :: Action Concrete -> IO (Response Concrete) is the function that will modify Model Concrete which is in fact behind IO . We use transitions :: Model Concrete -> Action Concrete -> Response Concrete -> Model Concrete just to purely define what will happen in real world scenario (maybe its not working like this under the hood - but this is my understanding of it as a framework user).

EDIT:

later I figured this ^ was wrong. We do have both Model Symbolic and Model Concrete in our test model (and there is a third real state behind IO that runs the real thing. I confused this state behind IO with Model Concrete above)

Mock is defined as:

mock _ ResetWalletA      = pure ResetWalletR
mock _ (CreateWalletA _) = pure $ CreateWalletR (Left $ WL.CreateWalletError Kernel.CreateWalletDefaultAddressDerivationFailed)
mock _ DummyAction = pure DummyResponse

we were unable to more closely implement CreateWalletA part as it is very low level and requires deep domain knowledge. It also most likely depends on IO which we don't have here. I am not sure having IO available would be any benefit as I would probably "mock" it the same (or very similar) way as the real thing is implemented.

As we our Response Symbolic was always CreateWalletR (Left ...) we were in fact unable to add a wallet to our Model Symbolic. That explains why our generator never generates DummyAction. For this to work correctly we would have to reimplement low level bits of wallet creation which would be time consuming task (without noticeable benefits?).

Current framework has a safenet in terms of default test coverage that are run and which test are all actions run - and this mistake would be caught (in this example) by test coverage. If generator was defined as:

generator _ = frequency
    [ (1, pure ResetWalletA)
    , (1, CreateWalletA . WL.CreateWallet <$> genNewWalletRq)
    , (1, pure DummyAction)
    ]

we would be unable to notice the mistake and test coverage would run just fine, but we are not aware that wallet wasn't created even once in our symbolic model, mWallets will remain empty and our generator distribution might be effected by it (not in this example, as distribution is uniform here and doesn't depend on symbolic model).

Transitions and postconditions

When I have added bunch of traces in transitions and postconditions functions (I could have just read the source code as well though) I have concluded:

Seems like each state transition looks like:

  do
    run symbolic generator
    run symbolic precondition
    run symbolic mock
    run symbolic transitions
    run concrete semantics
    run concrete transitions
    run concrete postconditions

I wonder what is added benefit of having mock function and Symbolic abstraction? Why just not simplify it and do:

  do
    run concrete generator
    run concrete precondition
    run concrete mock
    run concrete transitions
    run concrete semantics
    run concrete transitions
    run concrete postconditions

where we would use Model Concrete everywhere. mock would be removed and generator wouldn't be effected by it? It practice it doesn't seem to be a problem if either: 1) mock is correctly implemented 2) we decide not to use model variable in our generator and preconditions for anything non trivial (ResetWalletA was simple to mock)

On another hand seems like Model Symbolic and Model Concrete don't effect each other so mocking operation should just in fact construct responses that look okish. We don't have to bother with using real crypto here and using IO (I guess now I understand what mocking really is about). Question still remains of do we have any benefits of using mock and Symbolic abstraction? (in current version). Implementation of mock adds extra work as one has to dive to implementation details. There might be smart constructors that wouldn't allow creating mocked response (or constructors might be hidden altogether).

Conclusion

So what I have learned is that mock has to be correctly implemented (where I previously thought we don't care about mock much). But I wonder does framework use Model Symbolic for something else then this? Maybe it compares Model Symbolic with Model Concrete? (I am not aware of this though, maybe it wouldn't make sense to compare them - would have to think about it)

Did you encounter any benefits of using both Symbolic and Concrete approach compared only to Concrete.

I will try to correctly implement mock for CreateWalletA and check is it really hard as it seems.

stevana commented 5 years ago

First of all, I'm glad you are challenging the use of mock and Symbolic/Concrete. They are for sure the least intuitive aspect of the library, and I'm by no means certain they are necessary in their current form.

You are right that in your example you are not really using Symbolic. And you are also right that there's something strange about mocking the response to CreateWalletA. It might be possible in your case, but imagine that you have an action whose response is an IORef like in the memory example -- that would be impossible to mock!

Let me show you an alternative way to do it and hopefully it will become more clear why Symbolic is needed.

Change the response to contain a reference to a wallet instead of a wallet:

CreateWalletR (Reference V1.Wallet r)

also change Model to use a reference to wallet and use CreateWalletR <$> genSym in mock to generate a symbolic reference. Use pre-conditions to constrain WL.CreateWallet in such way that CreateWalletA doesn't throw WL.CreateWalletError (by the way, pre-conditions are also used while shrinking, so they can't be moved into the generators).

What happens now is that we generate CreateWalletA, transition and mock put a symbolic reference to a wallet into the model, later actions that we generate might use the reference and their pre-conditions might depend on it. Once we execute CreateWalletA using the semantics, we get a Concrete wallet reference, we then substitute in the concrete wallet into all later actions which use the symbolic reference to the wallet when we execute those!

For example, in the memory reference example we have:

data Action r
  = Create
  | Read (Reference (IORef Int) r)
  | ...

data Response r
  = CreateR (Reference (IORef Int) r)
  | ReadR Int
  | ...

During generation, we get programs like:

[ Command Create [Var 0]
  -- ^ `Var 0` is created by `mock` and `getUsedVars`.
, Command (Read (Reference (Symbolic (Var 0)))) []
  -- ^ Empty list because `Read` returns no references.
, ...
]

Because of transition:

newtype Model r = Model [(Reference (IORef Int) r, Int)]

transition (Model model) Create (CreateR ref) = Model ((ref, 0) : model)

The symbolic reference ends up in the model during generation, and when generating Read we can check its pre-condition:

precondition (Model model) (Read ref) = ref `elem` map fst model

Note that this isn't the real IORef, it's just a symbolic reference.

Once we execute the actions using the semantics we carry around an environment with a mapping from symbolic to concrete references. For example after executing Create we have a map from Var 0 to an actual IORef Int. So that when we execute Read sref : Action Symbolic we first substitute the symbolic reference using the environment to get Read cref : Action Concrete and then use the semantics function to execute it.

Perhaps this would be a good point to stop. Does it make sense so far? Hopefully you see the point Symbolic/Concrete a bit better now?

You are right, that we can get rid of Symbolic if we don't use References, but them seem necessary in most examples I've came across so far. Including your example it seems, sorry I didn't notice that earlier! But yeah, perhaps it would make sense to have a simpler interface that doesn't use references at all...

Next I can explain how we can get rid of mock and what we lose.

akegalj commented 5 years ago

Thanks for detailed reply and suggestions! <3

I am going to go through it in details one more time and try what you have proposed by ~end of the day (CET).~ (I had to postpone this for mid of next week due too other priorities in my team. cca 21.11)

akegalj commented 5 years ago

I am closing this issue because I have been moved to different part of the project atm. Will reopen this issue when we will be improving tests after a while.