vacationlabs / haskell-webapps

Proof-of-concept code for a typical webapp implemented in various Haskell libraries/frameworks
MIT License
134 stars 21 forks source link

Should tenant status be lifted to the type level? #7

Open saurabhnanda opened 8 years ago

saurabhnanda commented 8 years ago

Related thread on Reddit

Is there a distinct advantage of "lifting" the tenant status to the type-level? For example, it can allow us to write functions like:

data TenantInactiveReason = NewTenant | DisabledTenant | UnpaidTenant
data TenantStatus = ActivteTenant | InactiveTenant TenantInactiveReason
data Tenant = Tenant TenantStatus (TenantPoly a b c d e f)

createTenant :: (Tenant NewTenant) -> (Tenant (InactiveTenant NewTenant))
disableTenant :: (Tenant ActiveTenant) -> (Tenant (InactiveTenant DisabledTenant))
createInvoice :: (Tenant ActiveTenant) -> Invoice
remindOverduePayment :: (Tenant (InactiveTenant UnpaidTenant)) -> Email
activateTenant :: (Tenant (InactiveTenant x)) -> (Tenant ActiveTenant)

How would one implement something like this? How does one keep the tenantStatus field within the Tenant record, in sync with the "tag" at the type level?

BartAdv commented 8 years ago

What does (Tenant (InactiveTenant NewTenant)) mean? I can see how accepting various statuses as an input could be usable, but if you produce, then I guess it's gonna have some exact status?

saurabhnanda commented 8 years ago

What does (Tenant (InactiveTenant NewTenant)) mean?

A tenant, in the inactive state, because it's a new-tenant and is awaiting email verification. The third term is the reason why the tenant is in an inactive state.

BartAdv commented 8 years ago

Ah, thanks. I guess including the definitions would clarify things here.

saurabhnanda commented 8 years ago

Ah, thanks. I guess including the definitions would clarify things here.

Done :)

BartAdv commented 8 years ago

I'm curious myself how far one can go with it. The obvious advantage would be that you now can't write code that would call for example disableTenant on an inactive tenant, but I do wonder, when you load your Tenant from a database, how then you provide the specific type? After all, your loading function would have variable in place of TenantStatus?

loadTenant :: id -> Tenant .....
saurabhnanda commented 8 years ago

I'm curious myself how far one can go with it. The obvious advantage would be that you now can't write code that would call for example disableTenant on an inactive tenant, but I do wonder, when you load your Tenant from a database, how then you provide the specific type? After all, your loading function would have variable in place of TenantStatus?

Smart constructors? Load the DB row into an intermediate data-type, check the tenantStatus field and construct a Tenant TenantStatus value from it. Do you see any issues with that?

BartAdv commented 8 years ago

But you have one function to load those! What would be its signature?

Something like:

loadTenant :: forall st. st :: TenantStatus => id -> IO (Tenant st)

(pardon if it's not syntactically correct, never been doing anything like that in Haskell).

Then, possible implementation for disableTenant handler:

runDisableTenant :: id = do
  tenant <- getById id
  case (tenantStatus tenant) of
    ActiveTenant -> disableTenant tenant -- it should ok to call it, but how to convince type-checker?
   _ -> error
BartAdv commented 8 years ago

I've recalled I was implementing custom servant authentication, and it is kinda similar. I'm not implying this is the way it should be done (with typeclasses), but that was the answer I got when I asked around.

https://gist.github.com/BartAdv/42e62b4d29ffd153c99821fbc47623bd

The relevant bit is here: https://gist.github.com/BartAdv/42e62b4d29ffd153c99821fbc47623bd#file-servant-auth-L19

The cookie is decoded and we deserialize the AuthUser back, its role is unknown in type (r). Then, we perform a role check, case authUserRole user of - if user is an admin, then it gets authorized. If user is an Employee, then we need to compare if the requested role roleProxy is an Employee as well. For this we perform some unproxy magic, which is quite noisy:/

saurabhnanda commented 8 years ago

But you have one function to load those! What would be its signature?

Wow! Didn't realise that this would be so hard. How do library functions like parseJSON and readString do this?

BartAdv commented 8 years ago

They don't - they are reading some type for you, and if you employ some higher order stuff then you probably stick some foralls so that what you read doesn't have those guarantees. And it's natural, after all, you can't expect a status when you read something from db (well, unless you query by status, that is). It's up to you to consume the data, check that it fits the requirements (but the point is, the compiler will tell you if it doesn't!), and once everything is asserted you can call your domain functions.

What you're exploring here, dependent typing, is still rather young, so I wouldn't expect any "best practices"-level answers. People are still exploring this, and if you want you should probably just try on your own see if it can solve any problems for you. But since you seem to be exploring "just Haskell" itself, it might be too much to do everything at once.

saurabhnanda commented 8 years ago

if you employ some higher order stuff then you probably stick some foralls so that what you read doesn't have those guarantees

Don't think I understand this very well.

Btw, stuff like parseJSON and readString need some sort of type annotation or type-inference that constraints the return-type polymorphism, right? If we have a function like this:

findTenantsByName :: Text -> [Tenant status]

then, basically there is no easy way to constraint status in the return-type, because it could be anything!

BartAdv commented 8 years ago

Yes - and this is what I meant by forall, I think this effectively makes this signature looking like (if we had an explicit forall):

forall (status :: TenantStatus). findTenantsByName :: Text -> [Tenant status]

// EDIT

To be more specific, when you look at signature of parseJSON for example, you see:

parseJSON :: Value -> Parser a

this means the a is of kind *, so the Tenant wouldn't fit here, as it's of kind * -> *, that is, it accepts TenantStatus to produce a type (now when I look at the type declarations in the first post I think there are some mistakes there, data Tenant = Tenant TenantStatus ...would mean it's of kind*). So, in order to make it work you writeFromJSONinstance not forTenant, but forTenant st`, that is, you move that type argument in the front of, making it universally quantified.

sudhirvkumar commented 8 years ago
data TenantInactiveReason = NewTenant | DisabledTenant | UnpaidTenant
data TenantStatus = ActivteTenant | InactiveTenant TenantInactiveReason
data Tenant = Tenant TenantStatus (TenantPoly a b c d e f)

createTenant :: (Tenant NewTenant) -> (Tenant (InactiveTenant NewTenant))
disableTenant :: (Tenant ActiveTenant) -> (Tenant (InactiveTenant DisabledTenant))
createInvoice :: (Tenant ActiveTenant) -> Invoice
remindOverduePayment :: (Tenant (InactiveTenant UnpaidTenant)) -> Email
activateTenant :: (Tenant (InactiveTenant x)) -> (Tenant ActiveTenant)

I believe we can use a record to store detailed information about the tenant? rather than solving everything with ADT?

I do understand that you want to enforce the types to determine the correctness of the function. right?

saurabhnanda commented 8 years ago

I believe we can use a record to store detailed information about the tenant? rather than solving everything with ADT? I do understand that you want to enforce the types to determine the correctness of the function. right?

The straightforward and simpler way is to just run with a single Tenant record. But, I'm trying to extract more static guarantees from Haskell. However, the reason why it's turning out to be way tougher than I originally thought is because of https://github.com/vacationlabs/haskell-webapps/issues/7#issuecomment-250468315

sudhirvkumar commented 8 years ago

Basically you want to make sure that a function return a tenant with a specific status.

We can move that to tests right?

wz1000 commented 8 years ago

A simple way to do this for a fixed, rarely changing definition of TenantStatus is

data TenantStatus = Active | Inactive
data Tenant (s :: TenantStatus) = T {name :: Text, ....}
data SomeTenant = ActiveTenant (Tenant Active) | InactiveTenant (Tenant Inactive)
getActiveTenant :: SomeTenant -> Maybe (Tenant Active)
getInactiveTenant :: SomeTenant -> Maybe (Tenant Inactive)

then

loadTenant :: ID -> IO (SomeTenant)
findTenantsByName :: Text -> [SomeTenant]

to automatically do this for arbitary, changing definitions of TenantStatus, existential types are needed.

{-# LANGUAGE TypeApplications, ExistentialQuantification, ScopedTypeVariables, DataKinds, KindSignatures #-}
import Data.Typeable
import Unsafe.Coerce

data SomeTenant = forall (s :: TenantStatus). S TypeRep (Tenant s)

toSomeTenant :: forall (a :: TenantStatus). Typeable a => Tenant a -> SomeTenant
toSomeTenant t = S (typeRep $ Proxy @a) t

getTenant :: forall (a :: TenantStatus). Typeable a => SomeTenant -> Maybe (Tenant a)
getTenant (S t v) 
   | t == typeRep (Proxy @a) = Just (unsafeCoerce v)
   | otherwise = Nothing

then

tenant :: Tenant Active
tenant = ...
getTenant (toSomeTenant tenant) :: Maybe (Tenant Active) == Just <something>
getTenant (toSomeTenant tenant) :: Maybe (Tenant Inactive) == Nothing
saurabhnanda commented 8 years ago

@wz1000 thanks for taking a stab at this. Seems like something that could work. If I've understood this correctly, you're working with a more general 'SomeTenanttype and converting to anActiveTenantorInactiveTenant` where we need to be sure of the tenant's status. Is that correct?

Can this be simplified a bit to make it easier to use? Something on the lines of:

{#- LANGUAGE DataKinds -#}

data TenantStatus = TenantActive | TenantInactive
data Tenant (status :: TenantStatus) = Tenant{tenantStatus :: Text, ...} -- should 'status' be a phantom type?
loadTenant :: TenantID -> Tenant status
findTenantByName :: Text -> [Tenant status]
mkActiveTenant :: Tenant status -> Maybe (Tenant TenantActive)
activateTenant :: Tenant Inactive -> Tenant Active
saurabhnanda commented 8 years ago

@wz1000 And thinking about this pragmatically, isn't this simply pushing the "data validation or sanity checks" from activateTenant to mkActiveTenant? Is this really achieving anything in real terms? How bad would the following be:

activateTenant :: Tenant -> Either DomainError Tenant
activateTenant tenant@{tenantStatus=TenantActive, ...} = Left InvalidTenantStateError
wz1000 commented 8 years ago

@saurabhnanda

loadTenant :: TenantID -> Tenant status

cannot work because it desugars to

loadTenant :: forall status. TenantID -> Tenant status

so loadTenant <id> can be both an active and an inactive tenant, regardless of the actual state of the underlying tenant. This is because GHC doesn't have inline existentially qualified types yet, and all type variables are universally quantified.

And thinking about this pragmatically, isn't this simply pushing the "data validation or sanity checks" from activateTenant to mkActiveTenant? Is this really achieving anything in real terms?

That is exactly what it is doing. However, the distiction between code that requires a Tenant with a particular status, and code which doesn't need to know anything about the status will be statically enforced by the type system. With your version, if the sanity check is somehow left out, we would never get a warning. Also, every function that deals with Tenants would have to have some logic to handle the state of the tenant. Even if you, the programmer, are sure that you will only ever call the function with an Active Tenant, you will still have to handle the case where the Tenant is Inactive

By promoting the status to the type level, we would have to perform the sanity check once before using the Tenant, but then never have to bother about it in code again. Each function would simply describe, in its type, the kind of Tenant it expects.

Also, the existential quantification dissapears when the Tenant appears in a negative position. So if a function doesn't depend on the status of the Tenant, we can simply write

f :: forall status. Tenant status -> IO ()

instead of

f :: SomeTenant -> IO ()

making the code a bit less clumsy. SomeTenant would only be used when producing a Tenant of indeterminate status.

BartAdv commented 8 years ago

@wz1000 What does "inline existentially quantified" means?

wz1000 commented 8 years ago

@BartAdv I mean without an explicit data declaration.

BartAdv commented 8 years ago

@wz1000 and what about:

loadTenant :: TenantID -> IO (forall (status :: TenantStatus). Tenant status)

with

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE RankNTypes #-}
wz1000 commented 8 years ago

@BartAdv Try it. GHC would complain about not supporting impredicative types or something like that.

BartAdv commented 8 years ago

@wz1000 right, played a bit in REPL and now I know what's the problem. Also, did a bit of research how it could look in fully dependent language (Idris), and it seems I'd need dependent pairs for it:

data TenantStatus = Active | Inactive

data Tenant : TenantStatus -> Type where
  MkTenant : String -> (status : TenantStatus) -> Tenant status

tenants : List (Int, (status ** Tenant status))
tenants = [(1, (Active ** MkTenant "John" Active))
          ,(2, (_ ** MkTenant "Mike" Inactive))] -- can just use underscore to let compiler infer it

loadTenant : Int -> Maybe (status ** Tenant status)
loadTenant tid = lookup tid tenants