Open schatekar opened 9 years ago
Hi!
I would model something like
POST /process
{
"entity": {id},
"transition": ...
}
This would return a 202 Accepted with an ID that you could GET /process/{ID} and check it later, as the transition will be done asynchronously.
I don't think this is better than yours, it's just a different way to model this, and you could consider things like: Does the client really need to specify what process has to be executed? Or the client just has to trigger the transition?
I think using PUT entity/{id}/processN sounds a little like RPC.
What do you think?
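A minimal sketch of the flow suggested above, written as plain Python functions rather than a real web framework. The in-memory store, the `run_pending` worker and the response shapes are assumptions for illustration only; the thread specifies just the `POST /process` request, the 202 response with an ID, and the later `GET /process/{ID}`.

```python
import itertools

# In-memory stand-in for the server's process store (hypothetical).
_processes = {}
_ids = itertools.count(1)


def post_process(body):
    """Handle POST /process: record the request and answer 202 Accepted."""
    process_id = str(next(_ids))
    _processes[process_id] = {
        "entity": body["entity"],
        "transition": body["transition"],
        "status": "pending",
    }
    return 202, {"id": process_id, "location": f"/process/{process_id}"}


def get_process(process_id):
    """Handle GET /process/{ID}: report the recorded status of the transition."""
    process = _processes.get(process_id)
    if process is None:
        return 404, {"error": "unknown process"}
    return 200, process


def run_pending():
    """Stand-in for the asynchronous worker that performs the transitions."""
    for process in _processes.values():
        if process["status"] == "pending":
            process["status"] = "done"
```

The caller never blocks: it keeps the returned ID and polls until the status changes.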
On Dec 19, 2014 6:00 AM, "Suhas Chatekar" notifications@github.com wrote:
I am about to implement a business process which can be more or less modeled as a state machine. In abstract terms, here is what the state machine would look like
[Entity1] --process1--> [state1]--process2-->[state2]--process3-->[final state]
Every state transition is an atomic operation. The reason we want to model it as a state machine is that the complete operation can take a long time to run and we do not want the caller to be blocked. So the idea is that we would accept the request from the caller and return 202 Accepted if the request looks OK.
After that, a scheduled process would pick up the request from the database and trigger process1 by calling a REST endpoint. Same for process2 and process3.
There are two ways that this can be modeled. The first is to implement a single REST endpoint like the one below
PUT http://api.com/entity/{id}/process
Because we know the current state of the entity, we can determine which process to execute next. But I feel this API is not expressive and the following would look better
PUT http://api.com/entity/{id}/process1
PUT http://api.com/entity/{id}/process2
PUT http://api.com/entity/{id}/process3
In the above, there is a distinct endpoint for every process that can be triggered on the entity. If you trigger process1 on an entity which is in the final state then you would get back an error.
I am not sure which one is right or if both of these approaches are wrong. Does anyone have any experience of doing something like this in the past?
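To make the two options concrete, here is a rough sketch of both handlers over one transition table. The state name `new` (for the entity before process1 has run), the dictionary-based entity and the `InvalidTransition` exception are assumptions made for illustration; they do not come from the post.

```python
# Transition table for the abstract machine in the post:
# process name -> (state it requires, state it produces).
# "new" is an assumed name for the initial state.
TRANSITIONS = {
    "process1": ("new", "state1"),
    "process2": ("state1", "state2"),
    "process3": ("state2", "final"),
}


class InvalidTransition(Exception):
    """Raised when a process is triggered from the wrong state."""


def put_named_process(entity, process):
    """PUT /entity/{id}/processN: the caller names the process to run."""
    required, produced = TRANSITIONS[process]
    if entity["state"] != required:
        raise InvalidTransition(
            f"{process} is not allowed in state {entity['state']}"
        )
    entity["state"] = produced
    return entity


def put_generic_process(entity):
    """PUT /entity/{id}/process: the server picks the process from the state."""
    for process, (required, _) in TRANSITIONS.items():
        if entity["state"] == required:
            return put_named_process(entity, process)
    raise InvalidTransition(f"no process is available in state {entity['state']}")
```

Both variants share the same rules; the only difference is whether the caller or the server chooses the next process.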
— Reply to this email directly or view it on GitHub https://github.com/interagent/http-api-design/issues/58.
I will think this over, but a quick note on PUT vs. POST: I used PUT because I am modifying an existing entity and not creating a new one.
I think I might use an actions pattern here, which we have had some luck with for other actions (though not necessarily the state machine pattern per se).
POST http://api.com/entity/{id}/actions/process1
POST http://api.com/entity/{id}/actions/process2
POST http://api.com/entity/{id}/actions/process3
This has the benefit of maintaining the same cadence (alternating resource/identifier in the path). I would probably also use 202 as you suggest to indicate it is accepted, rather than completed. You'll likely also want a way to query the status of the process (this might be as simple as looking up the entity and checking a status field, but depends on specifics of the use case).
As for PUT vs POST, from RFC 2616 I grabbed "The POST method is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line". I think we can argue that the action/process is a subordinate to the entity. Whereas PUT reads as "The PUT method requests that the enclosed entity be stored under the supplied Request-URI.". I think POST then is a bit more accurate (and is more commonly used for actions in my experience).
Hope that helps, certainly happy to discuss further.
I was originally thinking of the actions pattern. While it is expressive and clearly conveys the intent, I saw two issues with it
I will look up RFC 2616. That is an interesting way of looking at POST and PUT. I have been using the old-school definition: POST is the equivalent of creating a resource and PUT is the equivalent of updating a resource.
@leandroferro POST /process looks too generic. I am not sure, but I feel the resource identifier should be part of the URL and not be embedded in the body. What do you think?
And I agree with your comment on RPC, but isn't that the nature of REST? I always find it difficult to implement sub-operations on resources because they end up looking like RPC. For instance, if you have a customer resource and you want to disable their account, I have been doing it like PUT /Customer/{id}/disable which does look a bit awkward and RPCish.
Yeah, POST/PUT can be tricky. I end up having to refer back to the docs often to remember the specifics.
All that said, I'm not sure how easy it is to discuss generically. It might be that the best approach would be something like:
POST /entity/{id}/actions/advance
You call it repeatedly and it keeps going through the steps (202 if it isn't in the final state yet, 200 if it is). Granted this loses a lot of the subtleties, but might be appropriate in some cases.
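A small sketch of the `advance` idea, assuming a linear list of states (the names, including `new`, are placeholders) and assuming the status code reflects the state after the step has been taken:

```python
# Assumed linear ordering of states; the last entry is the final state.
STATES = ["new", "state1", "state2", "final"]


def post_advance(entity):
    """POST /entity/{id}/actions/advance: take one step and report progress."""
    index = STATES.index(entity["state"])
    if index < len(STATES) - 1:
        entity["state"] = STATES[index + 1]
    # 200 once the final state is reached, 202 while steps remain.
    status = 200 if entity["state"] == STATES[-1] else 202
    return status, entity
```

Calling it on an entity that is already final leaves the entity unchanged and returns 200 again.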
Anything that deviates from resource/id ends up feeling weird/awkward. I think in that case I would also use actions to make it seem less dissimilar at least, something like: POST /Customer/{id}/actions/disable.
@geemus +1
I will probably go ahead with some of these ideas and see what results I get. As you said, there are always subtle differences in each case and it is difficult to generalize.
Dear @schatekar I really like your way of thinking! For what it's worth here are my few cents on this topic.
Modeling a REST API as a Finite State Machine (FSM) is indeed the way to go, especially when you want to create a decoupled, scalable and truly REST (read hyperlink-driven) API.
The reason we want to model it as a state machine is that complete operation can take long time to run and we do not want the caller to be blocked.
Whether this should (or should not) imply an FSM I am not sure. To me an FSM comes in handy when modeling any REST API, regardless of whether it is synchronous or not.
In order to send request on the correct URL, client needs to know the status of entity. So client has to make an extra call to know the status of the entity first.
In REST, which URL and HTTP request method (endpoint / action) can be used is driven by the server; the client should make no assumptions about what action is available. Instead, it should get the available actions from the server and decide which one to take.
I will try to demonstrate my thought process on the API FSM design using the concept of Resource Blueprint. Since I am not familiar with your API's domain I will use the concept of building a House.
If you are interested for more details you can check my slides on the Resource Blueprint
Let's describe the Building resource:
Attributes (data) of the resource, this may be really anything...
- walls (array[wall]) - array of walls as built
- roof (roof) - roof built on top of the walls
- paint (string) - name of the wall's color or empty string
A list of actions you can perform with the resource. What actions are available depends on the current state as is offered by the server. This list lists all actions the resource can possibly offer. Client MUST not remember what action is available in what state, as this leads to tight coupling with the server implementation.
- buildWalls - builds walls
- buildRoof - builds a roof on top of walls
- paintWalls - paints the walls
- scrapeBuilding - scrape everything
The top-level list are names of the states of this resource.
Each state lists actions available in the particular state. For example, you can perform "build walls" action in the "empty property" state but not "paint walls" action.
Every listed action shows to what state you get after exercising the action. So the syntax is:
- original state
- action → destination state
Now lets look at the states.
emptyProperty (resource entry point)
- self → emptyProperty (Takes you to this state)
- buildWalls → wallsStanding (Takes you to the state where walls are standing)
wallsStanding
- self → wallsStanding
- buildRoof → wallsAndRoof
- scrapeBuilding → emptyProperty
wallsAndRoof
- self → wallsAndRoof
- paintWalls → buildingFinished
- scrapeBuilding → emptyProperty
buildingFinished
- self → buildingFinished
- scrapeBuilding → emptyProperty
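The state listing above maps directly onto a transition table. The sketch below is one possible encoding in Python; the names `BLUEPRINT`, `available_actions` and `perform` are made up for illustration and are not part of the Resource Blueprint material.

```python
# state -> {action: destination state}, copied from the listing above.
BLUEPRINT = {
    "emptyProperty": {
        "self": "emptyProperty",
        "buildWalls": "wallsStanding",
    },
    "wallsStanding": {
        "self": "wallsStanding",
        "buildRoof": "wallsAndRoof",
        "scrapeBuilding": "emptyProperty",
    },
    "wallsAndRoof": {
        "self": "wallsAndRoof",
        "paintWalls": "buildingFinished",
        "scrapeBuilding": "emptyProperty",
    },
    "buildingFinished": {
        "self": "buildingFinished",
        "scrapeBuilding": "emptyProperty",
    },
}


def available_actions(state):
    """Actions the server would offer while the resource is in `state`."""
    return sorted(BLUEPRINT[state])


def perform(state, action):
    """Return the destination state, or fail if the action is not offered."""
    try:
        return BLUEPRINT[state][action]
    except KeyError:
        raise ValueError(f"{action!r} is not offered in state {state!r}") from None
```

Because the table is plain data, the same structure can drive both validation on the server and the list of actions advertised to the client.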
This is obviously the case for a synchronous process, when all the construction happens immediately. Now, when you want to go async you can, for example, introduce the following states:
wallsInProgress
- self → wallsInProgress
- cancel → emptyProperty
roofInProgress
- self → roofInProgress
- cancel → wallsStanding
paintInProgress
- self → paintInProgress
- cancel → wallsAndRoof
and modify the existing states' transitions like so:
emptyProperty (resource entry point)
- self → emptyProperty
- buildWalls → wallsInProgress
etc.
Now whether to model the construction processes as states of the one resource or to separate them under another resource, for example a "construction process" resource, I leave up to you. The point here is how to think about the FSM of this problem.
The beauty of this abstraction is that it leaves the questions of protocols / URLs / methods / JSON etc. for later, as they are mere technical details, and lets you focus on the way your API is designed.
Frankly, your API's client should only focus on understanding the resource attributes, actions and their parameters. The absolute values of URLs and methods are somewhat irrelevant (though I suggest paying attention to them as well).
Not sure if these musings about API design as an FSM help you with your API question, but I would be happy if that is the case!
@zdne Wow, this is very close to what I was thinking, though I could not put all the pieces together. I am glad you have a name for this. If I had to summarise "Resource Blueprint", what you are really saying is: do not model state transitions as API endpoints, only model states as API endpoints. Is that correct?
@schatekar I guess yes. Although I do not like to think about endpoints. If REST is "Representational state transfer" then it probably means that we are transferring a state representation when we are hitting an endpoint. Does it make sense?
In other words, what client is getting when accessing any endpoint is a representation of a resource in certain state. The representation of the state consists of data attributes representation and affordances representations (= available state transitions).
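As a rough illustration of "data attributes plus affordances", the sketch below builds such a representation for the house example. The `ACTIONS_BY_STATE` subset, the `/buildings/{id}/...` URL layout and the `method`/`href` keys are all assumptions chosen for the example, not something prescribed in the thread.

```python
# Hypothetical subset of the blueprint: state -> actions offered in it.
ACTIONS_BY_STATE = {
    "emptyProperty": ["buildWalls"],
    "wallsStanding": ["buildRoof", "scrapeBuilding"],
}


def represent(building):
    """Return data attributes plus the affordances valid in the current state."""
    base = f"/buildings/{building['id']}"  # assumed URL layout
    return {
        "attributes": {
            "walls": building["walls"],
            "roof": building["roof"],
            "paint": building["paint"],
        },
        "actions": {
            name: {"method": "POST", "href": f"{base}/{name}"}
            for name in ACTIONS_BY_STATE[building["state"]]
        },
    }
```

The client only ever sees the actions that apply right now, so it does not need its own copy of the state machine.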
I'm working through a similar requirement. I seem to be deeper into the HTTP RFC and REST concepts, but am interested in reactions (especially potential pitfalls) from folks who have already trod this path.
It's important that your calls are idempotent. If all of the steps use the same "process" or "advance" keyword, what happens if you have to retry a call (but the server receives both)? How do you prevent the machine from advancing twice? There are at least two solutions (I prefer the first).
PUT has specific semantics and targets a specific resource identified by a URI:
The PUT method requests that the state of the target resource be created or replaced with the state defined by the representation enclosed in the request message payload. A successful PUT of a given representation would suggest that a subsequent GET on that same target resource will result in an equivalent representation being sent in a 200 (OK) response.
This is clearly not what you'd expect if you could actually GET from an action endpoint so it doesn't have the right semantics. POST does.
@schatekar expresses the concern:
In order to send request on the correct URL, client needs to know the status of entity. So client has to make an extra call to know the status of the entity first.
A true REST interface respects HATEOAS and really should work this way:
"process2": "/entity/{id}/action/process2"
The client needs to know what action to use, but it's best if the client treats the URI as a black box. If you changed your server-side schema, you should be able to update the URI for process2 (e.g. to /fsm/{id}/action/process2) and your existing client would continue to function.
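A client that treats the URI as a black box might look roughly like this. The `links` key in the representation and the injected `post` callable are assumptions for the sake of a self-contained example; only the action name `process2` and the two URI shapes come from the comment.

```python
def find_action_uri(representation, action):
    """Return the URI the server advertises for an action, or None."""
    return representation.get("links", {}).get(action)


def trigger(representation, action, post):
    """Invoke an action by name, using whatever URI the server supplied.

    `post` is any callable that sends a POST to a URI (injected so the
    sketch stays independent of a particular HTTP library).
    """
    uri = find_action_uri(representation, action)
    if uri is None:
        raise LookupError(f"server does not currently offer {action!r}")
    return post(uri)
```

The client hard-codes only the action name, so a server-side move from `/entity/...` to `/fsm/...` needs no client change.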
If transitions can occur in response to data (rather than user action), exposing the FSM may be unnecessary. Consider a business rule like "the object is approved after 3 users approve it". Instead of creating an "approve" action, create an approval resource and an observer (hopefully generic like Django's signals).
approved
There's still an FSM under the covers, but you've refactored the interaction so:
Here's the kicker.... the pattern is more "obvious" if three approvals are needed, but it works just as well if only one approval is needed.
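A data-triggered transition of this kind can be sketched with a hand-rolled observer. This is plain Python rather than Django's signals, and the `Document` class, the state names and the `REQUIRED_APPROVALS` constant are illustrative assumptions.

```python
REQUIRED_APPROVALS = 3  # works unchanged if this is set to 1


class Document:
    """Data model that knows nothing about the approval rule."""

    def __init__(self):
        self.state = "needs_approval"
        self.approvals = set()
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def add_approval(self, user):
        """Create an approval resource; a repeated approval is a no-op."""
        self.approvals.add(user)
        for callback in self._observers:
            callback(self)


def approval_rule(document):
    """Observer holding the business rule: enough approvals flips the state."""
    if len(document.approvals) >= REQUIRED_APPROVALS:
        document.state = "approved"
```

The client only ever creates approvals; the state change is a consequence of the data, not of a dedicated "approve" endpoint.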
In a HATEOAS world, the API may not even expose the raw state. Instead, a resource in the "NeedsApproval" state includes an "approve" affordance. Once the resource is in the "Approved" state, the affordance goes away.
POTENTIAL PITFALLS:
@zdne mentions separating entity and state in passing and this deserves highlighting. A house is in a lot of states simultaneously. For example, a builder might be both building and selling a house at the same time. If we're building reusable components, we don't want to couple the House data model to the Construction FSM or the Sales FSM. Separating out the logic should provide smaller components that are easier to read and test.
This approach is even more attractive if you can combine it with Data-Triggered Transitions. Now your data models are completely decoupled (but provide generic observers). Your FSMs subscribe to the relevant events (i.e. observers) and maintain their own state based on changes in the target data.
Your presentation layer still needs to be state-aware to display the right affordances. Ideally it's assembled from a lot of standard parts. For example, your final renderer inherits from a generic "house" renderer and extends the model by adding the affordances relevant to the app.
@claytondaley thanks for the detailed writeup.
I agree that more explicit transitions provide better idempotence (and similarly that they should error if you then end up calling one at the wrong time for the FSM). Ditto that POST makes good sense for this use case.
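The idempotence point can be shown with a short comparison. The state names, the `REQUIRED` table and the use of 409 for a replayed request are assumptions for this sketch; the thread only says an explicit transition should error when called at the wrong time.

```python
ORDER = ["new", "state1", "state2", "final"]  # assumed state names
REQUIRED = {"process1": "new", "process2": "state1", "process3": "state2"}


def post_advance(entity):
    """Generic step: every delivered request moves the machine once."""
    index = ORDER.index(entity["state"])
    entity["state"] = ORDER[min(index + 1, len(ORDER) - 1)]
    return 202


def post_named(entity, process):
    """Explicit step: only valid from the one state it belongs to."""
    if entity["state"] != REQUIRED[process]:
        return 409  # duplicate or out-of-order request, nothing changes
    entity["state"] = ORDER[ORDER.index(entity["state"]) + 1]
    return 202
```

A retried `advance` that the server receives twice skips a state, whereas a retried `process1` is rejected and leaves the entity where it was.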
Data triggered transitions and separating resources also make good sense in many cases, but I've also certainly seen some cases where a separate resource seems like it would complicate/confuse things. For instance, with the status of a server, it seems clearer to do something directly to the server for say a reboot, than to create a reboot object which then causes a reboot. Not that a distinct object is impossible or intractable, but I think in some simple cases it can seem confusing. It seems like a judgement call situation, though I tend to prefer to avoid those (as judgement is not equally distributed among individuals). Good food for thought throughout, certainly.
@geemus
I'd like to propose (for consideration/debate) that the confusion of separate resources arises mostly from an attempt to map OOP to other domains. Representing object.action() as object/action is intuitive, but imagine the parallel case in DB CRUD. I could probably create a stored procedure called "activate" and write a SQL query that uses it to change an object's state, but I think most would agree that it's an anti-pattern. I'm not sure what's different about REST except that most of us have a weaker understanding of how it "should" work.
===
I definitely agree with your general intuition, but I'd like to find a different/better example than power management to really test the logic of the "separate resource" argument. I think power management is an especially weak case because the actual state of the machine is 100% data driven. It's a physical property that can only be determined from data flowing from the machine.
This creates odd cases like:
- reboot and the request times out. Is there any way to know if it succeeded? Say you poll the system. Every response is state=on. Did the system reboot (and you missed the state=off)? Or did your command fail?
- shutdown and poll the system. Every response is state=on. Did the shutdown fail? Or did someone else power it back on between your requests?
The best way to model power management is a sequence of requests. You care whether the request was fulfilled. The only way to know this (with certainty) is to expose it as a resource.
Do you disagree? If not, can you think of another strong case where resources feel especially wrong?
requested state, you can still cancel. Once it's in process, you can no longer cancel. Then you can just PUT the new state (and it will 409/400 if the request is invalid).
I'd probably go with:
POST /machine/{id}/power-task
{
"state": "off"
}
201 CREATED /machine/{id}/power-task/{id}
I went with task over request for brevity and clarity, but also considered cmd and call.
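A sketch of the power-task resource, using an in-memory dictionary in place of a database. The accepted values `on`/`off`, the status names `requested`, `in process` and `cancelled`, and the `cancel_power_task` helper are assumptions layered on the request/response pair shown above.

```python
import itertools

# In-memory task store keyed by resource location (hypothetical).
TASKS = {}
_task_ids = itertools.count(1)


def post_power_task(machine_id, body):
    """POST /machine/{id}/power-task: record the request as its own resource."""
    if body.get("state") not in ("on", "off"):
        return 400, None
    task_id = next(_task_ids)
    location = f"/machine/{machine_id}/power-task/{task_id}"
    TASKS[location] = {"state": body["state"], "status": "requested"}
    return 201, location


def cancel_power_task(location):
    """Cancel is only possible while the task is still in the requested state."""
    task = TASKS[location]
    if task["status"] != "requested":
        return 409
    task["status"] = "cancelled"
    return 200
```

Because each request has its own URI, a caller whose POST timed out can look the task up later instead of guessing from the machine's current power state.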
Assume you:
How do you path the resource? If you think of it as a one-to-one relationship, there's some discussion (e.g. here). A one-to-one resource has a lot in common with a property and I've occasionally seen properties exposed as sub-resources (similar to the 3rd item here). I keep coming back to something like:
house/{id}/construction-state
There's no terminal {id} (the only thing that gives me pause) because it's a 1-to-1 relationship. You change the state by PUTing the new state. Because the state change is not data-driven, it should be instantaneous (if valid) so you can immediately report a success or an error (409 or 400).
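One way the PUT on `house/{id}/construction-state` could behave is sketched below. The state names are borrowed from the earlier house example, and mapping an unknown state to 400 and a known but unreachable state to 409 is an assumption; the comment only says "409 or 400".

```python
# Assumed transition rules: current state -> states reachable by a PUT.
ALLOWED = {
    "emptyProperty": {"wallsStanding"},
    "wallsStanding": {"wallsAndRoof", "emptyProperty"},
    "wallsAndRoof": {"buildingFinished", "emptyProperty"},
    "buildingFinished": {"emptyProperty"},
}


def put_construction_state(house, body):
    """PUT house/{id}/construction-state: replace the state if the FSM allows it."""
    new_state = body.get("state")
    if new_state not in ALLOWED:
        return 400  # not a state this resource can ever be in
    current = house["construction_state"]
    if new_state == current:
        return 200  # replaying the same PUT is harmless
    if new_state not in ALLOWED[current]:
        return 409  # valid state, but not reachable from the current one
    house["construction_state"] = new_state
    return 200
```

A subsequent GET returns exactly what was PUT, which keeps the PUT semantics intact.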
I think some of my struggle with more data-driven things is that it can make side effects less apparent. For instance, in the example you gave of approvers, how am I to know that the 3rd one will create a side effect of changing the status of a different object? I suspect at least for some that might be surprising, where it might be less surprising in the action case (where it is more apparent that the thing you are doing relates). Some of that may just be a matter of particular pathings/etc rather than a broader issue.
I agree that the action style doesn't deal with asynchrony, but I would argue that extends to all of REST. Having callbacks/webhooks or doing things based on events often seems smoother in these cases (though it may not be particularly tenable in all cases).
As for your example about power management, it definitely is true that you don't get great feedback on status, can't cancel, don't know what others are doing, etc. Depending on the use case, though, it may not matter. And if it doesn't matter, I suspect that something like actions is a simpler and easier to understand user experience. Though it is less semantically correct, it also limits the number of different objects the user might need to understand in order to be effective (which has its own value, which can be very subjective).
Which is all to say, I agree that modeling everything is probably more correct, but it may be at the cost of the user experience. By selectively choosing which things warrant the extra gravitas and complexity of full modeling, I suspect we can better balance the demands of correctness and experience (but it is very subjective and case-by-case). Does that make sense?
I would frame your argument as the pragmatic case against a (strict) RESTful interface... rather than the case that actions should be prominent (or even present) in a RESTful interface. Based on a quick search for REST and Actions, I think most would agree with this classification (for example, here).
If the OP (or folks in similar situations) need to manage a state machine over an HTTP API (the repo's general purpose), your arguments should probably carry considerable weight.
If the OP (or similar) actually wants to implement a state machine over a RESTful interface (the title of this issue), I don't think actions are an option.
I'm trying to explore this second path to discover worst case scenarios. Besides usability concerns, we haven't identified any major issues that are unique to REST:
object/{id}/fsm, supporting PUT or PATCH of the state value) if the state changes are synchronous or guaranteed
If anyone else has run into specific cases where state transition cannot be data-driven and are especially difficult to model as a resource, I'd be interested in hearing about them.
That is not an unreasonable framing. I certainly don't think actions need to be prominent, though there are some cases where something like that (or full representation) does seem necessary.
I'm afraid I don't know what OP means in this context? Could you fill me in?
Your distinction between managing and implementing is subtle, but I think important and likely correct. Thanks for talking through this and working toward that clarity.
I think a distinct endpoint helps in making things explicit (and gives a target for checking status or cancelling), which changing in place doesn't really provide. So this seems like a good approach.
I suspect the difficulties around cancel have more to do with the specific use case than anything else. ie cancel might be a valid transition from any given state (though this seems dangerous). If you had a history of transitions, maybe something like undo would be easier to think about or model? Though I suppose that presumes the transition has finished, rather than still being in progress. The in progress part is tricky, as cancel has a race condition against completion (potentially) and also it may be more complicated to undo partial transitions. Do you have a particular use case that needs cancels that you could explain in more detail? I think that might help me to discuss it in more detail. Thanks!
OP is a Stack Exchange (e.g. Stack Overflow) acronym for "Original Poster".
I suspect the difficulties around cancel have more to do with the specific use case
Great point. This case is hard because the state is not determined by the API (or its database) but by a 3rd party who's sending updates to the API. That's what introduces race conditions. Thus it may be no less complicated over a non-REST API.
Ah ha, thanks for that clarification.
The 3rd party nature does sound like it makes this a lot harder (regardless of the interface you provide). A too-many-cooks-in-the-kitchen kind of problem.
OK. Here's a real case where a clean, RESTful implementation is causing me headaches:
We have tasks. Tasks are self-contained resources -- each contains a JSON "questions" field and a JSON "answers" field. Each entry in the questions field (i.e. list) may be optional or required. The task is completed when the answers field includes answers for all of the required questions (but you can easily imagine a more complicated version with multi-answer validation rules).
The state transition rules are easy so the FSM itself can be data-driven. But what is the most RESTful way for a client to determine if their answers fulfill the completion criteria (and/or why not)? Here's what I've considered so far:
/tasks/{id}) an envelope that identifies the request as a "validate" request. This is RESTful as POST allows very generic usage, but looks awkward and non-obvious to me.
complete = True (or state = complete) in a PUT/PATCH request. This flag would not actually change the state in the database, but would guide validation.
answers is updated, the observer runs (ideally in a transaction), and state is (indirectly) updated.
From a RESTful perspective, I'm most comfortable with (4). From the caller's perspective, it follows the PUT semantics because we would expect a GET (following a success) to return a task with the same completion flag/state. One time (edit: see next comment) this might not happen is if the observer is async. In this case, the behavior is more like "eventual consistency" with PUT semantics.
The most concise way to ask "why isn't this task complete" would be:
PATCH /tasks/{id}
{
"completed": true
}
The more explicit would be a full PUT so the client knows exactly what is being validated.
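A sketch of how the server might answer that PATCH. The shape of the `questions` entries (`id`, `required`), the `missing` field in the response and the choice of 409 for an unfulfilled completion request are all assumptions; the thread does not settle on a response format.

```python
def missing_required(task):
    """IDs of required questions that have no answer yet."""
    return [
        question["id"]
        for question in task["questions"]
        if question.get("required") and question["id"] not in task["answers"]
    ]


def patch_task(task, patch):
    """PATCH /tasks/{id}: asking for completed=true doubles as validation."""
    missing = missing_required(task)
    if patch.get("completed") is True and missing:
        # The flag is not stored; it only tells the server what to check.
        return 409, {"completed": False, "missing": missing}
    return 200, {"completed": not missing}
```

Completion itself stays data-driven: the reported flag is always derived from the answers, never taken from the request.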
How does (4) sit with you? Can you think of a more intuitive way that is RESTful?
... another case where the PUT is not perfect (besides async) is when you put completed=false with a complete answer because the observer will change it. However, this is allowed under the RFC:
there is no guarantee that such a state change will be observable, since the target resource might be acted upon by other user agents in parallel, or might be subject to dynamic processing by the origin server, before any subsequent GET is received.
Sorry for my response delay (travel last week and still playing catchup).
The PUT could perhaps use either 204 to indicate that it is not completed or 200 to indicate that it did in fact validate. I would hesitate to have the output body vary between these two, but perhaps the return would always show something about what is valid or not (and the 200 case would just show them all valid). Alternatively, the PUT could simply return this value and a separate GET could be used to ask for why (though the extra hop is undesirable, it may be semantically clearer). What do you think?