hilbert / hilbert-heartbeat

Heartbeat protocol implementation for Hilbert
Apache License 2.0

Document the API in OpenAPI format #20

Open porst17 opened 6 years ago

porst17 commented 6 years ago

The format is human and machine readable. And https://editor.swagger.io can generate stub clients and servers in various programming languages and frameworks.
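For illustration, a minimal OpenAPI 3 document for a single heartbeat endpoint might look roughly like the sketch below; the path, parameter names, and response are assumptions made for this example rather than the actual hilbert-heartbeat specification. A file like this can be pasted into https://editor.swagger.io to render the docs and generate client/server stubs.

```yaml
# Minimal sketch only; paths, parameters and responses are illustrative,
# not taken from the actual hilbert-heartbeat spec.
openapi: 3.0.0
info:
  title: Hilbert Heartbeat API (sketch)
  version: 0.1.0
paths:
  /hb_ping:
    get:
      summary: Send a heartbeat ping
      parameters:
        - name: appid
          in: query
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Ping accepted
```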

malex984 commented 6 years ago

Nice. Shame that we did not use it initially for HB! :-( PS: Is it possible to have nameless parameters, like intervals, there?

porst17 commented 6 years ago

Yes, we should have done that, but I wasn't aware of it at the time. By properly defining the API, we would have recognized design flaws earlier. But it is what it is. BTW, @elondaits used it to document the presets API in hilbert-ui.

Nameless parameters are called path parameters in OpenAPI 3.
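As a hedged sketch of what that looks like, the interval could be declared as a path parameter roughly as follows; the /hb_ping/{interval} path is hypothetical and only mirrors the variant discussed in the next comments.

```yaml
# Hypothetical paths fragment (drop-in for the sketch above): the interval
# moves into the URL path instead of being a nameless query value.
paths:
  /hb_ping/{interval}:
    get:
      summary: Heartbeat ping with the interval as a path parameter
      parameters:
        - name: interval
          in: path          # part of the URL path, so it needs no name in the query string
          required: true
          schema:
            type: integer
        - name: appid
          in: query
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Ping accepted
```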

malex984 commented 6 years ago

A path parameter would look like /hb_ping/<INTERVAL>?appid=<APPID>, right? Is that the same as /hb_ping?<INTERVAL>&appid=<APPID>?

elondaits commented 6 years ago

I don't think it's the best idea to have nameless arguments in the query string...

I wouldn't put it in the path because it's clearly just an argument and not an intrinsic part of a resource.

BTW, I see that all the commands are sent via GET. Init and Done should be sent as POST, because GET should only be used for stateless / idempotent operations that could be cached, resent, etc. The ping should also be POST, I think, but it's the one you can do via GET without too many issues.

malex984 commented 6 years ago

@elondaits Could you please explain the difference between POST and GET in more detail or give some references about that?

elondaits commented 6 years ago

On a regular web page (which is what HTTP was invented for) GET is used for stateless retrieval of resources. There should be no side effects from a GET. That means it's safe for indexers (Google bot, etc.) to crawl via GET, browsers can pre-fetch via GET to accelerate browsing (some plugins do that), etc. You can also cache GET operations (locally or in a proxy cache between the client and server, or through a CDN) because "they should return the same every time" (the page might update, of course... but there should be no harm beyond seeing a slightly outdated version... for instance an online newspaper not showing a news item added a minute ago).

When you need to do an operation that modifies something on the server, you normally use POST. That's what's used in contact forms, user registration, login, etc. It can't be cached, replayed, etc. When you try to reload a page that was fetched via POST, you'll notice that the browser asks you if it's OK to "resubmit the form", because sending a POST again might have consequences. Same on payment forms when they say "don't click twice, don't reload", etc., to avoid charging you twice.

Some old, badly designed systems (even large open source projects) used GET for things like content deletion. So you had an administration page with a list of items and a "delete" link next to each that simply called a URL such as http://example.com/news/23/delete via GET ... and when something crawled that page, it accessed every delete link and deleted everything. I had to work on a system that was like that (a famous open source mailing list software), and once, because of a misconfiguration, the admin panel was left passwordless... it was a "secret address", but Google somehow found it and accidentally deleted all the content.

So basically: use GET for safe, side-effect-free reads that can be cached or repeated, and POST for operations that change state on the server.

Also, in a POST you can send content in the request body (e.g. JSON), while in GET you use the query string, which has a limited length and is less secure because it is visible and might get logged in various places.

In an HTTP API, the difference between GET and POST is at first sight not as critical because there are no links to accidentally crawl, etc. But browsers, HTTP libraries, and HTTP proxies might behave differently... so you should use them consistently with the standards, since you never control the whole stack and don't know what might happen. Horror stories always begin with something you hadn't thought of or imagined.

Bonus: HTTP includes other verbs that were not used much in traditional web pages but are used in HTTP APIs (especially REST APIs, which are a sort of standard way of designing HTTP APIs):

When using all the verbs, the standard convention is roughly: GET to retrieve a resource, POST to create one, PUT or PATCH to update one, and DELETE to remove one.

... the idea being that you can send GET http://example.com/videos/42 to read a video and DELETE http://example.com/videos/42 to delete it.
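For illustration, the read/delete pairing on a single resource can be expressed in OpenAPI like the fragment below; the /videos/{id} path just follows the example above and is not part of the heartbeat API.

```yaml
# Hypothetical fragment following the /videos/42 example; not part of the heartbeat protocol.
paths:
  /videos/{id}:
    parameters:             # declared once at path level, shared by GET and DELETE
      - name: id
        in: path
        required: true
        schema:
          type: integer
    get:
      summary: Read a video
      responses:
        '200':
          description: The requested video
    delete:
      summary: Delete a video
      responses:
        '204':
          description: Video deleted
```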

elondaits commented 6 years ago

... So for the heartbeat I'd use POST for init and stop of the HB session, since they're clearly stateful operations. The ping is more reasonable to do via GET if it makes things simpler for some reason (I don't know if there are performance differences... I'd guess not)... To avoid caches, you should add a dummy query string argument with a timestamp that changes on every call, or some other form of incrementing ID. Replaying a single heartbeat (via GET) has no terrible consequences, so it's OK.
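In OpenAPI terms, the split suggested here might look roughly like the fragment below. The /hb_init and /hb_done paths, the JSON request bodies, and the ts cache-buster parameter are assumptions made for illustration, not the actual protocol.

```yaml
# Paths fragment only; /hb_init, /hb_done, the JSON bodies and the ts
# parameter are illustrative assumptions, not the current protocol.
paths:
  /hb_init:
    post:                   # stateful: starts a heartbeat session
      summary: Start a heartbeat session
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [appid]
              properties:
                appid:
                  type: string
      responses:
        '200':
          description: Session started
  /hb_done:
    post:                   # stateful: ends the session
      summary: Stop a heartbeat session
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [appid]
              properties:
                appid:
                  type: string
      responses:
        '200':
          description: Session stopped
  /hb_ping:
    get:                    # read-like, kept as GET with a cache-busting parameter
      summary: Send a heartbeat ping
      parameters:
        - name: appid
          in: query
          required: true
          schema:
            type: string
        - name: ts
          in: query
          description: Timestamp or counter that changes on every call, to defeat caches
          required: true
          schema:
            type: integer
      responses:
        '200':
          description: Ping accepted
```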

porst17 commented 6 years ago

@malex984 You are right regarding the nameless query parameters. I don't think they are supported in OpenAPI (for good reason).

@elondaits I am aware of the things you pointed out. There is no good technical reason for the current protocol specification. The protocol evolved over time from just sending a ping that only had a timeout and no side effects on the server, into the API we currently have, with different commands and the application ID. It's a simple thing to change the protocol, but we would need to update all implementations as well, not only our own but also third-party code. Given the time left until release, I wouldn't spend time on this right now. The current implementations seem to work OK, so I'd rather stick with the current API for v1.0.0 and postpone modifications of the protocol to a v2.0.0 release.

I will migrate your comments into a new issue about redesigning the protocol.