edgeryders / edgeryders-now-api

ISC License

Migrate this software into the Discourse API #5

Open tanius opened 5 years ago

tanius commented 5 years ago

It seems that this repo provides a caching middleware that collects content from Discourse and provides it in the form expected by a JavaScript single-page web application running in the browser. For a first version that's alright, and it obviously does the job :slightly_smiling_face:

However, each additional layer of API and caching is another layer of complexity where things can break and that makes things more difficult to manage. For example, in an ideal world the Edgeryders IT core team (i.e. me) should be able to repair everything. Right now, this software seems to be hosted as a free app on Heroku, and I don't have any access credentials for that.

So for the long term, I would rather propose the following architecture for the functions this software currently provides:

  1. Find a structure / mode for writing and managing the relevant content inside Discourse so that the number of HTTP requests needed to gather it is much lower. For example, when different pieces of content such as team member bios are different posts in the same topic, then one single request to https://edgeryders.eu/t/<topic_id>.json will yield all of that content.

    In addition, for some special cases where the app needs structured data (JSON) that should be editable inside Discourse, you might use a JSON code section inside a Discourse topic. I used that technique in this Discourse topic, resulting in this team list. It's an experiment so far, so let's see how well that technique works in practice …

  2. For some use cases, you can also access Discourse posts in raw form: https://edgeryders.eu/raw/<topic_id>/<post_number>. These requests are lightweight and do not put any significant load on Discourse; as far as Discourse is concerned, caching them is not needed. However, they do not allow fetching multiple posts of a topic at once, in contrast to the https://edgeryders.eu/t/<topic_id>.json style requests mentioned above. They will also not be much faster than the JSON-style requests, so it's probably not worth dealing with them.

  3. If it is still necessary to further reduce the number of HTTP requests, integrate a custom API endpoint right into the Discourse API that provides more of the required content in a single request. We have built such Discourse API extensions before. That API endpoint should be generic, for example "provide the post content of all posts tagged …".
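To illustrate points 1 and the JSON code section idea, here is a minimal sketch. The topic id and the team-list field names are hypothetical; the only assumptions taken from the thread are the https://edgeryders.eu/t/<topic_id>.json endpoint and structured data kept in a fenced JSON section inside a post.

```javascript
// Build the topic JSON URL: one request returns every post in the topic.
function topicJsonUrl(topicId) {
  return `https://edgeryders.eu/t/${topicId}.json`;
}

// The fence marker is built dynamically so this example does not itself
// contain a literal triple-backtick sequence.
const FENCE = '`'.repeat(3);

// Extract and parse the first fenced JSON section from a post's raw
// markdown, as used for structured data that stays editable in Discourse.
function extractJsonSection(raw) {
  const re = new RegExp(FENCE + 'json\\s*\\n([\\s\\S]*?)\\n' + FENCE);
  const match = raw.match(re);
  return match ? JSON.parse(match[1]) : null;
}

// Example: a (hypothetical) team list kept as a JSON section in a post.
const examplePost = [
  'Some intro text before the data.',
  FENCE + 'json',
  '{ "team": [ { "name": "Alice", "role": "Developer" } ] }',
  FENCE,
].join('\n');

console.log(topicJsonUrl(1234));                         // → https://edgeryders.eu/t/1234.json
console.log(extractJsonSection(examplePost).team[0].name); // → Alice
```

With this, the SPA needs exactly one request per topic, and any structured data inside that topic comes along for free in the same response.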

owengot commented 5 years ago

@tanius Thanks, good points and will follow up on 1 & 2.

The main idea here is to prepare the API response to work in a way that is intended for the sites, without executing this code on the front end every time the page is visited. If this can be done via a Discourse extension, I think it would solve the main issue. The API response on its own contains a lot of extraneous data, so a side benefit of this script is that it parses and organises the data in a more efficient manner.
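As a sketch of that reshaping step: a pure function that strips a raw Discourse post object down to the fields a site actually renders could run equally well in middleware or in the browser. The field names below (id, username, cooked, created_at) follow the public Discourse topic JSON; which ones to keep is an illustrative choice.

```javascript
// Reduce a raw Discourse post object to just what a site renders,
// dropping the extraneous data mentioned above.
function slimPost(post) {
  return {
    id: post.id,
    author: post.username,
    html: post.cooked,          // post body, already rendered to HTML
    createdAt: post.created_at,
  };
}

// Example input shaped like one entry of post_stream.posts:
const rawPost = {
  id: 7,
  username: 'owengot',
  cooked: '<p>Hello</p>',
  created_at: '2019-05-01T12:00:00Z',
  score: 12.4, // extraneous for rendering
  reads: 90,   // extraneous for rendering
};

console.log(slimPost(rawPost)); // keeps only the four rendered fields
```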

There are some longer-term use cases I have in mind for this, which include a call for real-time updates from the platform (without x number of visitors hitting the ER server at the same time) and a simple JSON store for content not hosted on the platform, such as Academy courses, form responses, newsletters and anything else that may come up.

tanius commented 5 years ago

The main idea here is to prepare the API response to work in a way that is intended for the sites, without executing this code on the front end every time the page is visited.

Indeed we could do that with a Discourse API extension. I just don't see a need for it (yet).

When techniques to limit the number of requests to Discourse are used, there will be no noticeable speed difference if you do this inside the client-side software: we're not dealing with more text content than can be displayed on one page. (For ways to reduce the number of requests to Discourse, see the first post; I just expanded it there.)

There are some longer term use cases […] [including] a call for real time updates from the platform (without x number of visitors hitting the ER server at the same time)

The Discourse API itself should be well able to handle the expected traffic, especially when the requests are made to the public Discourse API (i.e. not using an API key), because then all the responses are already fully cached inside Discourse in a Redis database. No need to cache them a second time, I think :-)

As you can see, I really don't like middleware :D esp. when the use case is possibly premature optimization. Let's optimize for speed and high request numbers when we need that, for sure. Doing so before then results (in my experience) in software that is just more complex than it has to be.

a simple JSON store for content not hosted on the platform such as Academy courses, form responses, newsletters and anything else that may come up

I think the same argument applies as for the Discourse content: you could access the APIs of these other applications directly from the SPA running in the browser. Of course you can write yourself a library that abstracts these different interfaces and provides the data in a nicer format to the core code of the application – basically the interface provided by edgeryders-now-api, but implemented as a library inside the browser.
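A minimal sketch of that "library inside the browser" idea, hiding the different backends behind one interface. The class and method names are hypothetical, https://example.org/courses.json is a placeholder for a second backend, and a fetch-like function is injected so the wrapper stays testable without a network:

```javascript
// Browser-side wrapper over several content backends; no middleware needed.
class ContentClient {
  constructor(fetchJson) {
    // e.g. in a real browser: url => fetch(url).then(r => r.json())
    this.fetchJson = fetchJson;
  }

  // All posts of one Discourse topic, in a single request.
  async topicPosts(topicId) {
    const data = await this.fetchJson(`https://edgeryders.eu/t/${topicId}.json`);
    return data.post_stream.posts;
  }

  // Hypothetical second backend, e.g. a store for Academy courses.
  async courses() {
    return this.fetchJson('https://example.org/courses.json'); // placeholder URL
  }
}

// Usage with a stubbed fetch, standing in for the real network:
const stub = async url =>
  url.includes('/t/')
    ? { post_stream: { posts: [{ id: 1, cooked: '<p>Hi</p>' }] } }
    : [{ title: 'Course A' }];

new ContentClient(stub).topicPosts(42).then(posts => console.log(posts.length)); // → 1
```

Injecting the fetch function keeps the wrapper a plain library: the core application code talks to one uniform interface, while the backends stay swappable.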