Define a convention on TCP/UDP ports used by development stacks

benoit74 commented 10 months ago

Background

In many of our projects, we deploy local development servers (e.g., an API and a Database) on our development machines for testing purposes. These servers expose a TCP (occasionally UDP) port on our local machine. Currently, there is no standardized convention for the usage of these TCP/UDP ports across projects. For instance, some projects use port 8000 for web APIs, while others use 8080.

Note: This intentionally simplifies the distinction between TCP and UDP ports and assumes we don't want two distinct services, one on TCP and one on UDP, running on the same port number. Although technically possible, it's deemed cumbersome for our purposes.

Problem Statement

The absence of a convention on TCP/UDP port assignments for local development services leads to two issues:

- After starting a local development stack, it's unclear where the services are listening, causing delays when switching between projects.
  - This becomes more pronounced with the shift to docker-compose-based local dev stacks, initiated with a simple `docker compose up -d`.
- Running two local development stacks simultaneously is usually impossible due to port conflicts.
  - This often occurs when transitioning from developing project A to reviewing project B.

Proposition

We can address the problem by establishing a convention for TCP/UDP port assignments.

The proposed convention is to use port XXXY for every server in our systems, where:

- Y is a number indicating the type of service:
  - UI is always on Y=0
  - Backend server (+/- API) is on Y=1
  - Database is on Y=2
  - Y=3 to 5 are reserved for potential generic usage
  - Y=6 to 9 are available for non-generic services (e.g., a second backend server)
- XXX is a number reserved per project (GitHub repository):
  - Each GitHub repository will reserve a number in a centralized reference.
  - Repositories may reserve multiple numbers if needed, and these numbers are contiguous. If the need wasn't anticipated, the project is moved to other contiguous numbers.
  - To determine where XXX starts, we need a broad port range to accommodate all our projects. Since we don't have many other services running on our development machines and the TCP/UDP port ranges are cluttered with various services, we can use any meaningful port range for these assignments, reserving some numbers for external services if conflicts arise.
  - XXX will hence start at 800, with the 800 and 808 ranges already reserved due to known conflicts with many of our (not yet migrated) projects and other web servers.
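
For illustration, the XXXY rule can be sketched in a few lines of Python. This is only a sketch under assumptions: the function name `dev_port`, the service keys, and the upper bound of 999 (taken from "XXX" being three digits) are not part of the proposal, and 801 is a hypothetical project number.

```python
# Service-type digit (Y) for the three generic services named in the proposal.
SERVICE_DIGITS = {"ui": 0, "backend": 1, "database": 2}

# Project numbers (XXX) set aside because of known conflicts (8000-8009, 8080-8089).
RESERVED_PROJECT_NUMBERS = {800, 808}


def dev_port(project_number: int, service: str) -> int:
    """Return the XXXY port for a project number (XXX) and a service type (Y)."""
    # Assumption: XXX is exactly three digits and starts at 800.
    if not 800 <= project_number <= 999:
        raise ValueError("project number must be between 800 and 999")
    if project_number in RESERVED_PROJECT_NUMBERS:
        raise ValueError(f"project number {project_number} is reserved")
    return project_number * 10 + SERVICE_DIGITS[service]


if __name__ == "__main__":
    # 801 is a hypothetical project number, used only for illustration.
    for name in SERVICE_DIGITS:
        print(name, dev_port(801, name))
```

With this rule, a project holding number 801 would expose its UI on 8010, its backend on 8011 and its database on 8012.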

Feedback and implementation

All feedback is welcome; after that I will move this to a Wiki entry.

rgaudin commented 10 months ago

LGTM; I suggest we blindly assign an XXX to every repo in our 3 orgs in chronological order and put that in the Wiki. Some repos could be assigned 2 or 3 slots right away (offspot/container-images for instance).

mgautierfr commented 10 months ago

I would go even further and assign an XX for every repo. It will reserve 100 slots (XX00 to XX99) for every repo, which should be more than enough. Even if we use the units digit for the type of service, we still have 10 slots for that specific service of that specific project.

rgaudin commented 10 months ago

> I would go even further and assign an XX for every repo.

I support this

benoit74 commented 10 months ago

> I would go even further and assign an XX for every repo.

I like the idea for its "simplicity" (no need to reassign ranges should more ports be needed, easier to remember two digits per repo than three), but I'm a bit worried that this will consume a lot of space and increase the chance of collision with another service.

Or should we jump directly to 3XXYY where there is way less risk of collision according to https://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers and the "fixed" 3 is easy to remember?
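
As a rough sketch of what 3XXYY would give, the Python below computes such a port. It assumes YY is split as suggested above (units digit for the service type, tens digit for the instance of that service); the function name and parameter names are made up for illustration and nothing here has been agreed on.

```python
def dev_port_3xxyy(repo_number: int, service_digit: int, instance: int = 0) -> int:
    """Return a 3XXYY port: fixed leading 3, XX per repo, YY per service.

    Assumption: YY = instance * 10 + service_digit, i.e. the units digit
    carries the service type and the tens digit the instance number.
    """
    if not 0 <= repo_number <= 99:
        raise ValueError("repo number (XX) must be between 0 and 99")
    if not 0 <= service_digit <= 9:
        raise ValueError("service digit must be between 0 and 9")
    if not 0 <= instance <= 9:
        raise ValueError("instance must be between 0 and 9")
    return 30000 + repo_number * 100 + instance * 10 + service_digit


if __name__ == "__main__":
    # Repo number 12 is hypothetical; 1 is the backend digit from the proposal.
    print(dev_port_3xxyy(12, 1))
    print(dev_port_3xxyy(12, 1, instance=1))
```

Every value stays between 30000 and 39999, well below the 65535 port limit.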

rgaudin commented 10 months ago

I don't mind. For instance my router (and @kelson42's but he doesn't know yet) makes regular requests on WLAN on port 8080. This creates annoying requests in dev projects. We can go above 30K AFAIC.

benoit74 commented 10 months ago

> We can go above 30K AFAIC.

I think so as well. They are used as ephemeral ports, but I don't think we should mind: the system won't reallocate a port used by a service as an ephemeral one, and the chance that a port we need is in use as an ephemeral port at the precise moment we need it is quite low (and should a conflict occur, chances are high we can just retry and it will work because the TCP connection has been dropped). At least we can try, and should it cause issues, we can easily change to another strategy.
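
Should such a conflict need to be diagnosed, a quick check is to try binding the port. The helper below is only a sketch: `is_port_free` is a hypothetical name, it only tests TCP on the loopback address, and 31201 is an arbitrary 3XXYY-style port. For context, Linux typically hands out ephemeral ports from 32768 to 60999 by default, so only part of the 30000-39999 range overlaps.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds the port.
            return False
        return True


if __name__ == "__main__":
    # 31201 is a hypothetical port, used only for illustration.
    print(is_port_free(31201))
```

The result only describes the moment of the call; a port reported free can still be taken before the dev stack starts.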

benoit74 commented 10 months ago

Could you please help me fill in the gaps in the table below? And check what I've already "decided".

I sorted the list alphabetically for convenience, and I propose we assign port ranges in alphabetical order for now. And then in chronological order for new projects. Except if it is easy for you to sort the list in chronological order, but it is not for me ^^

I have chosen to not assign a port to new Vue.JS-based scrapers like freecodecamp and kolibri (soon), because even if we sometimes start a yarn dev, it is the single process we start and not a whole stack, so I do not think we should mind about these.

Organization	Repository	Port range needed
kiwix	.github	?
kiwix	apple	N
kiwix	borg-backup	N
kiwix	container-images	?
kiwix	ipfs-portal	?
kiwix	java-libkiwix	N
kiwix	k8s	N
kiwix	kiwix-android	N
kiwix	kiwix-android-custom	N
kiwix	kiwix-android-nightly	?
kiwix	kiwix-build	N
kiwix	kiwix-desktop	N
kiwix	kiwix-js	N
kiwix	kiwix-js-pwa	N
kiwix	kiwix-tools	?
kiwix	libkiwix	?
kiwix	metrics	?
kiwix	mirrorbrain	Y
kiwix	overview	N
kiwix	web	?
offspot	base-image	?
offspot	captive-portal	?
offspot	cardshop	Y
offspot	container-images	Y
offspot	content-filter	?
offspot	dashboard	?
offspot	docker-export	N
offspot	edupi	Y
offspot	image-creator	?
offspot	imager-desktop	?
offspot	kiwix-hotspot	?
offspot	kiwix-plug	?
offspot	mediawiki-docker	?
offspot	metrics	Y
offspot	offspot-config	N
offspot	operations	?
offspot	package-requests	N
offspot	testbench	?
offspot	wikifundi	?
openzim	_python-bootstrap	Y
openzim	cms	Y
openzim	docker-publish-action	N
openzim	dwds	N
openzim	freecodecamp	N
openzim	gutenberg	N
openzim	ifixit	N
openzim	javascript-libzim	?
openzim	kolibri	N
openzim	librechef	N
openzim	libzim	?
openzim	lilote	N
openzim	mwoffliner	N
openzim	nautilus	N
openzim	nautilus-webui	Y
openzim	node-libzim	N
openzim	openedx	N
openzim	overview	N
openzim	phet	N
openzim	python-libzim	?
openzim	python-scraperlib	N
openzim	python-storagelib	N
openzim	sotoki	N
openzim	ted	N
openzim	warc2zim	N
openzim	wikihow	N
openzim	wombat	N
openzim	wp1	Y
openzim	wp1_selection_tools	N
openzim	youtube	N
openzim	zim-requests	N
openzim	zim-testing-suite	N
openzim	zim-tools	?
openzim	zimfarm	Y
openzim	zimit	N
openzim	zimit-frontend	Y

rgaudin commented 10 months ago

Unless there is a reason not to, I reiterate my advice to blindly assign a range to all repos. Many of them won't need one, but nobody wants to go through that whole list and wonder whether each repo needs a range now or may need one in the future.

As for sorting, the creation date (as well as a sequential ID) is available via the API:

❯ curl -s https://api.github.com/repos/kiwix/apple | jq '.created_at, .id'
"2015-08-12T19:05:29Z"
40619002
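
Building on those two fields, a chronological assignment could be sketched as below. This is only an illustration: `assign_numbers` is a hypothetical helper, it expects dictionaries shaped like the API's repository objects (`full_name`, `created_at`, `id`), and apart from kiwix/apple the sample entries are made up.

```python
from datetime import datetime


def _created(repo: dict) -> datetime:
    # The API returns timestamps such as "2015-08-12T19:05:29Z".
    return datetime.fromisoformat(repo["created_at"].replace("Z", "+00:00"))


def assign_numbers(repos: list[dict], first_number: int = 0) -> dict[str, int]:
    """Map each repo's full name to a number, oldest repo first.

    The sequential ID breaks ties between identical creation dates.
    """
    ordered = sorted(repos, key=lambda repo: (_created(repo), repo["id"]))
    return {
        repo["full_name"]: first_number + index
        for index, repo in enumerate(ordered)
    }


if __name__ == "__main__":
    sample = [
        # Values from the curl call above.
        {"full_name": "kiwix/apple", "created_at": "2015-08-12T19:05:29Z", "id": 40619002},
        # Hypothetical repositories, for illustration only.
        {"full_name": "example/newer", "created_at": "2020-01-01T00:00:00Z", "id": 2},
        {"full_name": "example/older", "created_at": "2010-01-01T00:00:00Z", "id": 1},
    ]
    for name, number in assign_numbers(sample).items():
        print(number, name)
```

Repositories created later simply take the next free number, so existing assignments never move.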

benoit74 commented 10 months ago

Sorry, I overlooked this suggestion. Makes sense to me.

benoit74 commented 10 months ago

Please review https://github.com/openzim/overview/wiki/TCP-UDP-ports-for-development

rgaudin commented 10 months ago

LGTM 👍

kelson42 commented 10 months ago

> In many of our projects, we deploy local development servers (e.g., an API and a Database) on our development machines for testing purposes. These servers expose a TCP (occasionally UDP) port on our local machine. Currently, there is no standardized convention for the usage of these TCP/UDP ports across projects. For instance, some projects use port 8000 for web APIs, while others use 8080.

The key point here is the principle of least astonishment, and then coherence.

I remark that the question of sockets is not treated here, although it should be preferred - in production - for all internal services IMO.

> Problem Statement
>
> The absence of a convention on TCP/UDP port assignments for local development services leads to two issues:
> * After starting a local development stack, it's unclear where the services are listening, causing delays when switching between projects.

This is a general problem if you work on many projects at the same time, this is far broader than a kiwix/openzim problem!

>   * This becomes more pronounced with the shift to docker-compose-based local dev stacks, initiated with a simple `docker compose up -d`.
>
> * Running two local development stacks simultaneously is usually impossible due to port conflicts.
>
>   * This often occurs when transitioning from developing project A to reviewing project B.

I'm not in favour of this kind of exotic approach where usual ports are not used, things should be simple and usual ports should be used. There must be another solution, like:

- Network isolation
- Virtualisation
- Ability to easily stop/start whole infra for a project

rgaudin commented 10 months ago

> I remark that the question of sockets is not treated here, although it should be preferred - in production - for all internal services IMO.

This ticket has nothing to do with production.
Sockets are orthogonal to k8s.

> This is a general problem if you work on many projects at the same time, this is far broader than a kiwix/openzim problem!

I know you know the difference between theory and practice already 😉 And you have your share of responsibility in this I believe.

> I'm not in favour of this kind of exotic approach where usual ports are not used, things should be simple and usual ports should be used

Why do you care exactly? It's just a convention that has no impact on software. On projects without a dev compose, it's a guideline; on those with one, it's the use of ports X, Y and Z instead of A, B and C 🤷‍♂️

benoit74 commented 10 months ago

I can only second what Renaud said.

I would just add that:

- network isolation is already in place, but for dev purposes, you have to expose ports on the local developer machine (at least it is the simplest / most common / straightforward way to access the stack to do tests)
- virtualization is usually (because not all projects have adopted it yet) already in place with the use of Docker even for development
- the ability to easily stop/start the whole infra for a project is usually (because not all projects have adopted it yet) already in place with Docker compose stacks for development

Or maybe I just don't get what you mean by these principles.

benoit74 commented 6 months ago

@kelson42 Any chance we can move this forward?

Again, this has only to do with development stacks, where:

- we lack a convention, have no coherence and lots of astonishment, so we MUST move this forward quickly
- the local stack needed by a developer frequently needs at least 3 ports (web UI, API and DB) and sometimes even more (e.g. zimit frontend also needs a zimfarm API, preferably its UI and its DB); we have a huge variety of DBs in place and each one has its own conventional port

AFAIK, 3 out of 4 core developers are still happy with, and asking for, the concrete proposal summarized a few comments above. Three months later, I still do not see a concrete alternative proposal, only vague concepts not rooted in developers' reality.

kelson42 commented 6 months ago

We had a discussion at that time with @rgaudin and the conclusion was that we should look first why container engine network capabilities don't allow easily to isolate systems so they can run in parallel without having all the services conflicting with each other. @rgaudin Do I remember properly?

rgaudin commented 6 months ago

> @rgaudin Do I remember properly?

Yes, and I reported in one of the weeklies that it wasn't conclusive. I didn't reply here because I didn't gather evidence and it was too much of an effort to do so: advanced network usage is complicated and poorly documented, but what's sure is that it wouldn't be effortless, so even if there's a possibility, it would defeat the purpose of simplicity/transparency/predictability expressed here.

benoit74 commented 6 months ago

> why container engine network capabilities don't allow easily to isolate systems so they can run in parallel without having all the services conflicting with each other

By default (and this is the case in our configurations), docker compose creates one network per compose stack, and they run nicely and in isolation without issue.

The problem is that we want to expose some IP/ports on these networks on the local developer machine. And here, on the local developer machine, we have conflicts of IP/port.

One solution would be to use DNS names, a reverse proxy and other machinery to expose one single "thing" and use the DNS name to route requests appropriately, but I'm not even sure it would work for all protocols, and it is even more complex and less predictable since each developer would probably have to tweak their DNS configuration for it to work.
