Open tlvu opened 2 years ago
@dbyrns @huard @fmigneault @matprov I updated the issue description to create one big component, so that we do not have to deal with component dependencies, as per our discussion.
@fmigneault currently Mongodb is used by Phoenix. Weaver also uses the same Mongodb instance, which would pull Phoenix and Mongodb into this big component for Weaver to depend on.
Would you be open to the Weaver component having its own Mongodb, so that the current Mongodb and Phoenix can be moved into their own standalone component?
No.
I would rather keep the same mongodb instance and have a note in the component README that says it is required when either Phoenix or Weaver are employed.
If we want to be more robust, we could have each component extend `EXTRA_CONF_DIRS` as needed with its own component dependencies, in case they were omitted.
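A minimal sketch of that idea, written in Python only for illustration. The `REQUIRES` mapping and the `expand_conf_dirs` helper are hypothetical and not part of birdhouse-deploy; the component paths are assumed names. It expands a user-provided list so that any omitted dependency is added once, ahead of the component that needs it:

```python
# Hypothetical dependency map (an assumption for illustration,
# not the actual birdhouse-deploy configuration).
REQUIRES = {
    "component/phoenix": ["component/mongodb"],
    "component/weaver": ["component/mongodb"],
}


def expand_conf_dirs(enabled, requires=REQUIRES):
    """Return the enabled components plus any omitted dependencies, without duplicates."""
    result = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        # Dependencies are listed before the component that requires them.
        for dep in requires.get(name, []):
            visit(dep)
        result.append(name)

    for name in enabled:
        visit(name)
    return result


# Value that a component could append to EXTRA_CONF_DIRS.
print(" ".join(expand_conf_dirs(["component/phoenix", "component/weaver"])))
```

With both Phoenix and Weaver enabled, the shared Mongodb component appears only once in the resulting list.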
Since data and tables are already being created in the same mongodb, they will not be easily migrated into distinct instances.
Same issue goes for PostgreSQL.
You have a point here. To avoid a complex migration procedure, I guess we are stuck with a fairly big "default" component.
But @tlvu, I thought the solution proposed by @fmigneault had been accepted: every component is isolated, but we propose a hand-crafted component list that we know works. It is 100% backward-compatible, yet still allows new stacks to emerge without Phoenix or without Weaver, as long as Mongodb is included whenever either of them is used.
@dbyrns Yes, so this "hand-crafted component list" will basically include everything that is currently deployed by default, to keep 100% backward compatibility.
Inside that "hand-crafted component list" there will be a subset that absolutely has to go together (all the birds + postgres + Magpie, because Weaver and Magpie currently hardcode the list of birds and because all the birds have their existing data in the same postgres, so breaking them out means a postgres data migration for each bird). I was trying to make this subset as small as possible. Code-wise it is doable, but the data migration is the obstacle, as @fmigneault pointed out. So this subset will have to stay pretty large.
In my opinion, birds can be separated (e.g. `component/hummingbird`, `component/flyingpigeon` and so on); it's just that all of those will require `component/postgres`.
Similarly, `component/phoenix` and `component/weaver` will require `component/mongodb`.
Weaver might need a small update to auto-populate the WPS birds list based on enabled components, but nothing more.
The "default setup" will include all currently active components.
Users can then decide to override this default setup to remove some components as desired, but it's up to them to make sure, for example, that `component/postgres` is still provided for birds that need it.
This should be fairly easy to debug, since docker-compose would complain of a missing `link` or `depends` service if a required component was omitted.
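For illustration, the same kind of check could be run before docker-compose is even invoked. This is only a sketch in Python: the `REQUIRES` mapping and `missing_dependencies` function are hypothetical, and the requirements listed are just the ones mentioned in this comment.

```python
# Hypothetical requirements, following the component names used in this thread.
REQUIRES = {
    "component/hummingbird": ["component/postgres"],
    "component/flyingpigeon": ["component/postgres"],
    "component/phoenix": ["component/mongodb"],
    "component/weaver": ["component/mongodb"],
}


def missing_dependencies(enabled, requires=REQUIRES):
    """Map each enabled component to the required components that are not enabled."""
    enabled_set = set(enabled)
    problems = {}
    for name in enabled:
        missing = [dep for dep in requires.get(name, []) if dep not in enabled_set]
        if missing:
            problems[name] = missing
    return problems


# A user override that dropped component/postgres but kept a bird that needs it.
override = ["component/hummingbird", "component/weaver", "component/mongodb"]
for component, missing in missing_dependencies(override).items():
    print(f"{component} requires {', '.join(missing)}")
```

Unlike automatically adding the dependency, this only reports the problem and leaves the decision to the user, which matches the "up to them" approach described above.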
@tlvu
I came across a use-case where I would like to have a more recent version of `mongodb` for Weaver (the 3.4 version available on the server is quite old; 5.0 is now available, and 3.x is not even supported anymore...).
So if you get around to this task, you can consider renaming `mongodb` to `phoenix-mongodb` (left as is), and I would add a more recent `weaver-mongodb`.
each service should have its own user to query the database with limited table/collections accessible.
(from https://github.com/bird-house/birdhouse-deploy/pull/296#issuecomment-1441085287)
~The database in this case refers to the db containing the magpie data. We could even use magpie/twitcher itself to enforce these policies and provide this data through magpie's API~
I don't think this is feasible. They are strongly expecting HTTP requests. Either way, I don't think we should mix the concepts of "platform users" and "service users". In https://github.com/bird-house/birdhouse-deploy/pull/296#issuecomment-1441085287, I referred to "users" in the sense of the postgresql/mongodb credentials to connect to the databases. What I would expect is that a query from e.g. finch doesn't allow reading magpie's database and vice-versa.
@fmigneault
I think I misunderstood your original comment. Multiple services should be able to access the shared database service that hosts their own data; it is not that multiple services should be able to access magpie's database. Is that correct?
In that case, I think that we should make a distinction between a database service and a database:
For example, a postgresql cluster is running as a service in a container but we can define multiple databases run by that service, each one accessible by a different postgres user. So magpie, finch, etc. can all use the same postgres service but then we have a database named "magpie" which is accessible by the user "magpie", the database "finch" which is accessible by the user "finch", etc.
Is that what you mean?
@mishaschwartz Yes, this is what I had in mind. Only one docker service, but each "bird" has its own database and user in it.
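As a rough sketch of what "one postgres service, one database and user per service" could look like, the hypothetical helper below (not part of birdhouse-deploy) generates the PostgreSQL statements for a list of service names. It assumes the names are simple, trusted identifiers and leaves password handling out; credentials would have to be set separately.

```python
def per_service_sql(services):
    """Build PostgreSQL statements giving each service its own role and database."""
    statements = []
    for name in services:
        # One login role per service (password intentionally not set here).
        statements.append(f'CREATE ROLE "{name}" LOGIN;')
        # One database per service, owned by the matching role.
        statements.append(f'CREATE DATABASE "{name}" OWNER "{name}";')
        # PUBLIC can connect to any database by default, so revoke that;
        # the owner (and superusers) can still connect.
        statements.append(f'REVOKE CONNECT ON DATABASE "{name}" FROM PUBLIC;')
    return statements


for statement in per_service_sql(["magpie", "finch"]):
    print(statement)
```

With the default `CONNECT` privilege revoked from `PUBLIC`, a connection made with the "finch" user should be refused on the "magpie" database and vice versa, which is the isolation discussed above.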
There are interdependencies between the birds and other components, e.g.:
So, in order not to have to solve these interdependencies right now, all the birds, magpie/twitcher and postgres will be in one big component for now.