Closed yaronkoren closed 8 months ago
This is already done in Taqasta. Can we keep the two implementations close to each other?
We definitely should. Can you create a PR?
I looked through the Taqasta implementation of this. The main script there is still called run-apache.sh, but a lot of its functionality was moved to a separate script, run-maintenance-scripts.sh:
https://github.com/WikiTeq/Taqasta/blob/master/_sources/scripts/run-maintenance-scripts.sh
Looking through the code, it looks like the Taqasta equivalent includes at least the following additional functionality:
At least some of this seems useful for Canasta - but integrating this into Canasta will require some discussion; it can't just be done with one big PR. And I still think run-apache.sh should be renamed to something more general. So for now, I think the PR I created is the better course of action. Though it's good to know about all the Taqasta stuff.
Looks cool!
Support for SQLite
Out of scope for Canasta. We shouldn't be supporting SQLite - we have no need for something with less support by the WMF than MySQL.
Calling of the Monit script, to (I think) support a Slack client
This is probably WikiTeq-specific.
Conversely, the symlinking of extensions and skins is not there?
Yeah, this is a Canasta-specific approach that is necessary for maximum flexibility. I don't think WikiTeq developers shared the same enthusiasm for this idea, which is understandable and not a problem. There's more than one way to skin a cat.
And I still think run-apache.sh should be renamed to something more general.
Completely agreed.
Here are some ideas for how to simplify run-apache.sh. I submitted an issue to USAF Iron Bank on the hardened Canasta issues I was having and they fixed them last week. They were able to move things out of run-apache.sh and into the Iron Bank Dockerfile where they could be run as root.
I'll add some details and links below.
Link to the issue I wrote for them: https://repo1.dso.mil/dsop/opensource/canastawiki/canasta/-/issues/88#note_1145602
run-apache.sh mods here and here
php maintenance/update.php --quick
Is there a downside to this? Let me know if you see any issues with these mods. At the very least we can use them for reference or, if we like them, pull them in the way they did.
@olsonjaredm - thank you, this is very helpful.
The "Map host for VisualEditor" code was never in Canasta, I don't think - it looks like custom code that you (collectively) added to run-apache.sh, then decided was better in the Dockerfile. If possible, could you create a patch for it?
I don't know why all of the MediaWiki-related scripts are called via runuser, so that they can be called as "$WWW_USER". What, indeed, is the reason for this - does anyone know?
Finally, the rsync command is indeed part of run-apache.sh - it's step 2 in my initial listing. That seems like a reasonable move also - Jared, if you don't mind, could you also create a patch for that? (Assuming you know how to make a multi-file patch.)
By the way, one thought I had for how to simplify this whole thing, and make it more customizable, is to create a new directory that would hold a set of shell scripts to be called after update.php is run, i.e. after we know the database has been fully set up. The directory might be called custom-scripts/ or final-setup/ or something else. It could include mwjobrunner.sh, mwtranscoder.sh and mwsitemapgen.sh (original steps 8-10), plus the SMW and CirrusSearch setup scripts from Taqasta (assuming those make sense to add in) - and then spinoff images could easily add their own additional scripts to this directory - like Taqasta's CLDR and Monit calls.
100% agreed. Perhaps we should do that first before we incorporate the Iron Bank changes into vanilla Canasta?
I'm glad there's agreement on this "automated script directory" idea! (By the way, maybe it should be called maintenance-scripts/ - to borrow the wording from Taqasta.) But does the timing of the changes matter? I ask only because making at least those two Iron Bank changes discussed before - for "map host" and rsync - seems quite easy, whereas adding this feature would be at least moderately difficult.
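To make the "automated script directory" idea concrete, here is a minimal sketch of the loop that run-apache.sh (or whatever it ends up being called) could run once update.php has finished. The function name run_scripts_in_dir, the executable-.sh convention, and the lexical ordering are all assumptions for illustration; none of this is existing Canasta or Taqasta code.

```shell
#!/bin/bash
# Hypothetical helper (not part of Canasta or Taqasta): run every executable
# *.sh file found in a directory, in lexical order, so that spinoff images can
# add their own steps just by dropping a file into that directory.

run_scripts_in_dir() {
    local dir="$1"
    local script

    if [ ! -d "$dir" ]; then
        echo "No script directory at $dir, skipping"
        return 0
    fi

    for script in "$dir"/*.sh; do
        # An unmatched glob stays literal, and non-executable files are
        # treated as disabled, so skip anything that is not an executable file.
        [ -f "$script" ] && [ -x "$script" ] || continue

        echo "Running $script"
        # A failing script is reported but does not stop the remaining ones.
        # Assumption: the real script might instead wrap this call in
        # runuser, the way the MediaWiki-related scripts are called today.
        "$script" || echo "Warning: $script exited with status $?" >&2
    done
}
```

It would be called with the directory path as its only argument, for example /maintenance-scripts if that name is chosen. The scripts run one after another here, so a long-running script such as a job runner would have to put itself in the background, or the loop would need to start it that way. Numeric filename prefixes (10-, 20-, and so on) would be one simple way to control the order.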
Right now the run-apache.sh script, contrary to its rather specific name, does a bunch of things:
At the very least, I think this script should be renamed to something indicating that it's a sort of master do-everything kind of script: setup-all.sh, setup.sh, something like that. It also could be good if any of steps 1, 2 and 4 could be turned into their own scripts, to try to break up the code more. Step 1 takes about 40 lines, step 2 takes about 10 lines, and step 4 takes about 20 lines; so step 1 would be the most obvious candidate for splitting off, into a script called create-symlinks.sh or something.