jquery / infrastructure-puppet

Puppet configuration for jQuery Infrastructure servers.
MIT License

Make Trac read-only, possibly replace by a static dump #10

Closed mgol closed 11 months ago

mgol commented 3 years ago

A while ago, we made the jQuery Core Trac instance (https://bugs.jquery.com/) read-only, which leaves the jQuery UI one (https://bugs.jqueryui.com/) as the only fully functional Trac instance. With jQuery UI being maintained in a very limited way nowadays, it doesn't make sense to maintain a separate bug tracker just for this project; other projects are using GitHub issues.

We want to enable GitHub issues for jQuery UI and make the UI Trac read-only. In the future, we can consider replacing Trac with a static dump of all its pages if that makes maintenance easier.

I'm not 100% sure what we did for Core, but I think we mostly blocked account registration and removed all existing accounts; with no accounts, the site is essentially read-only. @rjollos can you help with doing the same for UI? I'm not sure if it was you or someone else who was involved with the changes for the Core Trac.

brianwarner commented 3 years ago

I pulled a static dump of the UI trac a few times, and it's many gigs of data... I think probably because it's grabbing multiple copies of pages to accommodate sorting. Anybody have any suggestions for gracefully dealing with this?

rjollos commented 3 years ago

If we just want to make it read-only, it's straightforward to set up the permissions like we've done with the jQuery site. I don't know how the static dump works, but I'm guessing you might also be getting every version of the wiki page history.

brianwarner commented 3 years ago

That's very possible, I was just crawling it with wget. I think the ideal state would be something that's completely static so we could just upload it and not worry about maintenance, but in the meantime setting the permissions seems like the right step.

mgol commented 3 years ago

@rjollos Let's make it read-only for now and we can think about possible next steps later. BTW, I couldn't find any changes in infra source that would be responsible for making Core Trac read-only, how is it done?

rjollos commented 3 years ago

I couldn't find any changes in infra source that would be responsible for making Core Trac read-only, how is it done

Permissions are stored in the database and can be modified through the web admin (https://bugs.jqueryui.com/admin/general/perm) or the trac-admin command line utility.

I'll make the necessary changes soon.

Krinkle commented 3 years ago

That's very possible, I was just crawling it with wget. I think the ideal state would be something that's completely static so we could just upload it and not worry about maintenance, […]

+1. Would you happen to still have this archive around? A breakdown of some of the directory structure and the directory sizes could be handy. I suspect that a large majority of the space might be in subpaths we could discard from the archive (either through params to wget, or by removing them afterwards). Useful non-bug content is likely either obsolete or can be redirected elsewhere with Apache rules, so we might only have to serve a select few paths, such as /ticket/:number (which are generally linked from commits, old discussions, and between tickets), and the "useful" things linked from there, such as /attachment/* and /raw-attachment/* (plus a handful of generic subresources, like the shared stylesheet for the site). This is how we handled the archiving of Bugzilla at Wikimedia back in 2014 (e.g. https://static-bugzilla.wikimedia.org/).
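
The path selection described above could be sketched as a small allowlist filter. This is only an illustration, not what was actually deployed: the `KEEP_PATTERNS` list and the `should_keep` helper are hypothetical names, the `/chrome/` prefix is an assumption about where Trac serves its shared stylesheet and scripts, and dropping every URL with a query string is an assumption about where the duplicate views come from.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist based on the paths named in the comment above.
# The "/chrome/" prefix for shared static assets is an assumption.
KEEP_PATTERNS = [
    re.compile(r"^/ticket/\d+$"),
    re.compile(r"^/attachment/.+"),
    re.compile(r"^/raw-attachment/.+"),
    re.compile(r"^/chrome/.+"),
]


def should_keep(url: str) -> bool:
    """Return True if the URL's path belongs in the static archive."""
    parts = urlsplit(url)
    # Assumption: URLs with a query string (sorting, diffs, alternate
    # formats) are duplicate views of a page and can be discarded.
    if parts.query:
        return False
    return any(pattern.match(parts.path) for pattern in KEEP_PATTERNS)
```

A crawl's URL list could be passed through `should_keep` before downloading, or the same patterns could be applied to prune an existing dump.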

brianwarner commented 3 years ago

Sure do, here's the smallest one. Different options ranged from 2.7GB to a hilariously large 192 GB. Links should be converted and prereqs fetched so it'll resolve offline, though I haven't tested that rigorously.

https://drive.google.com/file/d/1L-_BKvt64yVL1S0JuKP-BKEc9PaYjblC/view?usp=sharing
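
For anyone who wants the size breakdown requested earlier in the thread, a rough sketch of how it could be computed from an unpacked dump follows. The function name `size_by_top_level` is hypothetical, and it assumes the archive has been extracted to a local directory.

```python
import os
from collections import defaultdict


def size_by_top_level(root: str) -> dict[str, int]:
    """Sum file sizes in bytes, grouped by top-level directory under root."""
    totals: dict[str, int] = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Files directly in root are grouped under ".".
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            totals[top] += os.path.getsize(path)
    return dict(totals)
```

Sorting the result by value, for example `sorted(size_by_top_level("dump").items(), key=lambda kv: kv[1], reverse=True)`, shows which subpaths dominate the archive.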

mgol commented 3 years ago

@rjollos Hey, where are we with making the UI Trac read-only? We're nearing a release and it'd be good to have it done before that happens.

rjollos commented 3 years ago

I revoked permissions for authenticated users and only the users in the admins and contributors group will be able to create tickets now. Someone may want to edit the wiki landing page to instruct users to report bugs at GitHub.
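
To illustrate why revoking permissions from authenticated users makes the site read-only for most people, here is a deliberately simplified model of group-based permission resolution. It is not Trac's actual code or schema: the `PERMISSIONS` table, the user `alice`, and the specific action sets are hypothetical, chosen only to mirror the situation described above.

```python
# Simplified model: each subject maps to a set of entries. An entry is
# either an action (upper-case, e.g. "TICKET_CREATE") or a group name.
# All contents are illustrative, not taken from the real instance.
PERMISSIONS = {
    "anonymous": {"TICKET_VIEW", "WIKI_VIEW"},
    "authenticated": set(),  # write permissions revoked from this group
    "contributors": {"TICKET_CREATE", "TICKET_MODIFY"},
    "admins": {"TRAC_ADMIN", "TICKET_CREATE"},
    "alice": {"contributors"},  # hypothetical member of contributors
}


def effective_actions(user: str, logged_in: bool = True) -> set[str]:
    """Collect every action a user holds directly or through groups."""
    pending = ["anonymous"]
    if logged_in:
        pending += ["authenticated", user]
    seen: set[str] = set()
    actions: set[str] = set()
    while pending:
        subject = pending.pop()
        if subject in seen:
            continue
        seen.add(subject)
        for entry in PERMISSIONS.get(subject, ()):
            if entry.isupper():
                actions.add(entry)      # a concrete action
            else:
                pending.append(entry)   # a group to expand
    return actions
```

In this model, `effective_actions("alice")` includes `TICKET_CREATE` through the contributors group, while an ordinary logged-in user with no group membership only inherits the view permissions granted to anonymous visitors.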

Krinkle commented 3 years ago

Thanks @rjollos, and for maintaining Trac all these years. ♥️

mgol commented 3 years ago

I've been thinking: could we redirect https://bugs.jqueryui.com/newticket to https://github.com/jquery/jquery-ui/issues/new and https://bugs.jquery.com/newticket to https://github.com/jquery/jquery/issues/new? There are links to those legacy URLs in various places.
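
The proposed mapping can be written down as a small lookup, which may be handy for checking a web server rule against the intended behaviour. This is only a sketch of the mapping, not the server configuration; `NEW_TICKET_REDIRECTS` and `redirect_target` are hypothetical names.

```python
from typing import Optional
from urllib.parse import urlsplit

# Legacy Trac host -> GitHub "new issue" page, as proposed above.
NEW_TICKET_REDIRECTS = {
    "bugs.jqueryui.com": "https://github.com/jquery/jquery-ui/issues/new",
    "bugs.jquery.com": "https://github.com/jquery/jquery/issues/new",
}


def redirect_target(url: str) -> Optional[str]:
    """Return the GitHub URL for a legacy /newticket link, else None."""
    parts = urlsplit(url)
    # Treating a trailing slash as equivalent is an assumption.
    if parts.path.rstrip("/") == "/newticket":
        return NEW_TICKET_REDIRECTS.get(parts.hostname)
    return None
```

Any other path, such as a `/ticket/:number` page, returns `None` and would be served as usual.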

rjollos commented 3 years ago

Yeah, I think those redirects could be added. I can take a look at it on Monday.

rjollos commented 3 years ago

I added the two redirects by editing /etc/nginx/conf.d/vhost_autogen.conf. I wasn't able to confirm the puppet changes are correct because:

root@trac:~# puppet agent --test
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Duplicate declaration: Package[git] is already declared in file /etc/puppet/manifests/nodes/default.pp:34; cannot redeclare at /etc/puppet/manifests/nodes/trac.pp:42 on node trac.ops.jquery.net
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

It seems straightforward enough, except:

root@trac:~# ls /etc/puppet/manifests/nodes/default.pp
ls: cannot access /etc/puppet/manifests/nodes/default.pp: No such file or directory
root@trac:~# ls  /etc/puppet/manifests/nodes/trac.pp
ls: cannot access /etc/puppet/manifests/nodes/trac.pp: No such file or directory

The New Ticket navigation item is now shown for anonymous users.

Krinkle commented 1 year ago

@timmywil expressed interest in helping with this. Let me know if you need any help with access.

To access the current Trac server, send a PR adding your ssh key to /modules/user/manifests/virtual.pp and /modules/user/manifests/sysadmins.pp against this (private) infra repo.

The provisioning of the new static server would take place in public at https://github.com/jquery/infrastructure-puppet, but Taavi and I can help with that if needed. I suggest publishing the export to a Git repo that we can then sling up onto a static server.

timmywil commented 1 year ago

Sites are ready to deploy at https://github.com/jquery/bugs.jquery.com and https://github.com/jquery/bugs.jqueryui.com. Note that some redirects will need to be deactivated.

Krinkle commented 1 year ago

Note that some redirects will need to be deactivated.

@timmywil Which ones? I propose to do bugs.jqueryui.com first, so let me know which redirects need to be removed.

Sites are ready to deploy at https://github.com/jquery/bugs.jquery.com and https://github.com/jquery/bugs.jqueryui.com.

Could you update the deploy job to push the result to a branch? See the qunitjs.com workflow for an example. This way we have a permanent copy of the static site HTML, rather than depending on the current OS/Node/npm packages still working in the future to reproduce the site exactly. As a bonus, it lets us diff changes (if any) in the future, and means I can securely deploy it as a simple static site that pulls from a public repo (no credentials, no push) without executing any code on the server.

I will include it in the org-wide jquery webhook since it's easy to do, so that updates are immediately live, same as GitHub Pages.

timmywil commented 1 year ago

Which ones?

Any and all related to these sites. I don't think I have access to see the list, but redirects will no longer be necessary. I know that https://bugs.jquery.com/newticket currently redirects to core github (well, it used to. It's actually broken now and goes to a 404 page). That will no longer be necessary. It may just be /newticket on both sites, but it'd be good to double check.

Could you update the deploy job to push the result to a branch?

Will do

Krinkle commented 1 year ago

Which ones?

Any and all related to these sites. I don't think I have access to see the list, but redirects will no longer be necessary. I know that https://bugs.jquery.com/newticket currently redirects to core github (well, it used to. It's actually broken now and goes to a 404 page). That will no longer be necessary. It may just be /newticket on both sites, but it'd be good to double check.

In Cloudflare for bugs.jqueryui.com, I see no redirect rules.

The old Trac server has a local Nginx proxy, configured in the private repo at https://github.com/jquery/infrastructure/blob/33557055ee/modules/jquery/templates/nginx/trac.conf.erb#L35. That seems to define /newticket as the only redirect.

In any event, if we're only worried about disabling redirects, then we don't have to look any further. When we switch the domain to a new server serving a static site, the old server and any redirects it may have naturally go away from the public POV.

timmywil commented 1 year ago

Trac sites are now being deployed to branches. I made the repos public. I was just cautious when I first started, but everything in there is public already.

Edit: The workflow would require new SSH keys in order to clone the private repo in GH actions. I started doing that and then realized they should just be public.

Krinkle commented 1 year ago

@timmywil I've switched over bugs.jqueryui.com. It looks like pagefind.js is missing from the latest few builds. This wasn't visible on the github.io preview because that was still sourced from an earlier commit, based on the now-non-existent "action".

To confirm whether bugs.jquery.com is affected as well, I've updated the Pages settings at https://github.com/jquery/bugs.jquery.com/ from Action to Branch: gh-pages. The latest version does appear to be working fine there, so something appears to be up with the jqueryui archive specifically.

No rush. We'll catch up when you're back in two weeks :)

Krinkle commented 11 months ago

@timmywil @rjollos I've powered off the trac.ops.jquery.net droplet. I'll give it two weeks before deleting. Let me know if you'd like it back on for anything!

rjollos commented 11 months ago

@timmywil @rjollos I've powered off the trac.ops.jquery.net droplet. I'll give it two weeks before deleting. Let me know if you'd like it back on for anything!

Thanks, that shouldn't be a problem.

timmywil commented 11 months ago

The pagefind issue on the UI site should be fixed in the next deployment. Turns out I forgot to update pagefind.

timmywil commented 11 months ago

Looks fixed. Can probably close this ticket now.

Krinkle commented 11 months ago

I haven't deleted the droplet and Tarsnap backups yet.

Krinkle commented 11 months ago