SuffolkLITLab / courtformsonline.org

Code for courtformsonline.org index of guided interviews. This replaces the "massaccess" repo
https://courtformsonline.org
MIT License

Should this repo be renamed to avoid URL issues with GitHub Pages (edit: and Google)? #29

Closed: samglover closed this issue 3 months ago

samglover commented 3 months ago

I realize @jtmst is actively developing this repo, but this is probably related to the launch checklist. Currently this repo is publishing to GitHub Pages as suffolklitlab.org/courtformsonline.org. That's a goofy URL, but it's obviously not where the website will live.

If GitHub Pages is a necessary part of deployment, we might want to rename the repo because we are going to switch to using subdomains, and I don't think courtformsonline.org.suffolklitlab.org is a valid URL. If GitHub Pages is not necessary, it should probably be disabled.

Also pinging @nonprofittechy and @KindBill.

nonprofittechy commented 3 months ago

I am pretty sure it's possible to use GitHub Pages without making the domain name the same as the repository name, and that you can control the domain name on a per-repository basis. But this is a good issue to keep open for now. @KindBill may have additional thoughts.

I think we can add a CNAME file to an individual repo to set that repo's domain.
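
For reference, here is a minimal sketch of the CNAME approach, assuming the site is published from a branch or folder. If the site is deployed through a custom GitHub Actions workflow instead, the custom domain is set in the repository's Pages settings and a CNAME file is ignored. The `write_cname` helper and the `out` directory name are hypothetical; the only requirement is that a file named `CNAME`, containing just the bare domain, ends up at the root of whatever gets published.

```python
from pathlib import Path


def write_cname(publish_dir: str, domain: str) -> Path:
    """Write a GitHub Pages CNAME file containing a single bare domain.

    Hypothetical helper for illustration; not part of this repo.
    """
    domain = domain.strip().lower()
    # A CNAME file holds only the host name: no scheme and no path.
    if not domain or "://" in domain or "/" in domain:
        raise ValueError(f"expected a bare domain, got {domain!r}")
    target = Path(publish_dir) / "CNAME"
    # Create the publish directory if the build has not made it yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(domain + "\n", encoding="utf-8")
    return target
```

Calling `write_cname("out", "courtformsonline.org")` as a last build step would leave `out/CNAME` in place, so the custom domain would not depend on the repository name.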

KindBill commented 3 months ago

Hey @samglover, I discussed this with @jtmst, and we came to the same conclusion Quinten mentioned. It shouldn't be an issue. That's just what Pages defaults to when no custom domain has been configured.

No need to rename the repo for now.

samglover commented 3 months ago

This is what I'm seeing when I look at the deployments:

From what you're all saying, it seems like it shouldn't be publishing to suffolklitlab.org/courtformsonline.org, but it is.

Another reason I'm asking is that this is creating a bit of an SEO mess for the existing website. Google thinks that both massaccess.suffolklitlab.org and courtformsonline.org exist, which means every page on courtformsonline.org has an exact duplicate in Google's index at massaccess.suffolklitlab.org. Fortunately Google hasn't indexed suffolklitlab.org/massaccess, but GitHub Pages still seems to think it is publishing to that URL, so there is at least the potential for triplicates!

It seems like this could be an issue for people who are trying to find these interviews and court forms. Google demotes duplicate content, and the algorithm for how it decides which is the original and which is the copy can be unpredictable.

I can probably fix some of this with redirects, but it will keep cropping up with every new interview we add if we don't find a more permanent fix.
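
To make the redirect idea concrete, here is a rough sketch of the mapping involved. It assumes every page on the duplicate host has the same path on courtformsonline.org, which is how the duplication is described above. The function name and the example paths are hypothetical. It only covers the host-level duplicate (massaccess.suffolklitlab.org), not the path-based suffolklitlab.org/massaccess URL.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "courtformsonline.org"
# Host from this thread that currently serves a duplicate copy of the site.
DUPLICATE_HOSTS = {"massaccess.suffolklitlab.org"}


def redirect_target(url: str) -> Optional[str]:
    """Return the canonical URL to redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in DUPLICATE_HOSTS:
        return None
    # Keep path, query, and fragment; swap only the scheme and host.
    return urlunsplit(
        ("https", CANONICAL_HOST, parts.path or "/", parts.query, parts.fragment)
    )
```

Each such mapping would be served as a permanent (301) redirect, so search engines consolidate the duplicate onto the canonical page. A mapping like this is a stopgap: it has to cover every duplicate host, which is why having a single public domain is the cleaner fix.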

tl;dr: We need to have a single domain available on the public web, and it needs to be courtformsonline.org.

nonprofittechy commented 3 months ago

Once we finish Kind's work on the courtformsonline.org repo, we should archive the massaccess repo. The new repo is meant to be a total replacement.

This is supposed to publish to suffolklitlab.org/courtformsonline.org, but only until we're ready to go live with this version of the site. It's going to that URL for now so that we can easily validate the changes Kind makes to the repository without having to set up a local environment.

Once it's ready for production, we'll add the right CNAME file to point this to courtformsonline.org without publishing to any subdomain or path on suffolklitlab.org.

I think we should be able to hold off on any changes until this is ready for production, and live with the possible SEO consequences for a few weeks. Either way, the right fix shouldn't require renaming the repo, just a custom configuration for this repo's GitHub Pages publishing.