hackforla / website

Hack for LA's website
https://www.hackforla.org
GNU General Public License v2.0
315 stars 754 forks

ER: Unexpected Website Deployment in Forked Repo #6689

Open tony1ee opened 4 months ago

tony1ee commented 4 months ago

Emergent Requirement - Problem

Forks of hackforla/website are unexpectedly deployed under the individual's <github-handle>.github.io domain.

Further investigation is needed to locate the exact cause of this unexpected deployment. Preliminary research suggests it happens without any related action by the developer, and that it affects GitHub accounts that had no GitHub Pages set up before forking the repo.

Update: further testing suggests the website repo is deployed every time the fork is synced with upstream, possibly by the dynamically triggered workflow pages-build-deployment.

Issue you discovered this emergent requirement in

Date discovered

04/18/2024 See link to slack discussion below in Resources

Did you have to do something temporarily

See Potential solutions [draft] for temporary solution

Who was involved

@tony1ee

What happens if this is not addressed

Resources

Since testing is done locally using the Docker image and compared against the hackforla.org website, is there any use for the <github-handle>.github.io/website deployment? It might also cause conflicts for someone who already has a site at <github-handle>.github.io.

Elliot Kim Where does it say to deploy to your github.io?

Tony Li I think it’s automatic. for example, for your account, it’s already deployed at https://elliot-d-kim.github.io/hfla-website/

Elliot Kim Thanks for pointing that out, I didn't know! I agree this could potentially cause issues. Also sent to the channel

Tony Li looks like you can go to the forked repo, then settings → pages, to find the button to unpublish.

Fang @ Tony Li I'm just commenting on how I see things and this is not the guideline of the website project. I don't think it will cause conflicts unless their <github-handle>/<repo> project also has a website folder. This is how GitHub allows users to publish sites on ghpages: <github-handle>.github.io/<repo>. I think it's up to the individual GitHub user to handle such conflicts. Maybe it's good to mention the potential for conflict in case people aren't aware. The <github-handle>.github.io/website deployment is a copy of the hackforla website at the point when you updated your gh-pages branch, which may or may not be the same as the current website. So I guess it can serve as a reference website for your current feature branch changes while you're working on it. The current website might have further changes from other devs. If you're at the point of wanting to do a PR, then you should pull from and also compare against the current website. I'm just saying there might be a beneficial use for the <github-handle>.github.io/website deployment.

Tony Li @ Fang Thanks for joining the discussion. Keep in mind that although the home page is at <github-handle>.github.io/website, not every linked resource is under /website (e.g. /join for joining). Since both the domain and the root of the website are different, this deployment can have unexpected errors (e.g. the sponsor logos not displaying because the path returns 404), so I don’t think it could be used as a reference. Especially when you already have a personal ghpages website deployed, this can cause broken links and even SEO issues. Another potential issue for HackForLA is having these different versions of the website on developers' domains, some potentially with outdated or inaccurate info (when behind hackforla/website).

Fang @ Tony Li I see your point. You're thinking of protecting the user's ghpages deployment from the project. That does make sense. I was thinking from the hackforla project's point of view. The outdated deployment is not really an issue since the real website is on its own .org domain rather than github.io. I'm not familiar with the sponsor logo paths so you have better info on that.

Tony Li @ Roslyn Wythe this needs some attention

Roslyn Wythe @ Tony Li @ Fang At one time we considered instructing developers how to use GitHub pages on their forks and PR branches, to do testing as an alternative to docker, and we found a way to fix the problem with the broken paths, but we decided against making use of it because we didn't want various versions of the website publicly available on the internet, even on a domain other than hackforla.org.

Roslyn Wythe @ Elliot Kim did you say that you hadn't enabled GitHub pages on the forked repository? I would be surprised if it was enabled by default, but I guess that is possible

Elliot Kim Yes, I don't remember enabling it at any point.

Elliot Kim I just went through some Pre-work Checklists and many (not all) of the recent people to join have the HfLA website on their GitHub pages too, so it doesn't appear to be an isolated incident.

Roslyn Wythe OK thank you all. I didn't realize the situation was widespread! I suppose we could instruct developers to disable GitHub pages on their fork of the website repository.
@ Tony Li Would you be willing to write an ER describing this situation?

Tony Li sure! happy to!

Tony Li ER #6689 drafted @ Roslyn Wythe

Recommended Action Items

Potential solutions [draft]

For current contributors, we can instruct them to:

For future contributors, we can:

For inactive developers:

tony1ee commented 4 months ago

Please advise on applying proper labels to this ER, thank you.

elliot-d-kim commented 4 months ago

The onboarding documentation could be updated to prevent future instances, but what about current developers? Could there be a change that developers could "pull"? And what about past developers?

tony1ee commented 4 months ago

@elliot-d-kim since the deployment is for the forked repo, I think we need to inform all developers who have forked the website repo of this issue and the steps to unpublish. I do not see a way for us to control their individual repo settings.

tony1ee commented 4 months ago

Update: looks like the temporary fix won't work as initially suggested, since every time the fork is synced with upstream, the site gets redeployed again, seemingly by a workflow triggered automatically.

tony1ee commented 4 months ago

Another possible solution could be to instruct developers to turn off GHA in their forked repo, but this would turn off pages-build-deployment as well as all other workflows, so I'm unsure whether this is an acceptable solution.

tony1ee commented 4 months ago

Update: looks like the temporary fix won't work as initially suggested, since every time the fork is synced with upstream, the site gets redeployed again, seemingly by a workflow triggered automatically.

Looks like this is expected behavior for GitHub Pages: Per GitHub Pages Docs,

A successful workflow run in the repository for your site will create a new deployment. Trigger a workflow run to redeploy your site.

t-will-gillis commented 4 months ago

Hi @tony1ee - Just starting to look at this but as a first step, you reference a Slack conversation above: please copy that convo to this issue. Slack conversations are lost after a few months so we can't rely on Slack as a record.

tony1ee commented 4 months ago

Based on the hackforla/website repo's fork data, there are approx. 694 forks.

Many, but not all, of the forks have deployed sites.

tony1ee commented 4 months ago

Proposing a possible temporary solution, as described in the GitHub docs under "Deleting your site by changing the source". Updating it in main comment ...

elliot-d-kim commented 4 months ago

I'm currently exploring custom deployment options.

@t-will-gillis mentioned to me that the condition if: github.repository == 'hackforla/website' can prevent workflows from running on forks. This kind of if statement is already used on the main branch in workflows like Schedule Daily 1100 and Update VRMS Data.

However, the pages-build-deployment workflow in its current configuration does not have a workflow file. It's a "default" workflow from GitHub (From the repo, Settings > Pages > under Build and Deployment, Source > Deploy from a branch).

Switching from 'Deploy from a branch' to the 'GitHub Actions' option would provide us with a selection of template workflow files, including one for deploying Jekyll sites. This workflow file would allow us greater customization, including adding if: github.repository == 'hackforla/website' to deploy the HfLA website only on the main branch.
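A rough sketch of what such a workflow file could look like, loosely following the shape of GitHub's Jekyll starter workflow. The file name, job names, the gh-pages trigger branch (taken from the Slack discussion above), and the action versions are all assumptions, and this has not been tested against the HfLA repo:

```yaml
# .github/workflows/deploy-pages.yml (hypothetical file name)
name: Deploy Jekyll site to Pages

on:
  push:
    branches: [gh-pages]  # assumption: the branch the site is published from
  workflow_dispatch:

# Permissions the Pages deployment actions need
permissions:
  contents: read
  pages: write
  id-token: write

concurrency:
  group: pages
  cancel-in-progress: false

jobs:
  build:
    # Skip on forks: only the upstream repository builds the site
    if: github.repository == 'hackforla/website'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/configure-pages@v5
      - uses: actions/jekyll-build-pages@v1
        with:
          source: ./
          destination: ./_site
      - uses: actions/upload-pages-artifact@v3

  deploy:
    # Already skipped when build is skipped; the guard is repeated for clarity
    if: github.repository == 'hackforla/website'
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - id: deployment
        uses: actions/deploy-pages@v4
```

With the guard in place, a fork that syncs this file should see both jobs skipped rather than a deployment. Whether forks whose Pages source is still set to 'Deploy from a branch' keep running the default pages-build-deployment workflow would need to be tested separately.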

I have a test branch but I haven't yet fully tested switching to the custom configuration. Any thoughts on how appropriate this approach is?

roslynwythe commented 4 months ago

@tony1ee @elliot-d-kim @t-will-gillis I like the "Delete GitHub Pages site by changing the source" option best. Are there any disadvantages to that approach?

tony1ee commented 4 months ago

@roslynwythe It is only a very temporary solution. As I tested just now, after changing the source to "None", a sync with upstream causes the workflow to run again, and the website is therefore re-published.

(My apologies for wrongly stating "This should prevent the website from being re-published." for this method; my repo was not behind upstream at the time of writing, so I could not test further.)

main comment has been updated to reflect this discovery.

Might be useful for instructing inactive developers who will not sync with upstream again. But for active developers who still need to sync with upstream, this won't solve it, and a longer-term solution (looks like we need to tackle it from the GHA angle) is needed.

roslynwythe commented 4 months ago

Thank you @tony1ee for testing the proposal to set "default branch" to None - It is surprising, but I had the same results in my fork: after a sync with upstream, the site is re-published! @elliot-d-kim the approach you described above in https://github.com/hackforla/website/issues/6689#issuecomment-2067732813 is starting to look like the best approach. @t-will-gillis do you concur?