netlify / cli

Netlify Command Line Interface
http://cli.netlify.com
MIT License

Potential race condition in templates feature #4212

Open charliegerard opened 2 years ago

charliegerard commented 2 years ago

Describe the bug

After the release of the templates command sites:create-template we ran into what seems to be a race condition issue that causes the 1st deploy of a new site to occasionally fail with the error Failed during stage 'preparing repo': git ref refs/heads/main does not exist.

It seems like even though the new repo is created on GitHub from a template before creating the new site on Netlify, when the deploy starts, it sometimes can't find the main branch?!

Re-deploying works perfectly fine but we were wondering if there would be a fix for this to ensure this issue doesn't happen. We considered adding some kind of setTimeout before calling the Netlify API to create a new site but it feels a little weird 😕 and it might not even ensure that it works if one day GitHub is just very slow.

We considered asking the team if we could implement some changes to re-try deploys automatically when this error happens but that would be outside of the scope of the CLI.

If you have any other ideas, please let us know!

To Reproduce

Steps to reproduce the behavior (it only happens occasionally, so you might have to try a few times):

  1. Run netlify sites:create-template
  2. Pick the Gatsby or Hugo template
  3. Finish running the command
  4. Sometimes, the 1st deploy on Netlify will automatically fail with the error mentioned above.

Expected behavior

Creating a site from a template via the CLI should succeed on 1st deploy.

--

cc @maxcell & @tzmanics

erezrokah commented 2 years ago

Based on https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template, creating a repo from template is similar to forking one.

We might need to poll until the repo exists, see https://github.com/netlify/netlify-cms/blob/f85960ecf6c125824642e2a664cd0ba2fa44aede/packages/netlify-cms-backend-github/src/implementation.tsx#L183

We already have a dependency that can abstract the polling for us, see https://github.com/netlify/cli/blob/865b699a961eebf2b6378b32096912f604ef654e/package.json#L283
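As a rough illustration of the polling idea above (wait until the branch is visible before creating the site, rather than sleeping for a fixed time), here is a minimal sketch. Everything in it is an assumption made for illustration: `pollUntil`, `makeFakeBranchCheck`, and the option names are hypothetical and are not taken from the CLI or from the dependency linked above. The real check would presumably ask GitHub whether `refs/heads/main` exists in the new repo; here it is replaced by a fake so the example runs without network access.

```typescript
// Hypothetical helper names; nothing here comes from the Netlify CLI codebase.
type Check = () => Promise<boolean>;

interface PollOptions {
  intervalMs: number; // delay between two attempts
  timeoutMs: number; // total time budget before giving up
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls `check` repeatedly until it resolves to true or the time budget runs out.
// Resolves to true if the condition was met, false if polling timed out.
async function pollUntil(
  check: Check,
  { intervalMs, timeoutMs }: PollOptions,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await check()) return true;
    // Stop if the next wait would overshoot the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await sleep(intervalMs);
  }
}

// Stand-in for a real "does refs/heads/main exist yet?" lookup (assumption):
// reports false until the given attempt number is reached.
function makeFakeBranchCheck(readyOnAttempt: number): {
  check: Check;
  attempts: () => number;
} {
  let count = 0;
  return {
    check: async () => {
      count += 1;
      return count >= readyOnAttempt;
    },
    attempts: () => count,
  };
}
```

In this sketch the site would only be created once `pollUntil` resolves to `true`. Unlike a fixed `setTimeout`, the wait adapts to how slow GitHub is on a given day, and the timeout still bounds how long the command can hang.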

nickytonline commented 2 years ago

~This has been on our board for a while. Just following up to see if this is still an issue @erezrokah?~ I see they've moved on to elsewhere. I'm going to follow up with DX and see if this is still an issue.

nickytonline commented 2 years ago

I don't get that error, but I consistently get this error when the step that clones the repo runs (the default answer is yes).

❯ ntl sites:create-template

? Netlify CLI needs access to your GitHub account to configure Webhooks and Deploy Keys. What would
 you like to do? Authorize with GitHub through app.netlify.com
Choose one of our starter templates. Netlify will create a new repo for this template in your GitHub account.
? Template: gatsby-starter-netlify-cms
? Team: Nick Taylor's team
Choose a unique site name (e.g. super-cool-site-by-nickytonline.netlify.app) or leave it blank for a random name. You can update the site name later.
? Site name (optional): undefined

Site Created

Admin URL: https://app.netlify.com/sites/super-cool-site-by-nickytonline
URL:       https://super-cool-site-by-nickytonline.netlify.app
Site ID:   3b12daf1-e7d1-4761-b359-a9f8745231e0
Repo URL:  https://github.com/nickytonline/super-cool-site-by-nickytonline
? Do you want to clone the repository? Yes

 ›   Error: 'git clone' failed with status 128

@maxcell or @tzmanics, is the initial issue opened by Charlie still an issue? Not sure if either of you ever had a chance to look at this.

sarahetter commented 2 years ago

I'm seeing the race condition issue when I'm using my personal Netlify account, so I think the error above is a git permissions issue, as mentioned in Slack.

nickytonline commented 2 years ago

Thanks, @sarahetter. I'll test the scenario with a personal account, as the Git permissions issue is separate. In my case, I'm using my work Netlify account and using my GitHub, which I believe is what is causing the permissions issue.

tzmanics commented 2 years ago

Hmm when we received the error it seemed like it was bc Netlify was trying to deploy before the github repo was built. Also, it was seemingly random, we never knew when it would work or would give us that error. This does seem like the same race condition tho, bc it seems to be trying to grab a repo that doesn't exist yet.

I hate the status 128 tho bc it could mean so many things.

I haven't repeatedly tested this but in doing demos I haven't gotten this error since we first discovered it in Feb.

nickytonline commented 2 years ago

So I copied the generated command for cloning the repo and ran it outside of the CLI, and this is the actual error behind the generic git status 128.

❯ git clone git://github.com/nickytonline/netlify-thinks-nick-taylor-is-great-ff3c4.git netlify-thinks-nick-taylor-is-great-ff3c4
Cloning into 'netlify-thinks-nick-taylor-is-great-ff3c4'...
fatal: remote error: 
  The unauthenticated git protocol on port 9418 is no longer supported.
Please see https://github.blog/2021-09-01-improving-git-protocol-security-github/ for more information.

The format for it to work is this.

git clone git@github.com:nickytonline/netlify-thinks-nick-taylor-is-great-ff3c4.git netlify-thinks-nick-taylor-is-great-ff3c4

I'm going to see about fixing that so we can work on the actual issue.
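For illustration, the URL change described above could look something like the following sketch. The function name `toSshCloneUrl` is hypothetical and not taken from the CLI source; it only covers the `git://github.com/<owner>/<repo>.git` shape shown in the failing command and leaves any other input untouched.

```typescript
// Hypothetical helper, not from the Netlify CLI codebase.
// Rewrites an unauthenticated git:// GitHub URL into the SSH form
// (git@github.com:<owner>/<repo>.git) that worked in the comment above.
function toSshCloneUrl(gitUrl: string): string {
  const match = /^git:\/\/github\.com\/([^/]+)\/([^/]+?)(?:\.git)?$/.exec(gitUrl);
  if (match === null) {
    // Not a git://github.com URL: return it unchanged.
    return gitUrl;
  }
  const [, owner, repo] = match;
  return `git@github.com:${owner}/${repo}.git`;
}

const fixed = toSshCloneUrl(
  "git://github.com/nickytonline/netlify-thinks-nick-taylor-is-great-ff3c4.git",
);
console.log(fixed);
```

The SSH form only works for users who have an SSH key registered with GitHub. Emitting `https://github.com/<owner>/<repo>.git` instead would be another option, depending on how the user authenticates.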

nickytonline commented 2 years ago

Alright, I was able to reproduce this. Going to look at getting a fix up.

8:17:52 AM: Build ready to start
8:17:54 AM: Creating deploy upload records
8:17:54 AM: Failed during stage 'preparing repo': git ref refs/heads/main does not exist
8:17:53 AM: build-image version: (focal)
8:17:53 AM: build-image tag: v4.8.0
8:17:53 AM: buildbot version:
8:17:54 AM: Fetching cached dependencies
8:17:54 AM: Failed to fetch cache, continuing with build
8:17:54 AM: Starting to prepare the repo for build
8:17:54 AM: git ref refs/heads/main does not exist or you do not have permission
8:17:54 AM: Failing build: Failed to prepare repo
8:17:55 AM: Finished processing build request in 1.190993481s

taty2010 commented 1 year ago

Hi @nickytonline, just wanted to post an update on this. I ran into this same issue twice recently when deploying two different templates using the sites:create-template command.

Wondering if this is something we can look into for the next iteration.

nickytonline commented 1 year ago

Yeah, it's still an issue. This is something I looked at when I started in April, but my solution was more of a band-aid. Unfortunately, this hasn't been prioritized at the moment, and given that it's not framework specific, I'm not sure if our team will pick this up again. If we do, it probably won't be any time soon.