pantheon-systems / localdev-issues

Issue tracking for Pantheon localdev

localdev failing to sync. Process fails between 4-6% #116

Open hckia opened 2 years ago

hckia commented 2 years ago

Found two cases where accounts were able to authenticate, but not sync, with the process always failing between 4-6%

Screen Shot 2021-10-25 at 1 47 10 PM

Screen Shot 2021-10-25 at 1 47 48 PM

What has been attempted or ruled out

  1. Reinstalling localdev and docker
  2. Trying a token from an account with a single site (success)
  3. Trying a token from an account with almost 2k more sites (success after reattempting)

What has not been ruled out

  1. Existence of orphaned sites (sites frozen too long that we cannot or did not get restored, or that show up via terminus but not the dashboard, or vice versa).

Steps to reproduce

Since we don't know what the cause is, the only recommendation I can give is to add all the same sites I'm on, or I can provide a token, but the process with my token would be this...

  1. load localdev
  2. walk through step 1 (turning off error logging will actually make the sync process go "farther")
  3. use my token
  4. sync sites and wait for it to fail

nks04747 commented 2 years ago

I'm having the same issue and have a somewhat simpler example. I was using Localdev with no issues, but started getting a random error when starting up Localdev (I don't remember the exact error; it was not descriptive at all). I eventually noticed Localdev was no longer syncing sites I'd added or removed. I should mention that I could push and pull with no problem; it just wouldn't sync. After uninstalling and reinstalling Localdev, I can no longer use Localdev with my token because it never gets past "Sync your account with Localdev".

I finally tried a coworker's token and got right in. I have two coworkers, and either of their tokens works with no problem on both of the machines where my tokens won't work. We've tried tokens from my account on their machines, and they fail to sync with Localdev. The only difference in our accounts seems to be that I have an old personal Drupal install in Pantheon that shows as frozen, but if I click on it, it just sends me back to my dashboard (so there is no way to unfreeze or delete it). Pantheon support says it shows as already deleted, but my dashboard just sees it as frozen. I'm pretty sure this site that is "there but it isn't" is what is causing Localdev to freak out when it tries to sync my sites.

I've tried countless reinstalls of Localdev, Localdev Edge, Docker, etc., but the fact that my tokens don't work on numerous machines, while my coworkers' tokens work on those same machines/installs, shows that it has to do with Localdev tripping on something in my account when it syncs.

As hckia said, about the only way to let you reproduce this is to give you my machine token.

netw3rker commented 2 years ago

I just ran into the same problem, and this thread was really helpful in figuring out what was going on. It turned out I had some "orphan sites" that were in the way. What was interesting for me was that they didn't really show up in terminus. Here's what I did:

1) Run terminus site:list -vvv. This will give you the data from the service endpoint that terminus gets the site list from. It will be in JSON format, and you'll need a JSON viewer/editor for the next step. It will look something like this:

Headers: {"Content-type":"application/json","User-Agent":"Terminus/2.5.0 (php_version=7.4.20&script=/Users/pantheonuser/terminus/terminus)","Authorization":"**HIDDEN**"}
URI: https://terminus.pantheon.io:443/api/users/e5409ee7-dbc3-4b4a-880f-r3d4cted/memberships/sites?limit=100
Method: GET
Body: null
 [debug] #### RESPONSE ####
Headers: {"server":["nginx"],"date":["Mon, 08 Nov 2021 17:11:42 GMT"],"content-type":["application\/json; charset=utf-8"],"transfer-encoding":["chunked"],"connection":["keep-alive"],"x-pantheon-trace-id":["eeac7d70-40b6-11ec-8c26-r3d4cted"],"x-frame-options":["deny"],"set-cookie":["_csrf=GPC2uZsIVpzcM-4WDyr3d4cted; Path=\/; HttpOnly; Secure"],"access-control-allow-methods":["GET"],"access-control-allow-headers":["Origin, Content-Type, Accept"],"cache-control":["private, max-age=0, no-cache, no-store"],"pragma":["no-cache"],"vary":["Accept-Encoding"],"strict-transport-security":["max-age=31536000"]}
Data: [{"archived":false,"invited_by_id":"e5409ee7-dbc3-4b4a-880f-r3d4cted","role":"team_member","id":"a8e71640-b97a-47f1-84be-r3d4cted","key":"e5409ee7-dbc3-4b4a-880f-r3d4cted","site_id":"a8e71640-b97a-47f1-r3d4cted","user_id":"e5409ee7-dbc3-4b4a-880f-r3d4cted","site":{"user_in_charge_id":"e5409ee7-dbc3-4b4a-880f-r3d4cted","product":{"id":"5758ea6a-bfda-4d56-bc0c-r3d4cted","longname":"WordPress"},"service_level":"free","user_in_charge":{"profile":{"marketing_email_consent_implied":false,"experiments":{"welcome_video":"shown"},"full_name":"Pantheon User","pullFromLive":true,"initial_identity_strategy":null,"web_services_business":null,"verify":"7a56373dee645d139ba75fr3d4cted","state":null,"registration_context":"new_developer","opt_in_new_dashboard":false,"job_function":"","firstname":"Pantheon","lastname":"User","pda_campaign":"from-https:\/\/pantheon.io\/"
...
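The Data: line in output like the above can also be pulled out with a short script instead of being copied by hand. This is only a minimal Python sketch: it assumes the verbose output has already been captured as text, and the function name extract_data_payloads is made up for illustration. It is not part of terminus.

```python
import json


def extract_data_payloads(log_text):
    """Parse the JSON that follows 'Data: ' on each matching log line.

    Assumption: each response body sits on a single line that starts
    with 'Data: ', as in the sample output quoted in this thread.
    """
    prefix = "Data: "
    payloads = []
    for line in log_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith(prefix):
            continue
        try:
            payloads.append(json.loads(stripped[len(prefix):]))
        except json.JSONDecodeError:
            # Truncated or non-JSON lines are skipped rather than guessed at.
            continue
    return payloads
```

If the output was saved to a file, reading it with open(path).read() and passing the text to this function should return one parsed list per response. Each list can then be inspected in place of a JSON viewer.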

2) Load the JSON data from the Data: line into a JSON reader.

3) Review the list of sites that comes back for anything out of the ordinary. Ideally there should be a 1:1 relationship between the root values in the data, what you see on your user dashboard, and what you see in the final results of the terminus site list. However, it's likely you'll have different numbers. For me, I found 5 entries that looked like this:

  {
    "archived": false,
    "invited_by_id": "e5409ee7-dbc3-4b4a-[redacted]",
    "role": "team_member",
    "id": "dd8de000-3deb-[redacted]",
    "key": "e5409ee7-dbc3-[redacted]",
    "site_id": "dd8de000-3deb-44ad-[redacted]",
    "user_id": "e5409ee7-dbc3-4b4a-[redacted]"
  },
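Compared with the normal entry in the verbose output, the entry above appears to have no nested "site" object. That is an inference from the two samples in this thread, not documented API behavior, so treat the following Python sketch as a starting point for a manual review. The function name find_orphan_candidates and the sample values are made up for illustration.

```python
def find_orphan_candidates(memberships):
    """Return the site_id of every membership without a nested 'site' object.

    Assumption: a missing or empty 'site' key marks a suspect entry,
    based only on the two samples shown in this thread.
    """
    return [m.get("site_id") for m in memberships if not m.get("site")]


# Illustrative data shaped like the entries above (IDs are placeholders).
sample = [
    {"site_id": "site-a", "role": "team_member", "site": {"service_level": "free"}},
    {"site_id": "site-b", "role": "team_member"},
]
print(find_orphan_candidates(sample))  # prints ['site-b']
```

The returned IDs are only candidates. Each one still needs to be checked against the dashboard before contacting support.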

4) Make a list of sites with 'site_id' values that don't appear in your dashboard as sites (check the URLs), and contact support to request to have them 'de-orphaned' and, optionally, deleted via terminus (they cannot be accessed or deleted in any other way).

5) If support won't delete them, just run terminus site:delete [site_id] after they are de-orphaned. It will throw a few errors, but it will delete them.

Once you run that, you should be able to get localdev to sync.
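The comparison in step 4 can be sketched as a set difference. This Python example assumes the 'site_id' values have been collected from the Data: line and that the dashboard IDs were copied by hand from the dashboard URLs. The function name and the IDs are placeholders.

```python
def missing_from_dashboard(membership_site_ids, dashboard_site_ids):
    """Return site IDs present in the membership data but not on the dashboard."""
    return sorted(set(membership_site_ids) - set(dashboard_site_ids))


# Placeholder IDs for illustration only.
from_api = ["site-a", "site-b", "site-c"]
from_dashboard = ["site-a", "site-c"]
print(missing_from_dashboard(from_api, from_dashboard))  # prints ['site-b']
```

The resulting list is what would be sent to support with the de-orphaning request.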

hckia commented 2 years ago

In @nks04747's instance, the site couldn't be deleted because it was part of a defunct org. Support was able to remove the site by adding it to an existing internal org, then deleting the site. Still trying to confirm this with two other cases that seem to have multiple problematic sites.