seanlinsley closed this issue 10 years ago
Appears to be a Twitter issue: https://dev.twitter.com/discussions/25385.
Here's a script to convert si0 to pbs if we need to go that route (take out the rollback):
    BEGIN;
        UPDATE elsewhere SET user_info=user_info || (
            'profile_image_url_https'=>(
                -- 'https://si0.' is 12 characters and substring() is 1-based,
                -- so keep everything from position 13 onward.
                'https://pbs.' || substring(user_info->'profile_image_url_https' from 13)
            )
        ) WHERE platform='twitter';
        SELECT user_info->'profile_image_url_https' FROM elsewhere WHERE platform='twitter';
    ROLLBACK;
    END;
Still borken.
@whit537 do all URLs in the database use si0 currently?
@seanlinsley https://botbot.me/freenode/gittip/msg/10088765/
Is there any correlation between the 4431 users with a pbs URL? Are they new?
Are there any other subdomains in use?
@seanlinsley I really don't know. But I know that there are more subdomains on twimg (like a0).
I suggest changing @whit537's SQL and writing a script that does this (pseudocode):
    # check whether the current image URL still works
    if get(profile_image_url_https).code in [403, 404]:
        # check whether the pbs version of the URL works
        if get(updated_to_pbs_profile_image_url_https).code in [200, 301]:
            # if it works, update the URL
            update_url()
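For reference, the pseudocode above could be fleshed out roughly as follows. This is only a sketch: `to_pbs`, `status_of`, and `replacement_for` are hypothetical names, the `https://si0.` and `https://pbs.` prefixes are taken from the SQL earlier in the thread, and writing the new URL back to the database is left out.

```python
OLD_PREFIX = "https://si0."
NEW_PREFIX = "https://pbs."


def to_pbs(url):
    """Swap a leading https://si0. for https://pbs.; leave other URLs untouched."""
    if url.startswith(OLD_PREFIX):
        return NEW_PREFIX + url[len(OLD_PREFIX):]
    return url


def status_of(url):
    """Return the HTTP status code for url without downloading the image."""
    # Imported lazily so the pure helpers work without the dependency.
    import requests
    # allow_redirects=False keeps a 301 visible instead of following it.
    return requests.head(url, allow_redirects=False, timeout=10).status_code


def replacement_for(url, status_of=status_of):
    """Return the pbs URL if the current URL is broken and the pbs one works.

    Returns None when nothing should change.
    """
    if status_of(url) not in (403, 404):
        return None  # current URL is fine (or failing in some other way)
    candidate = to_pbs(url)
    if candidate != url and status_of(candidate) in (200, 301):
        return candidate
    return None
```

Passing `status_of` as a parameter makes it possible to try the decision logic against canned status codes before pointing it at Twitter.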
Some of these are coming back, others aren't. MaxCDN and Bountysource are back, UkuleleRod isn't. Could be because the first two have logged in since this started.
Confirmed: MaxCDN and Bountysource are now on pbs, while UkuleleRod is still on si0. I checked a backup from last week and all three were on si0 last week.
What's the harm in switching everyone who is si0 to pbs, per https://github.com/gittip/www.gittip.com/issues/1936#issuecomment-33078357? I suppose we're assuming that all si0s are busted and all pbss are good. We could/should verify that assumption before pulling the trigger.
    #!/usr/bin/env python
    import sys

    import requests

    # Print the status code and URL for every URL that does not return 200.
    for line in open('twimg.csv'):
        url = line.strip()
        response = requests.get(url)
        if response.status_code != 200:
            print(response.status_code, url)
            sys.stdout.flush()
I'm running that script against 18,960 URLs. Will report back ...
Just don't do it from production :D
:-)
[gittip] $ grep "403 " twimg.log | wc -l
13123
[gittip] $ grep "404 " twimg.log | wc -l
534
[gittip] $ grep "si0" twimg.log | wc -l
13123
[gittip] $ grep "pbs" twimg.log | wc -l
534
[gittip] $ wc -l twimg.log
13657 twimg.log
[gittip] $ echo 13123 534 + p | dc
13657
[gittip] $
The script died before reaching 18,960, not sure why. Also, why are the pbs ones 404 instead of 200?
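The thread never establishes why the script died. One plausible but unconfirmed cause is an unhandled exception on a single request (a timeout or a reset connection). A minimal sketch of a checker that records such failures instead of stopping, and tallies the results the way the `grep | wc -l` commands above do, might look like this. `check_urls`, `tally`, and the `fetch_status` callable are hypothetical names, not part of the original script.

```python
from collections import Counter
from urllib.parse import urlparse


def check_urls(urls, fetch_status, log):
    """Append (status, url) to log for every URL that is not a 200.

    fetch_status is any callable returning an HTTP status code for a URL.
    An exception on one URL is recorded rather than ending the run.
    """
    for url in urls:
        try:
            status = fetch_status(url)
        except Exception as exc:  # e.g. timeouts, connection resets
            status = "ERR:" + type(exc).__name__
        if status != 200:
            log.append((status, url))
    return log


def tally(log):
    """Count logged failures by (subdomain, status)."""
    counts = Counter()
    for status, url in log:
        host = urlparse(url).hostname or ""
        counts[(host.split(".")[0], status)] += 1
    return counts
```

With the `requests` library, `fetch_status` could be something like `lambda url: requests.get(url, timeout=10).status_code`. The explicit timeout matters because `requests` does not time out by default.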
Blech. This sucks.
The right ways to fix this are:
Neither of those is trivial.
There's a script in #1989 to fix this as a one-off. Spinning up a DO VPS to run it (using the payday image) ...
The script died mysteriously (forgot to redirect stderr :/ ) after processing 4036 accounts. Before rerunning it's probably worth rewriting to use users/lookup (100 at a time) instead of users/show (one at a time), per https://github.com/gittip/www.gittip.com/pull/1989#issuecomment-34416629.
Rewrote the script to use lookup and rerunning it now. It still has a 5-second sleep between hits. If we were under 18,000 we could fit inside one 15 minute window, but we're at ~19,000.
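The arithmetic behind that can be sketched as below. The batch size of 100 and the 5-second sleep come from the comments above. The limit of 180 calls per 15-minute window is an assumption inferred from the 18,000 figure (18,000 / 100), not something stated in the thread. `batches` and `plan` are hypothetical helpers, not the script from #1989.

```python
import math

BATCH_SIZE = 100        # accounts per users/lookup call (from the thread)
SLEEP_SECONDS = 5       # pause between calls (from the thread)
CALLS_PER_WINDOW = 180  # assumption: inferred from 18,000 accounts / 100 per call


def batches(items, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan(n_accounts):
    """Estimate the number of calls and the total sleep time for a run."""
    calls = math.ceil(n_accounts / BATCH_SIZE)
    return {
        "calls": calls,
        "fits_one_window": calls <= CALLS_PER_WINDOW,
        "sleep_minutes": calls * SLEEP_SECONDS / 60,
    }


print(plan(19000))
```

For ~19,000 accounts this gives 190 calls, which is over the assumed limit of 180 for one window, and just under 16 minutes of sleeping, which is in line with the 15-20 minute estimate below.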
This should be done in 15-20 minutes.
Done! :dancer:
This is currently being discussed in IRC.