micahstubbs closed 6 years ago
ran coffee validate-users.coffee and got rate limited at user felixsch, index 5004, in users-combined.csv
I can check my github api rate limit status with curl -XGET https://api.github.com/rate_limit
sure enough, it's out:
{
  "resources": {
    "core": {
      "limit": 60,
      "remaining": 0,
      "reset": 1503216965
    },
    "search": {
      "limit": 10,
      "remaining": 10,
      "reset": 1503214140
    },
    "graphql": {
      "limit": 0,
      "remaining": 0,
      "reset": 1503217680
    }
  },
  "rate": {
    "limit": 60,
    "remaining": 0,
    "reset": 1503216965
  }
}
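The response above can also be read programmatically, which makes it easy to see when the quota comes back. A minimal sketch, assuming the JSON body has already been fetched (for example with the curl command above); the function name summarize_rate_limit is hypothetical and not part of this project's scripts:

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the /rate_limit response shown above.
RESPONSE = """
{
  "resources": {
    "core": {"limit": 60, "remaining": 0, "reset": 1503216965}
  },
  "rate": {"limit": 60, "remaining": 0, "reset": 1503216965}
}
"""


def summarize_rate_limit(body):
    """Return (remaining, reset_time_utc) for the core API quota."""
    core = json.loads(body)["resources"]["core"]
    # "reset" is a Unix timestamp in seconds, so convert it as UTC.
    reset_at = datetime.fromtimestamp(core["reset"], tz=timezone.utc)
    return core["remaining"], reset_at


remaining, reset_at = summarize_rate_limit(RESPONSE)
print(f"{remaining} core requests left; quota resets at {reset_at.isoformat()}")
```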
ok, the github rate limit reset overnight. doing a new pull with a new and improved validate.coffee
that lets us specify a startIndex
and stops when we have reached our github rate limit
coffee validate-users.coffee '' 5004
and again
coffee validate-users.coffee '' 10004
and once more
coffee validate-users.coffee '' 15008
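The start-index-and-stop behaviour behind these repeated runs can be sketched roughly as follows. This is not the actual validate-users.coffee logic, just an illustration in Python; validate_users, check_user and make_fake_checker are hypothetical names, and the fake checker stands in for the real github API call:

```python
def validate_users(users, start_index, check_user):
    """Check users from start_index on; stop once the rate limit is used up.

    check_user is a hypothetical callable returning (is_valid, remaining),
    where remaining mirrors the x-ratelimit-remaining response header.
    Returns (results, next_index); pass next_index as the next start_index.
    """
    results = []
    for index in range(start_index, len(users)):
        is_valid, remaining = check_user(users[index])
        results.append((users[index], is_valid, remaining))
        if remaining <= 0:
            # Quota exhausted: resume from the following user next time.
            return results, index + 1
    return results, len(users)


def make_fake_checker(quota):
    """Stand-in for the real API call: allows `quota` requests, then runs dry."""
    state = {"remaining": quota}

    def check_user(user):
        state["remaining"] -= 1
        return True, state["remaining"]

    return check_user


users = ["a", "b", "c", "d", "e"]
first_results, next_index = validate_users(users, 0, make_fake_checker(3))
# After the limit resets, pick up where the previous run stopped.
second_results, final_index = validate_users(users, next_index, make_fake_checker(3))
print(next_index, final_index)
```

Returning the next index from each run is what makes the restart a one-argument change, much like passing 5004, 10004 and 15008 above.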
using these regexes to find spaces
\s(?=\d,)
\s(?=\d\d,)
\s(?=\d\d\d,)
\s(?=\d\d\d\d,)
so we can replace them with commas
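The four patterns above differ only in how many digits follow the space, so a single quantified pattern should cover them. A small sketch of the same replacement in Python, assuming a row shaped like the hypothetical example below (the real column layout of the partial results file may differ):

```python
import re

# One pattern for all four cases: whitespace followed by 1-4 digits and a comma.
SPACE_BEFORE_COUNT = re.compile(r"\s(?=\d{1,4},)")


def spaces_to_commas(line):
    """Replace each matching space with a comma, leaving the digits untouched."""
    return SPACE_BEFORE_COUNT.sub(",", line)


# Hypothetical row: a username, a space, then comma-separated numbers.
print(spaces_to_commas("felixsch 12,59"))
```

The lookahead keeps the digits and the comma out of the match, so only the space itself is replaced.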
using this and another find and replace operation in sublime text to convert validate-coffee-partial-results.csv to gist-counts-by-user.csv
this is a hack that I'll replace with a proper script soon 😅
I keep x-ratelimit-remaining around as a column since it might be interesting to look at later, even though it's an artifact of our user-validation process calling the github API and not directly related to the source data.
ok, progress! we have identified 7316 users
now we'll run
sh index-new-users.sh
to generate updated metadata that contains gists from these newly discovered users
ok, that didn't work, because it depends on new-usables.csv, and we haven't properly updated that yet.
the next thing to do is: take the current usables.csv and the last version of usables.csv from the previous git commit to generate new-usables.csv
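One way to derive new-usables.csv from the two versions is to keep only the rows that appear in the current file but not in the previous one. This is a sketch under that assumption, not the project's actual script; new_rows is a hypothetical helper and the sample columns are made up:

```python
import csv
import io


def new_rows(previous_csv, current_csv):
    """Return the header plus rows of current_csv that are absent from previous_csv."""
    previous = list(csv.reader(io.StringIO(previous_csv)))
    current = list(csv.reader(io.StringIO(current_csv)))
    # Compare whole rows, skipping the header line of the previous file.
    seen = {tuple(row) for row in previous[1:]}
    header, rows = current[0], current[1:]
    return [header] + [row for row in rows if tuple(row) not in seen]


# Hypothetical contents; the real usables.csv columns are not shown here.
previous = "username\nalice\nbob\n"
current = "username\nalice\nbob\ncarol\n"
print(new_rows(previous, current))
```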
done
skipped 0 missing files
wrote 10446 API blocks
wrote 11523 Color blocks
wrote 96749 Files blocks
wrote 24900 total blocks
➜ blockbuilder-search-index git:(index-new-users) ✗
ok, so before we knew about 24122 blocks. now, after parsing through all the repo names on github for d3, we know about 24900 blocks. It looks like this research project netted us 778 new blocks
doing multiple things at once, will move into smaller PRs 😅