adam-singer closed this issue 10 years ago
Useful for finding all failed packages.
A way to list all package build info files:
gsutil ls gs://www.dartdocs.org/**/package_build_info.json
After ingesting all the package build files, a simple query can be run in BigQuery:
SELECT name, version, isBuilt FROM [test_dummy_data_set.my_table]
WHERE isBuilt = false LIMIT 1000
A simple way to script collecting all the JSON files into a single blob for BigQuery. It would have been nice if this could be scripted directly into BigQuery.
f=$(gsutil ls gs://www.dartdocs.org/**/package_build_info.json)
for e in $f; do echo $(gsutil cat $e)>> /tmp/all.json; done
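The loop above relies on unquoted `echo $(...)` to squash each file onto one line, which is what newline-delimited JSON needs. A slightly more explicit sketch of that flattening step follows; the helper name `json_to_line` is made up for illustration. Valid JSON cannot contain raw newlines inside string values, so replacing newlines with spaces should be safe for well-formed input.

```shell
#!/usr/bin/env bash
# Collapse one (possibly pretty-printed) JSON document read from stdin into a
# single line, so each record occupies exactly one line of the output file.
# json_to_line is a hypothetical helper name, not part of gsutil or bq.
json_to_line() {
  # Drop carriage returns, turn newlines into spaces, then end the record.
  tr -d '\r' | tr '\n' ' '
  printf '\n'
}

# Example: a pretty-printed record becomes a single line.
printf '{\n  "name": "unittest",\n  "isBuilt": false\n}\n' | json_to_line
```

In the loop this would be used as `gsutil cat "$e" | json_to_line >> /tmp/all.json`.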
Then import all.json directly into BigQuery. The BigQuery schema would be name:STRING,version:STRING,isBuilt:BOOLEAN,datetime:TIMESTAMP
Command-line example of loading the data into BigQuery:
bq load --source_format=NEWLINE_DELIMITED_JSON test_dummy_data_set.my_table /tmp/all.json name:STRING,version:STRING,isBuilt:BOOLEAN,datetime:TIMESTAMP
We can also use BigQuery to count the failed builds:
bq query "SELECT COUNT(isBuilt) AS failedCount FROM [test_dummy_data_set.my_table] WHERE isBuilt = false"
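As a rough local cross-check of that count, the same number can be estimated from /tmp/all.json before loading it. This is only a sketch: the function name `count_failed` is hypothetical, and it assumes one record per line with `isBuilt` serialized as a bare `true`/`false`.

```shell
#!/usr/bin/env bash
# Count records with isBuilt == false in a newline-delimited JSON file.
# Assumption: one record per line, isBuilt written as a bare true/false.
count_failed() {
  # grep -c prints the number of matching lines; it exits non-zero when the
  # count is 0, so "|| true" keeps the function usable under "set -e".
  grep -c -E '"isBuilt"[[:space:]]*:[[:space:]]*false' "$1" || true
}

# Example with two made-up records, one failed and one built.
tmp=$(mktemp)
printf '%s\n' \
  '{"name":"a","version":"1.0.0","isBuilt":false}' \
  '{"name":"b","version":"0.2.0","isBuilt":true}' > "$tmp"
count_failed "$tmp"   # prints 1
rm -f "$tmp"
```

If the local count and the `bq query` result disagree, some records were probably rejected or mangled during the load.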
bq rm -f dart-carte-du-jour:test_dummy_data_set.my_table
Still using this issue as a scratch pad for notes. Ignore these comments.
cat fetchData.sh
rm -rf /tmp/all.json
bq rm -f dart-carte-du-jour:test_dummy_data_set.my_table
F=$(gsutil ls gs://www.dartdocs.org/**/package_build_info.json)
for e in $F; do echo $(gsutil cat $e)>> /tmp/all.json; done
bq mk dart-carte-du-jour:test_dummy_data_set.my_table
bq load --source_format=NEWLINE_DELIMITED_JSON dart-carte-du-jour:test_dummy_data_set.my_table /tmp/all.json name:STRING,version:STRING,isBuilt:BOOLEAN,datetime:TIMESTAMP
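The fetch loop in fetchData.sh could be made a little more defensive. The sketch below wraps it in a function (the name `fetch_build_info` is made up) that reads object URLs line by line and skips any object that fails to download. It assumes, as in the commands above, that `gsutil ls` prints one object URL per line and that `gsutil cat` writes the object to stdout.

```shell
#!/usr/bin/env bash
# Fetch every object matching a pattern and append it, flattened to a single
# line, to an output file suitable for a NEWLINE_DELIMITED_JSON load.
# fetch_build_info is a hypothetical helper, not part of gsutil or bq.
fetch_build_info() {
  local pattern=$1 out=$2 url body
  : > "$out"   # start from an empty output file
  gsutil ls "$pattern" | while IFS= read -r url; do
    if body=$(gsutil cat "$url"); then
      # Replace newlines with spaces so the record stays on one line.
      printf '%s\n' "$body" | tr '\n' ' ' >> "$out"
      printf '\n' >> "$out"
    else
      echo "skipping $url" >&2
    fi
  done
}
```

It would be called as `fetch_build_info 'gs://www.dartdocs.org/**/package_build_info.json' /tmp/all.json`, followed by the same `bq load` command as above.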
Superseded by Datastore.
https://developers.google.com/bigquery/loading-data-into-bigquery