Hey @andreeleuterio - can you post a few more pieces of information here to help replicate? What I need is:
- `deploy.sh`, interpolated with the actual environment variables you used
- A link to one of your log files, for example: https://console.cloud.google.com/storage/browser/_details/<BUCKET_NAME>

@shagamemnon thanks for the quick reply. Here's the information:
These variables don't contain sensitive information, so I'm good to post them all here:
```bash
SCHEMA="schema-http.json"

# The name of the subdirectory in your bucket used for Cloudflare Logpush logs,
# for example, "logs/". If there is no subdirectory, use "/"
DIRECTORY="/"

BUCKET_NAME="cloudflare-sourcegraph-dot-com-logs"
DATASET="cloudflare_logstream"
TABLE="cloudflare_logs"
REGION="us-central1"

# You probably don't need to change these values:
FN_NAME="cf-logs-to-bigquery"
TOPIC_NAME="every_minute"
```
which leads to the following commands:
```bash
gcloud pubsub topics create every_minute

gcloud scheduler jobs create pubsub cf_logs_cron \
  --schedule="* * * * *" \
  --topic=every_minute \
  --message-body="60 seconds passed"

gcloud functions deploy cf-logs-to-bigquery \
  --runtime nodejs12 \
  --trigger-topic every_minute \
  --region=us-central1 \
  --memory=1024MB \
  --entry-point=runLoadJob \
  --set-env-vars DATASET=cloudflare_logstream,TABLE=cloudflare_logs,SCHEMA=schema-http.json,BUCKET_NAME=cloudflare-sourcegraph-dot-com-logs,DIRECTORY=/
```
https://storage.cloud.google.com/cloudflare-sourcegraph-dot-com-logs/20211118/20211118T000002Z_20211118T000032Z_1ba9cfac.log.gz
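
For context on the comments below: the function discovers new logs by listing the bucket with a time-based name prefix via bucket.getFiles, so a file like the one linked above is only picked up if the computed prefix matches the start of its name. A minimal sketch of that lookup, assuming @google-cloud/storage; the exact prefix shape is an assumption based on the file name above:

```js
// Sketch: prefix-based listing. Only objects whose names start with
// `prefix` are returned, so a prefix built from a wrong timestamp
// silently matches nothing and the function reports "No new logs".
const { Storage } = require('@google-cloud/storage');

async function findLogs(prefix) {
  const storage = new Storage();
  const [files] = await storage
    .bucket(process.env.BUCKET_NAME) // e.g. cloudflare-sourcegraph-dot-com-logs
    .getFiles({ prefix });
  return files.map((f) => f.name);
}

// A prefix like '20211118/20211118T0000' would match the file linked above.
```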
@andreeleuterio @shagamemnon
I'm thinking it's a timestamp issue. The Node function is outputting a prefix datetime that is 12 hours behind in my usage, so nothing in bucket.getFiles will appear for 12 hours.

I've solved this problem by swapping loadJobDeadline.toFormat(`yyyyMMdd'T'hhmm`) for loadJobDeadline.toFormat(`yyyyMMdd'T'HHmm`).
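
For anyone reading along: the toFormat call suggests the function uses Luxon, where hh is the 12-hour clock and HH the 24-hour clock, so any afternoon run builds a prefix that can't match the 24-hour timestamps in the Logpush file names. A minimal sketch of the difference (assuming Luxon):

```js
const { DateTime } = require('luxon');

// 13:05 UTC on the day of the sample log file linked above:
const dt = DateTime.fromISO('2021-11-18T13:05:00Z', { zone: 'utc' });

console.log(dt.toFormat("yyyyMMdd'T'hhmm")); // "20211118T0105" -- 12-hour clock
console.log(dt.toFormat("yyyyMMdd'T'HHmm")); // "20211118T1305" -- 24-hour clock
```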
@goaaron good catch. Thanks for pointing this out -- this is a change we'll need to make in master :)
Hi folks, I tried to deploy the dashboard following these instructions. All executions of the cloud function run successfully but give a "No new logs" output. I don't see the configured dataset in BigQuery, leading me to believe it wasn't created because the function didn't find new logs.

I can see logs coming in to the bucket. I configured the `DIRECTORY` env var to `/` because the date folders are in the root of the bucket. I also see the Pub/Sub topic and cron jobs. Any ideas on how to further debug this?

cc @mohammadualam
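
If anyone else hits the "No new logs" output, one way to narrow it down is to compare the prefix the function computes against what the bucket actually contains. A hypothetical debugging snippet (not part of the repo; the prefix shape is the same assumption as above):

```js
const { Storage } = require('@google-cloud/storage');
const { DateTime } = require('luxon');

async function debugNoNewLogs() {
  // Rebuild the prefix the way the function presumably does (with the HH fix).
  const now = DateTime.utc();
  const prefix = `${now.toFormat('yyyyMMdd')}/${now.toFormat("yyyyMMdd'T'HH")}`;
  console.log('computed prefix:', prefix);

  const [files] = await new Storage()
    .bucket(process.env.BUCKET_NAME)
    .getFiles({ prefix });
  console.log('matching objects:', files.map((f) => f.name));
}

debugNoNewLogs().catch(console.error);
```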