google / gae-secure-scaffold-python3

Secure scaffold for Google App Engine static and dynamic Python websites
Apache License 2.0

Fix @securescaffold.cron_only #5

Open bgirschig opened 2 years ago

bgirschig commented 2 years ago

My cron job is currently failing because of @securescaffold.cron_only: Cloud Scheduler only sends the following headers (at least in my case):

X-Appengine-Default-Version-Hostname, X-Appengine-Request-Log-Id, X-Google-Internal-Skipadmincheck, X-Appengine-Api-Ticket, X-Appengine-User-Ip, X-Appengine-Https, X-Appengine-Timeout-Ms, Traceparent, X-Cloud-Trace-Context, X-Appengine-Country, X-Cloudscheduler-Jobname, X-Cloudscheduler, User-Agent, Forwarded, X-Forwarded-Proto, X-Forwarded-For, Host

I suggest we also allow requests with the "X-Cloudscheduler" header

davidwtbuxton commented 2 years ago

Interesting! I didn't know Cloud Scheduler was using new header names. Is it documented that these X-Cloudscheduler* headers can be trusted / cannot be spoofed?

bgirschig commented 2 years ago

I'm not quite sure where to find that documentation, but at least this doesn't work (returns a 403):

curl 'https://my-host/a-cron-job' --header 'X-Cloudscheduler: true'

davidwtbuxton commented 2 years ago

Hi Bastien, how is this scheduled task configured? Are you using gcloud app deploy cron.yaml, the API, or the web Cloud Console UI? What kind of target is defined? I wonder if this cron task is pointing to your app as a generic HTTP endpoint instead of an App Engine endpoint.

https://cloud.google.com/scheduler/docs/creating#creating_jobs
https://cloud.google.com/appengine/docs/standard/python3/scheduling-jobs-with-cron-yaml#validating_cron_requests
https://cloud.google.com/scheduler/docs/reference/rest/v1/projects.locations.jobs#AppEngineHttpTarget

davidwtbuxton commented 2 years ago

I did a test and it looks like an anonymous user can send a bogus X-Cloudscheduler header to an App Engine application, which means we cannot trust that header to indicate the request was definitely made by Cloud Scheduler.

E.g. making a request to https://gae-secure-scaffold-python3.appspot.com/headers with a bogus X-Cloudscheduler header shows the header in the output (it was not stripped by Google).

curl --header 'X-Cloudscheduler: true' https://gae-secure-scaffold-python3.appspot.com/headers
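
For anyone who prefers to repeat this check from Python, here is a rough equivalent of the curl command using only the standard library. It is a sketch, not part of the scaffold: the helper names (build_probe, header_was_echoed, probe) are made up for illustration, and it assumes the /headers endpoint returns the received request headers as text.

```python
import urllib.request


def build_probe(url, header="X-Cloudscheduler", value="true"):
    """Build a GET request that carries a spoofed header (hypothetical helper)."""
    return urllib.request.Request(url, headers={header: value})


def header_was_echoed(body, header="X-Cloudscheduler"):
    """Return True if the header name appears in the echoed response text.

    Assumes the endpoint prints the request headers it received.
    The comparison is case-insensitive because header names are.
    """
    return header.lower() in body.lower()


def probe(url, header="X-Cloudscheduler"):
    """Send the spoofed request and report whether the header survived."""
    request = build_probe(url, header=header)
    with urllib.request.urlopen(request) as response:
        body = response.read().decode("utf-8", errors="replace")
    return header_was_echoed(body, header)
```

Calling probe("https://gae-secure-scaffold-python3.appspot.com/headers") should return True if the header reaches the application unstripped, which is what the curl test above showed.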

I don't know if there's a different header that we can trust for Cloud Scheduler requests. Also I don't see a header with an identity token that can be used to verify the caller (I was testing with a cron task created using the scheduler UI in https://console.cloud.google.com/cloudscheduler, not using App Engine's older cron.yaml config).

@bgirschig would you check my test above and report your findings? Thanks.

bgirschig commented 2 years ago

Sorry for the late reply, I was on Christmas break (plus got COVID...)

To answer your first set of questions, I used the Cloud Scheduler UI to set up my job with the following settings:

For your test, you're right. The header is not stripped.

With these settings, I can see one header that seems safe (stripped by App Engine): X-Google-Internal-Skipadmincheck. But this would be a very hacky way of doing this.

I've added a PR for App Engine to strip the "X-Cloudscheduler" header from external requests. We'll see if that works...

davidwtbuxton commented 2 years ago

In the meantime, use cron.yaml to define a schedule, and update the app as described in https://cloud.google.com/appengine/docs/standard/python3/scheduling-jobs-with-cron-yaml#uploading_cron_jobs

That way the trusted X-Appengine-Queuename header (and others) will be present when the cron scheduler hits your app.
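
To make the distinction concrete, here is a minimal sketch of a check that accepts only a header App Engine sets itself and ignores X-Cloudscheduler. It is not the actual securescaffold.cron_only implementation. The function name is hypothetical, and it assumes the X-Appengine-Cron: true header described in the cron.yaml documentation linked earlier in this thread, which App Engine strips from external requests.

```python
from typing import Mapping


def is_trusted_cron_request(headers: Mapping[str, str]) -> bool:
    """Return True only when the request carries X-Appengine-Cron: true.

    Hypothetical helper, not the scaffold's own code. X-Cloudscheduler is
    deliberately ignored, because the test earlier in this thread showed an
    anonymous client can send it.
    """
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("x-appengine-cron") == "true"
```

With this check, a request that carries only the X-Cloudscheduler header is rejected, while a request scheduled through cron.yaml is accepted.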