looker-open-source / actions

MIT License
58 stars 102 forks source link

Errors with google sheets and drive action #468

Open simonuid opened 2 years ago

simonuid commented 2 years ago

We have deployed the action hub on our own infrastructure, so that we would be able to limit exports to our own Google drive.

While it is working, a lot of users are reporting failed schedules. Some of these have a mysterious "403 forbidden" error. Others have "The action could not be executed" or "undefined method parameters for for nil:NilClass."

We've spoken to Looker support about this (ticket) but they have been unable to reproduce the issues.

In some instances removing custom fields/calculations seems to resolve the problem, and sometimes deleting and recreating the schedule makes it work again. We don't think there's a problem with the configuration for the application in the GCP console as we'd expect this to be more obvious and causing problems with all or a large amount of jobs. We're not sure how to even go about debugging this, the logging doesn't show anything useful.

Does anyone have any ideas about this or a suggestion about how to diagnose or debug this properly? It's really frustrating as we'd hoped to replace our own existing 'Looker to gSheets' tool with this but we can't decommission that until we're sure that the action hub alternative is working properly. We can't even be sure if the problems are solely to do with our action hub deployment or related to other issues we're had around load on our Looker instance, particularly acute on Monday mornings when we have the vast majority of our schedules running as well as interactive users starting their week with Looker.

Thanks.

simonuid commented 2 years ago

Just wanted to update this as Jonathan Reinhard and others at Looker managed to work out the cause and solution of the "403 Forbidden" errors.

It turns out that our deployment of the Action Hub was being served via Cloudflare, and its WAF was blocking certain requests due to the existence of string like concat(orders.id, orders.id) and causing a 403 there. This meant that Looker was relaying the error to users, but the Action Hub code never saw the requests, hence us not seeing any information for these requests in the Action Hub logs.

The solution was add the IPs from which our Looker makes requests to the Action Hub to the WAF allowlist.

We still seem to have some spurious errors in our Action Hub deployment but we have at least mitigated this particular problem - hopefully this will be useful to someone else!