FusionAuth / fusionauth-issues

FusionAuth issue submission project
https://fusionauth.io

[Category] Webhooks #1543

Open robotdan opened 2 years ago

robotdan commented 2 years ago

General Webhook enhancements

Problem

As we look toward enabling FusionAuth as a SCIM client, we want to enhance webhooks with some additional capabilities that will make it easier to support SCIM.

Solution

This is a general issue to enhance webhooks. Some specific features we'd like to deliver:

  1. Persist a historical log of all webhook events, with options to view and resend them, and maybe a delete option to prune events?
  2. Allow a retry from a webhook event that was not successfully received
    • Resend / Retry a single event to one or more webhooks
    • (SCIM) Optionally send all events beginning at a specific sequence to reset or catch up from a point in time to "now".
  3. Additional configuration for retry logic: backoff logic, then finally a move to a failed state to allow for a future manual retry. Or optionally retry forever.
  4. A webhook event should record which webhooks it was sent to, the status code each webhook returned, etc., so that the event viewer / webhook event log shows which ones succeeded and which ones failed.
  5. Possibly additional configuration for queuing, and ordering.
  6. Remove application join table, this is confusing and is not useful.
  7. ~Add a tenant join table, allow webhooks to be enabled for one-to-many tenants, or all~
  8. Error code mapping. For example, define what counts as success for a particular webhook; currently it must be 200.
    • Optionally map retry logic to particular status codes.
  9. Upsert options. For example, if an Update event is sent and the webhook returns 404, we could convert this to a Create event. This is sort of a SCIM advanced feature.
  10. Allow the webhook to return a message that can be displayed to an end user or an API response.
  11. Better certificate management using Key Master
  12. Disable a webhook to temporarily stop events without deleting it or modifying any existing event configuration.
    • When "disabled" or "paused", a config option to still consider the webhook for events: instead of sending, queue the events as pending and wait for the webhook to come back online or for "paused" to be moved to "play".
  13. Health checks for webhooks, health checks for the number of failed or pending events, queued, etc.
    • DropWizard health checks that show up in /api/status or /api/prometheus/metrics
  14. Add a config per webhook or per event for a max duration to wait for the webhook to respond.
  15. Possibly add a config to a webhook to indicate if this webhook should be considered in the TX configuration. For example, a single webhook is a "fire and forget" style (3rd party) and you don't want it to cause a TX failure, even though the other two Webhooks are "yours" and they are critical.
  16. (SCIM) Bulk Event sender options
  17. Review all current events to see if any more of them should be considered non-transactional. For example, should UserLoginSuccessEvent and UserLoginSuspiciousEvent be allowed to be configured as transactional? Or can we just fire and forget?
  18. Allow a lambda to be assigned to handle webhook events in addition to, or instead of, a webhook.
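
To make items 3, 4 and 8 more concrete, here is a minimal sketch of how configurable status code mapping, exponential backoff, and a final "failed" state could fit together. It is not FusionAuth code or an agreed design: `RetryPolicy`, `DeliveryState`, `deliver`, and all default values are hypothetical names chosen for illustration, and the `send` callable stands in for the actual HTTP request.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, FrozenSet, List


class DeliveryState(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"  # parked so that a manual retry is possible later


@dataclass(frozen=True)
class RetryPolicy:
    # Hypothetical per-webhook settings; none of these are real FusionAuth options.
    success_codes: FrozenSet[int] = frozenset({200})
    retry_codes: FrozenSet[int] = frozenset({429, 500, 502, 503, 504})
    max_attempts: int = 5
    base_delay_seconds: float = 1.0
    max_delay_seconds: float = 60.0

    def delay_before_retry(self, attempt: int) -> float:
        """Exponential backoff after the given 1-based attempt, capped at the maximum."""
        return min(self.base_delay_seconds * 2 ** (attempt - 1), self.max_delay_seconds)


@dataclass
class DeliveryResult:
    state: DeliveryState
    status_codes: List[int]  # one entry per attempt, for the event log (item 4)


def deliver(send: Callable[[], int], policy: RetryPolicy,
            sleep: Callable[[float], None]) -> DeliveryResult:
    """Call send() until it succeeds, hits a non-retryable code, or runs out of attempts."""
    codes: List[int] = []
    for attempt in range(1, policy.max_attempts + 1):
        status = send()
        codes.append(status)
        if status in policy.success_codes:
            return DeliveryResult(DeliveryState.SUCCEEDED, codes)
        if status not in policy.retry_codes:
            break  # not mapped as retryable, so fail immediately
        if attempt < policy.max_attempts:
            sleep(policy.delay_before_retry(attempt))
    return DeliveryResult(DeliveryState.FAILED, codes)
```

Passing `sleep` in as a parameter keeps the sketch testable; a real implementation would more likely schedule the retry on a queue than block a thread. "Try forever" would correspond to an unbounded `max_attempts`, which this sketch does not model.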

Related

Completed

Community guidelines

All issues filed in this repository must abide by the FusionAuth community guidelines.

How to vote

Please give us a thumbs up or thumbs down as a reaction to help us prioritize this feature. Feel free to comment if you have a particular need or comment on how this feature should work.

hughevans commented 2 years ago

> Add a tenant join table, allow webhooks to be enabled for one-to-many tenants

This would cover the use case we have, where our development instance has multiple tenants and each tenant needs to have unique webhook URLs. Another solution would be for webhooks to belong to tenants.

glen-84 commented 2 years ago

> Add a tenant join table, allow webhooks to be enabled for one-to-many tenants

It's really surprising to me that webhooks are not scoped to tenants (and/or applications). It means that creating a user (for example) will fire a hook across n tenants, resulting in:

  1. Exposure of user data across multiple tenants that should otherwise be isolated. (not an issue for us, but could affect others)
  2. Reduced performance from having to wait for all hooks to succeed (we can't just wait for one). This is really bad if some tenants are staging environments with far lower performance. This will affect us.
  3. Transaction failure if one of the hooks fails in an unrelated tenant. This means that a failing request sent to a staging URL would mean that a production user could not sign up (!). This will affect us.
  4. The requirement to filter by tenant or application in the webhook.
  5. A lot of confusion (as seen in #169).

Is this issue being prioritised yet? It seems quite serious. Creating another deployment just for a different environment can be costly.

(We're looking into the option of using a separate deployment, though this was an unexpected cost.)
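
For illustration only, the scoping problems listed above could be addressed roughly as in the sketch below: webhooks are selected per tenant, and only webhooks flagged as transactional can block the commit (item 15 in the original list). This is an assumption about one possible shape, not how FusionAuth works today; `Webhook`, `tenant_ids`, `transactional`, `webhooks_for_tenant`, and `transaction_may_commit` are all hypothetical names, and the URLs are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Webhook:
    url: str
    # Hypothetical fields: None means "enabled for all tenants".
    tenant_ids: Optional[FrozenSet[str]] = None
    transactional: bool = True


def webhooks_for_tenant(webhooks: List[Webhook], tenant_id: str) -> List[Webhook]:
    """Keep only the webhooks enabled for the tenant that produced the event."""
    return [w for w in webhooks if w.tenant_ids is None or tenant_id in w.tenant_ids]


def transaction_may_commit(targets: List[Webhook], succeeded: Dict[str, bool]) -> bool:
    """Only transactional webhooks can veto; a missing result counts as a failure."""
    return all(succeeded.get(w.url, False) for w in targets if w.transactional)


prod = Webhook("https://prod.example.com/hook", frozenset({"prod"}))
staging = Webhook("https://staging.example.com/hook", frozenset({"staging"}))
audit = Webhook("https://audit.example.com/hook", transactional=False)

# A user created in the "prod" tenant never reaches the staging URL.
targets = webhooks_for_tenant([prod, staging, audit], "prod")
```

With this shape, a failing staging endpoint cannot block a production sign-up, and a fire-and-forget third-party hook such as `audit` cannot fail the transaction either.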

robotdan commented 2 years ago

@robfusion can you open a separate GH issue, linked to this one, to track the tenant change we are making? That way we can track it to completion and leave this one open as the larger project task. The tenant work you are doing will essentially deliver items 6 and 7 from the above list.

glen-84 commented 1 year ago

https://github.com/FusionAuth/fusionauth-issues/issues/1660 is listed twice.