Azure-Samples / streaming-at-scale

How to implement a streaming at scale solution in Azure
MIT License

Add deduplication in eventhubs-streamanalytics-azuresql #44

Open algattik opened 5 years ago

algattik commented 5 years ago

In the solution, generation of duplicate events in Locust is disabled for Stream Analytics (https://github.com/Azure-Samples/streaming-at-scale/commit/23bd9d16fd5da9b3adffdee02103fc77f0e5231d#diff-82da803b8f58c37426d86bd254ace7e4R222), because no mechanism exists to deduplicate them. Even with that change, if the ASA infrastructure fails and is restarted, duplicates could still cause the job to fail, since there is a primary key on EventId in Azure SQL.
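
To illustrate the failure mode, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for Azure SQL (an assumption made only so the example is runnable). The table name `events`, the `Value` column, and the helper functions are hypothetical and not taken from the sample. It shows a batch containing a repeated EventId being rejected by the primary key, and the same batch succeeding when the write is idempotent. `INSERT OR IGNORE` is SQLite syntax; Azure SQL would need its own equivalent, which is not shown here.

```python
import sqlite3


def make_table(conn):
    # Hypothetical schema: only the primary key on EventId comes from the issue.
    conn.execute("CREATE TABLE events (EventId TEXT PRIMARY KEY, Value INTEGER)")


def naive_insert(conn, batch):
    # Plain insert: a repeated EventId raises sqlite3.IntegrityError.
    conn.executemany("INSERT INTO events (EventId, Value) VALUES (?, ?)", batch)


def idempotent_insert(conn, batch):
    # SQLite-specific: rows whose EventId already exists are skipped silently.
    conn.executemany(
        "INSERT OR IGNORE INTO events (EventId, Value) VALUES (?, ?)", batch
    )


# A batch in which event "e1" is delivered twice, as could happen after a restart.
batch = [("e1", 10), ("e2", 20), ("e1", 10)]

conn = sqlite3.connect(":memory:")
make_table(conn)

try:
    naive_insert(conn, batch)
    naive_failed = False
except sqlite3.IntegrityError:
    naive_failed = True
    conn.rollback()  # discard the rows inserted before the duplicate was hit

idempotent_insert(conn, batch)
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

With the naive insert the whole batch fails on the duplicate, which mirrors the job failure described above; with the idempotent insert the duplicate is dropped and each EventId is stored once.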