k2view-academy / K2View-Academy


Ability for Fabric batches to mimic ADI Event Driven Execution... remain indefinitely in "GENERATE_IID" stage until triggered to complete by external event #1077

Closed. JohnH-K2View closed this issue 2 weeks ago.

JohnH-K2View commented 3 weeks ago

We have a use case where we want to run a Broadway flow on instance IDs from a Kafka topic. The publishing to the topic will occur over a long period of time (over 1 hour), and we'd like to start processing as soon as the initial instances are in the topic.

In ADI we had event-driven execution, where workers would remain available to process instances as they became available, until the execution was ended.

In Fabric, similar functionality exists while the batch is in the GENERATE_IID phase, if the flag to begin processing immediately is set (GENERATE_ENTITIES_FIRST=false).

But in Fabric we can't use a Kafka topic as a source DB (right?), and when starting a batch with an SQL query we have no way for the query to remain open while some other thread inserts into the batchprocess_entities_info table.

Is it possible to keep a batch in the GENERATE_IID phase and then trigger its completion? Or is there another way to have a batch run indefinitely while new instances are provided?
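For illustration, here is roughly how we start such a batch today. This is a sketch only: it assumes the fabric() command handle available in Fabric user code, the LU, interface, and query names are made up, and the exact BATCH syntax and flag spellings should be checked against the docs for your Fabric version:

```java
// Sketch only: meant to live inside a Fabric user-code class where fabric()
// is inherited. BILLING_LU, BILL_DB, and the staging query are hypothetical
// names; the exact BATCH syntax and flags may differ by Fabric version.
public static void startBillCycleBatch() throws Exception {
    fabric().execute(
        "batch BILLING_LU from BILL_DB using ('select iid from bill_cycle_iids') " +
        "fabric_command='sync_instance BILLING_LU.?' " +
        "with GENERATE_ENTITIES_FIRST=false, MAX_WORKERS_PER_NODE=8");
}
```

With GENERATE_ENTITIES_FIRST=false, workers begin processing while the IID query is still producing rows, which is exactly the behavior we'd like to extend past the lifetime of a single SQL query.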

The current approach is to have a Broadway flow subscribe to the topic and start the desired fabric_command as an inner flow with async (setting the number of threads similar to MAX_WORKERS_PER_NODE). The downside is that the single node that consumed the entities from Kafka is the only one processing them; resources on other nodes may be available, but the load can't be redistributed the way a batch can. Also, if a Fabric node crashes during an async inner flow, there's no record of status.
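In plain Kafka terms, the workaround looks roughly like the sketch below (the topic name, group id, and the dispatchAsync() stand-in are illustrative, not Fabric or Broadway APIs):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Hypothetical sketch of the current workaround: a consumer pulls IIDs from
// the topic and launches the processing flow per IID. The topic name, group
// id, and dispatchAsync() are illustrative stand-ins, not Fabric APIs.
public class BillCycleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");
        props.put("group.id", "bill-cycle-workers");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("bill_cycle_iids"));
            while (true) {
                for (ConsumerRecord<String, String> rec :
                        consumer.poll(Duration.ofSeconds(1))) {
                    dispatchAsync(rec.value()); // stand-in for the async inner flow
                }
            }
        }
    }

    // Stand-in for starting the Broadway flow / fabric_command for one IID.
    private static void dispatchAsync(String iid) {
        System.out.println("would process IID " + iid);
    }
}
```

Whichever node runs the consumer ends up doing the processing for the records it pulls, which is the load-distribution limitation described above.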

natalylic commented 2 weeks ago

It is correct that only one node consumes the Kafka topic; however, the BATCH command distributes the work to the other nodes of the defined AFFINITY (which is set on the BATCH command). If a node crashes, another node is expected to pick up the BATCH and continue it.
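For illustration only: if the affinity is set on the BATCH command as described, the batch sketch from the earlier comment would simply declare it. The AFFINITY clause spelling below is a guess and should be verified against the BATCH command documentation:

```java
// Hypothetical spelling only: declaring an affinity on the BATCH command so
// that work is distributed across, and can be resumed by, nodes of that group.
public static void startBatchWithAffinity() throws Exception {
    fabric().execute(
        "batch BILLING_LU from BILL_DB using ('select iid from bill_cycle_iids') " +
        "fabric_command='sync_instance BILLING_LU.?' " +
        "with AFFINITY='DC1', GENERATE_ENTITIES_FIRST=false");
}
```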

natalylic commented 2 weeks ago

Please provide more details about your current implementation, so that we can fully understand your request and answer it.

JohnH-K2View commented 2 weeks ago

The use case is the desire to run a batch (daily bill-cycle processing) based on a stream interface (Kafka). Another application (bill calc) will generate the list of IIDs and insert them into Kafka, but this will occur over a long period of time (multiple hours). We want to track all of these IIDs together as one batch (the bill-cycle run for that date), but we don't want to wait for the full list of IIDs to be published to Kafka and then inserted into a staging DB before starting processing, as that would delay the finish time.

We cannot start processing an entity before the upstream application completes its processing and publishes the IID to Kafka, so we need to react to this publish event even if we already know the eventual full list of IIDs.

We can subscribe to the topic in a Broadway flow (or rather a set of flows, one per node in the cluster, since the volume will require multiple threads to finish within the desired timeframe). But as we receive IIDs in Broadway, we have no option to add them to an already running batch. We can start the desired Broadway flow (e.g. via inner flow async), but then we lose all of the batch's tracking and have to manually coordinate features the batch handles for us: max_workers_per_node, balancing load across the cluster based on available capacity, reporting (batch monitor web page) on which IIDs in the batch have succeeded or failed and what is currently in progress, and so on. We also risk that if a node goes down, its in-process IIDs can't easily be picked up by another node in the cluster the way they can with a batch.

Alternatively, we could insert the IIDs into the batchprocess_entities_info table and have a batch pick up the new IIDs as they arrive; but then the batch can't finish simply when it has reached the total number of entities, since more might still be published, so we'd need a manual way to notify the batch of the final total.

So, we'd like to use a batch, but we want to start that batch and continue to add IIDs to it for a prolonged period of time. We don't want to wait until we have an SQL statement that returns the full set of IIDs, because then we can't start processing the first IIDs until we've received the full list.

I'm not aware of any way to keep an SQL query result open for a long period and have it continue to return results as they arrive in an interface. But that could be an alternative if there is an interface type with that feature.

Can we create the entry in the batchprocess_list table ourselves, then perform inserts into batchprocess_entities_info, and finally update our entry in batchprocess_list once the input stream is complete (essentially mimicking the batch coordinator thread)?
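Concretely, something like the sketch below is what I have in mind. The two table names come from Fabric's batch machinery as mentioned above; the column names, status values, and the Db handle are illustrative guesses, not the documented schema:

```java
import java.util.UUID;

// Hypothetical sketch of "mimicking the batch coordinator" from outside.
// batchprocess_list and batchprocess_entities_info are the table names from
// the discussion above; the columns, status values, and Db handle are guesses.
public class ExternalBatchFeeder {

    /** Minimal stand-in for whatever handle reaches the batch system tables. */
    public interface Db {
        void execute(String statement, Object... params);
    }

    private final Db systemDb;
    private final String batchId = UUID.randomUUID().toString();

    public ExternalBatchFeeder(Db systemDb) {
        this.systemDb = systemDb;
    }

    /** 1. Register the batch so worker nodes can start picking up entities. */
    public void open() {
        systemDb.execute(
            "insert into batchprocess_list (batch_id, status) values (?, ?)",
            batchId, "RUNNING");
    }

    /** 2. As IIDs arrive from Kafka (possibly over hours), append them. */
    public void addIid(String iid) {
        systemDb.execute(
            "insert into batchprocess_entities_info (batch_id, iid, status) values (?, ?, ?)",
            batchId, iid, "PENDING");
    }

    /** 3. Once the upstream publisher signals completion, close the batch. */
    public void markComplete() {
        systemDb.execute(
            "update batchprocess_list set status = ? where batch_id = ?",
            "DONE", batchId);
    }
}
```

The open/addIid/markComplete split mirrors what the coordinator presumably does: register the batch, feed it entities over time, and only then allow it to be considered finished.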

k2viewLTD commented 2 weeks ago

John, I think the requirements are clear and make sense.

R&D team, please involve me in the discussion of the solution. I can share how we did this with ADI and help brainstorm the best approach, both for performance and for tolerance/resiliency to issues while the batch is running.


natalylic commented 2 weeks ago

As we discussed, we will continue handling this request as a CR, following the ticket you opened in Freshdesk.