I'm trying to solve the following problem with pyleus: I have a processor bolt that processes data (wow). Depending on options in a config file, this bolt should emit data on different streams. I have multiple processor bolts in my topology, each configured differently.
Essentially, the processor bolt matches several queries against input streams in different places of the topology. The queries are user-specified. Each match of a query should be emitted on its own stream (so that downstream components only get the matches they subscribed to).
Problem: The definition of output_fields is static and the same for all instances of the processor bolt. This would not be a problem if I could either specify the output_fields at runtime, once the processor has parsed its configuration, or put the output stream configuration into pyleus_topology.yaml. Neither is possible. I wonder if you have an idea how to tackle this problem.
Long story short: Is there a way to set the output fields of a component more flexibly? Preferably, I would like to set output fields in pyleus_topology.yaml on a per-component basis.
A workaround might be to define a number of dummy output_fields in the processor bolt and use these to communicate a varying number of query matches.
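To make the dummy-field idea concrete, here is a rough sketch of how I imagine the packing would work. It is plain Python and does not touch pyleus at all; `MAX_MATCHES`, `DUMMY_FIELDS`, `pack_matches` and `unpack_matches` are names I made up for illustration, and the upper bound of 4 is an arbitrary assumption.

```python
# Sketch only: pads a varying number of matches to a fixed tuple width.
# All names here are hypothetical, nothing is part of pyleus.

MAX_MATCHES = 4  # assumed upper bound on query matches per tuple

# Fixed field names that could be declared statically in the bolt.
DUMMY_FIELDS = ["match_count"] + ["match_%d" % i for i in range(MAX_MATCHES)]


def pack_matches(matches):
    """Pad a variable-length list of matches to the fixed tuple width."""
    if len(matches) > MAX_MATCHES:
        raise ValueError("more matches than declared dummy fields")
    padding = [None] * (MAX_MATCHES - len(matches))
    return [len(matches)] + list(matches) + padding


def unpack_matches(values):
    """Recover the real matches from a padded tuple on the receiving side."""
    count = values[0]
    return list(values[1:1 + count])
```

The obvious downside is that the bound has to be chosen up front and every tuple carries the padding.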
Another workaround: each downstream component gets all matches and has to filter for the interesting ones. Of course, this produces unnecessary communication…
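A minimal sketch of that filtering variant, again in plain Python without any pyleus calls: the processor tags each match with the id of the query that produced it, and the downstream side drops everything it did not subscribe to. `tag_match` and `MatchFilter` are hypothetical helpers, and the two-element tuple layout is just an assumption.

```python
# Sketch only: every match travels on one shared stream, tagged with its
# query id; downstream components filter locally. Names are hypothetical.


def tag_match(query_id, match):
    """Build the values for one emitted tuple: [query_id, match]."""
    return [query_id, match]


class MatchFilter(object):
    """Keeps only tuples whose query id the component subscribed to."""

    def __init__(self, subscribed):
        self.subscribed = frozenset(subscribed)

    def wanted(self, values):
        # values[0] is the query id written by tag_match()
        return values[0] in self.subscribed


# Usage: a component interested in queries "q1" and "q3" only.
incoming = [tag_match("q1", "foo"), tag_match("q2", "bar"), tag_match("q3", "baz")]
match_filter = MatchFilter(["q1", "q3"])
kept = [values for values in incoming if match_filter.wanted(values)]
```

With this, the output fields stay static (two fields), but every downstream component still receives every match.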
Right now I have my own config file and use a script to create the input file for pyleus.
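For context, the core of that script boils down to something like the following simplified sketch. The config layout (bolt name mapped to a list of query ids) and the `bolt.query` stream naming are assumptions made for this example; the step that writes the actual pyleus input file is left out on purpose.

```python
# Sketch only: derives one stream name per (bolt, query) pair from my own
# config. The config layout and naming scheme are assumptions, and this
# does not produce the pyleus topology file itself.


def build_stream_plan(user_config):
    """Map each processor bolt to {query_id: stream_name}."""
    plan = {}
    for bolt_name, query_ids in sorted(user_config.items()):
        plan[bolt_name] = dict(
            (query_id, "%s.%s" % (bolt_name, query_id)) for query_id in query_ids
        )
    return plan


# Usage: two differently configured processor bolts.
example_config = {
    "processor_a": ["q1", "q2"],
    "processor_b": ["q3"],
}
stream_plan = build_stream_plan(example_config)
```

What I am missing is a place in pyleus where such a per-bolt plan could be declared.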