Open mariusrugan opened 9 years ago
Hi Marius,
We will review and provide feedback to you.
Sharad
Hi Marius,
Looks OK! A few suggestions - why not make RandomTrafficGenerator an InboundChannel instead of a Processor. A processor expects to receive events, processes and and sends event to its sinks. The traffic generator is a mock channel impl.
Can you please complete the pipeline flow control impl by implementing resume() method by setting m_runFlag to true. This will make the impl complete.
Can you please make sure that you have built the the docker images and tested the pipeline locally. It will take some more time for others to also review and provide comments. If we all agree to bring it in you will have to submit a pull request. Have you signed the contributor agreement?
Hi Sharad, thanks for replying, here are my comments:
Having motivated my http affinity :) I would happily participate in a parallel implementation in the demo module for traffic as 'InboundChannel' as you suggest. I must admit i didn't dig into 'replay' module a lot, but it seems what you referring to is PoC there. Please correct me if i'm wrong.
Thanks M.
I completely agree with where you are going with http. We use it in our own pipeline in production and it can ingest large volumes with our implementation. We require the connection to be persistent and we recommend batching with compression for high throughput. Another option is to create a websocket channel which we have discussed internally.
As I reflect more about your impl pipeline flow control might not make sense as you are writing directly to a HTTP connection. However If your RandomTrafficGenerator focused mostly with the generation of traffic and not couple yourself with HttpClient then one option you might consider is wiring your traffic generator to any outboundChannel. In that case flow control makes sense.
BTW we already have outboundRESTChannel which can be provisioned to send the event to a set of HTTP endpoints. We have not documented this in our Jetstream wiki so its not visible to you. Check out "com.ebay.jetstream.event.channel.messaging.http.outbound.OutboundRESTChannel".
Its wiring looks as follows:
The contributor agreement can be found here - https://github.com/pulsarIO/realtime-analytics/wiki/How-to-contribute
Sorry, forgot to specify wiring as XML - here it is.
<bean id="outboundRESTChanneBinder" class="com.ebay.jetstream.event.support.channel.ChannelBinding">
<property name="channel" ref="outboundRESTChannel" />
</bean>
<bean id="outboundRESTChannel" class="com.ebay.jetstream.event.channel.messaging.http.outbound.OutboundRESTChannel">
<property name="address" ref="outboundRESTChannelAddress" />
<property name="config" ref="HttpClientConfig" />
</bean>
<bean id="outboundRESTChannelAddress" class="com.ebay.jetstream.event.channel.messaging.http.outbound.RESTChannelAddress">
<property name="urlList">
<list>
<value>http://127.0.0.1:15999/RTBDEventConsumerServlet</value> </list>
</property>
</bean>
<bean id = "HttpClientConfig" class="com.ebay.jetstream.http.netty.client.HttpClientConfig">
</bean>
The idea is good. Move out traffic generator will make collector more clean. Previously the simulator use commons-httpclient to send the http traffic, it is ok for the simulator. If change to use the Outbound Rest Channel, it have to deserialize the String and convert it to JetstreamEvent. I think it is ok to keep using commons-httpclient.
One reason why we put the simulator inside the collector is it can use localhost as the server ip and easy to control for demo purpose. Move it out should make the URL can be configurable and also can be override via system property.
And once it move out, we need create a new docker for this. Will trafficsimulator a better module name?
I saw you add a new dependency:
What did it use for?
Hi Xinglang,
thanks for replying, your suggestions are valid, will add (runtime) config capability via properties for receiving endpoint.
as i commented above docker config will be cleaned and tested also.
related to the package name i suggest then: *.processor.httpTrafficSimulator
lombok: i took out boilerplate code , eg getters, setters in the class. i also used @scheduled from spring btw. (with dependency via jetstream dependencies)
On Tuesday, March 17, 2015, Xinglang Wang notifications@github.com wrote:
The idea is good. Move out traffic generator will make collector more clean. Previously the simulator use commons-httpclient to send the http traffic, it is ok for the simulator. If change to use the Outbound Rest Channel, it have to deserialize the String and convert it to JetstreamEvent. I think it is ok to keep using commons-httpclient.
One reason why we put the simulator inside the collector is it can use localhost as the server ip and easy to control for demo purpose. Move it out should make the URL can be configurable and also can be override via system property.
And once it move out, we need create a new docker for this. Will trafficsimulator a better module name?
I saw you add a new dependency:
org.projectlombok lombok ${lombok.version} compile
What did it use for?
— Reply to this email directly or view it on GitHub https://github.com/pulsarIO/realtime-analytics/issues/7#issuecomment-82091360 .
Hi,
i've started cleaning up the collector & moved traffic generator to a (sub) demo/ project. I also improved a bit traffic generator to generate single events rather than by default batch.
would be helpful if someone takes a look and gives some feedback (and builds also :) ) for: https://github.com/MariusRugan/realtime-analytics/tree/develop
work is not 100% done yet & i'd like to understand if this would be helpful to the master project?
thanks