Open jnioche opened 6 years ago
This one is really interesting. Will it be up to the user to correctly create the status core with the same number of shards as the parallelism: ***
used in the crawler.flux?
Yes. We should set it to a reasonable default value (10?), but after that it is up to the user to manage it.
@jnioche can we merge branch 851 into main at this point? This way we can update / add tests along with the new functionality.
Hasn't that been done in https://github.com/apache/incubator-stormcrawler/pull/1240? Is there more in that branch that hasn't been merged? If so, would you mind creating a PR from that branch? Thanks!
The changes from #1240 were merged into branch apache:851 but not into main. Should I open an additional PR from apache:851 to main?
Hi @mvolikas - sorry about the delayed response, I have just returned from holidays.
Should I open an additional PR from apache:851 to main?
yes please, that would be great
Just as it's done in ES, we could route the documents in the statusupdaterbolt based on the host / name or IP. In the spouts, we would then check that the number of instances is equal to the number of shards and filter the queries per shard accordingly.
At the moment, we can have only one instance of a spout.
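To make the proposal above more concrete, here is a minimal Java sketch of the idea, not the actual StormCrawler, Solr or ES code. The class `ShardRouting` and its methods are hypothetical names, and the CRC32-modulo hash is only an assumption chosen for illustration; the real backends use their own routing functions. It shows the three pieces discussed: mapping a routing key (host name or IP) to a shard, checking that the number of spout instances matches the number of shards, and restricting one spout instance to the keys of its own shard.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical helper, for illustration only.
public class ShardRouting {

    /** Maps a routing key (e.g. a host name or IP) to a shard in [0, numShards). */
    public static int shardFor(String routingKey, int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        // Assumption: CRC32 modulo the shard count; the real backend may hash differently.
        CRC32 crc = new CRC32();
        crc.update(routingKey.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % numShards);
    }

    /** Fails fast if the spout parallelism does not match the shard count. */
    public static void checkSpoutParallelism(int spoutInstances, int numShards) {
        if (spoutInstances != numShards) {
            throw new IllegalStateException(
                    "Expected " + numShards + " spout instances but found " + spoutInstances);
        }
    }

    /** Returns only the keys that a given spout instance (one per shard) should handle. */
    public static List<String> keysForShard(List<String> keys, int shard, int numShards) {
        List<String> selected = new ArrayList<>();
        for (String key : keys) {
            if (shardFor(key, numShards) == shard) {
                selected.add(key);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        int numShards = 10; // the default value suggested earlier in this thread
        checkSpoutParallelism(10, numShards);
        List<String> hosts = Arrays.asList("example.com", "apache.org", "192.0.2.1");
        for (String host : hosts) {
            System.out.println(host + " -> shard " + shardFor(host, numShards));
        }
    }
}
```

Because the same key always maps to the same shard, all URLs of a host end up together, and each spout instance only needs to query its own shard.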