Closed flyinprogrammer closed 3 years ago
@adriancole FYI, you might want to take a look at this since you have some context.
Let me have a look at the tests not running.
On Wed, 9 Dec 2020, 19:52 Buchi Reddy Busi Reddy, notifications@github.com wrote:
@adriancole https://github.com/adriancole FYI, you might want to take a look at this since you have some context.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/hypertrace/hypertrace-service/pull/51#issuecomment-741976985, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAXOYAWPYZJSLPWJKI4X2CLST7BQVANCNFSM4US5KV2A .
Let me have a look at the tests not running. … On Wed, 9 Dec 2020, 19:52 Buchi Reddy Busi Reddy, @.***> wrote: @adriancole https://github.com/adriancole FYI, you might want to take a look at this since you have some context. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#51 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAXOYAWPYZJSLPWJKI4X2CLST7BQVANCNFSM4US5KV2A .
CircleCI tests were failing due to failed authentication (unauthorized user PR) and now build and snyk-scan is failing due to https://github.com/hypertrace/hypertrace/issues/132 .
Leta merge this and add a check to avoid a regression.
On Thu, 10 Dec 2020, 13:07 Jayesh Bapu Ahire, notifications@github.com wrote:
@JBAhire approved this pull request.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/hypertrace/hypertrace-service/pull/51#pullrequestreview-549131622, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAXOYAR7FY5Q6KJBNZ3QQP3SUC2Z5ANCNFSM4US5KV2A .
There seems to be confusion here due to
For our docker-compose
setup,
a. we are using pinot's servicemanager
role (which runs all pinot's component in a single jvm). In this role as of now, it starts controller first than server and broker in parallel. Refer PR - https://github.com/apache/incubator-pinot/pull/5917/files - which is part of pinot 0.6.0 release
b. HypetraceService
is dependent on pinot
as ref https://github.com/hypertrace/hypertrace/blob/main/docker/docker-compose.yml#L27. However, instead of using the service_healthy
construct, we are using service_started
as pinot startup is heavy. On low configuration local envs, it may take more than 60secs which is a limitation (undocumented) for docker-compose
for dependent containers. So, if pinot
takes more than 60 secs
, it will not start HypetraceService
. So, we start them parallel, and checks for pinot
health internally in HypertraceService
.
c. controller health check doesn't tell that server
and broker
component has started successfully. So, we had requested
for a composite health check for all components running as part of servicemanager
here - https://github.com/apache/incubator-pinot/issues/5850,
https://github.com/apache/incubator-pinot/issues/5950
d. In servicemanager
role, as the controller starts first and server, broker components in parallel, we had noticed a few seconds delay in low config evns before they are successfully available. So, we have added checks for pinot-server
and pinot-broker
registration before checking for availability of trace/span in pinot` - https://github.com/hypertrace/hypertrace-service/blob/main/hypertrace-service/src/main/java/org/hypertrace/service/HypertraceUIServerTimerTask.java#L147
So, I would suggest adding aliases for all the three components - pinot-server, pinot-broker, and pinot-controller
in the docker-compose file to avoid naming confusion, and reverting the checks to pinot-server
for now till we have a single endpoint to check the health of all components (we may have to parse though).
Let me know if I am missing something.
cc: @flyinprogrammer @jcchavezs @laxmanchekka @buchi-busireddy
Good catch @kotharironak! I think this is why it is important to have a check on this. Would you revert it and fix the naming discrepancy?
On Mon, 14 Dec 2020, 06:38 kotharironak, notifications@github.com wrote:
There seems to be confusion here due to
alias network name used in docker-compose file for pinot's all-in-one deployment called the service manager
https://github.com/hypertrace/hypertrace/blob/main/docker/docker-compose.yml#L102
- usage of Hypetrarace Service
- It is only used in local deployment mode for a quick start, not in SaaS
For our docker-compose setup,
a. we are using pinot's servicemanager role (which runs all pinot's component in a single jvm). In this role as of now, it starts controller first than server and broker in parallel. Refer PR - https://github.com/apache/incubator-pinot/pull/5917/files - which is part of pinot 0.6.0 release
b. HypetraceService is dependent on pinot as ref https://github.com/hypertrace/hypertrace/blob/main/docker/docker-compose.yml#L27. However, instead of using the service_healthy construct, we are using service_started as pinot startup is heavy. On low configuration local envs, it may take more than 60secs which is a limitation (undocumented) for docker-compose for dependent containers. So, if pinot takes more than 60 secs, it will not start HypetraceService. So, we start them parallel, and checks for pinot health internally in HypertraceService.
c. controller health check doesn't tell that server and broker component has started successfully. So, we had requested for a composite health check for all components running as part of servicemanager here - apache/incubator-pinot#5850 https://github.com/apache/incubator-pinot/issues/5850, apache/incubator-pinot#5950 https://github.com/apache/incubator-pinot/issues/5950
d. In servicemanager role, as the controller starts first and server, broker components in parallel, we had noticed a few seconds delay in low config evns before they are successfully available. So, we have added checks for pinot-server and pinot-broker registration before checking for availability of trace/span in pinot` - https://github.com/hypertrace/hypertrace-service/blob/main/hypertrace-service/src/main/java/org/hypertrace/service/HypertraceUIServerTimerTask.java#L147
So, I would suggest adding aliases for all the three components - pinot-server, pinot-broker, and pinot-controller in the docker-compose file to avoid naming confusion, and reverting the checks to pinot-server for now till we have a single endpoint to check the health of all components (we may have to parse though).
Let me know if I am missing something.
cc: @flyinprogrammer https://github.com/flyinprogrammer @jcchavezs https://github.com/jcchavezs @laxmanchekka https://github.com/laxmanchekka @buchi-busireddy https://github.com/buchi-busireddy
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/hypertrace/hypertrace-service/pull/51#issuecomment-744181330, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAXOYARJUDTQUWZ3GPKVVODSUWQEHANCNFSM4US5KV2A .
Description
Pinot 0.6.0 controllers now have a health check: https://github.com/apache/incubator-pinot/pull/5846 So we should just use that instead of confusing people with asking about a single pinot server health check, and making it impossible to reference a controller running on a different host.
Checklist:
Documentation
As far as I can tell this configuration isn't referenced anywhere, but perhaps I am blind.
For reference a Pinot Server: https://docs.pinot.apache.org/developers/advanced/advanced-pinot-setup#pinot-server And Pinot Controller: https://docs.pinot.apache.org/developers/advanced/advanced-pinot-setup#pinot-controller Are two different components of this distributed system. This service only relies on the Controller
/brokers/tenants
API and yet it was using the Server's/health
API in the absence of a/health
endpoint on the controller itself.