Open etiennedi opened 5 years ago
Regarding the second config, `spark-cassandra.properties`, this is a janus configuration. Should this override the default janus config we use, or is it additional configuration (i.e. can we pass both in)?
It is technically a janus configuration, but it's only consumed by the analytics app. We are using the analytics app to talk to the backends (Cassandra, ES) with Janusgraph libraries (that's one of the reasons why it has to be written in Java).
The only place I can find it currently being used is to be mounted in the Janus image within the compose file: https://github.com/semi-technologies/janus-spark-analytics/blob/70a676c5d1f8412de9203b47bfeefdd2e99a70d0/docker-compose.yml#L29
Is this still to be added to the analytics piece?
Good point, let me check. It is definitely used by the analytics app. Maybe I'm just copying it while building the Dockerfile and it works since there's only one version. I'll get back to you.
Yeah, I think that's the case, it happens to be part of the docker image, that's why it's working. The app loads it from the file system, as configured here: https://github.com/semi-technologies/janus-spark-analytics/blob/70a676c5d1f8412de9203b47bfeefdd2e99a70d0/analytics.yml#L8
By the way, the janus container in the docker-compose file (of the analytics app) is only used for testing. I need it to insert data into the graph, which the analytics app can then read back during the test run.
Ok, I missed that, thanks. I'll add it in.
@etiennedi how will this be used within the cluster? I guess it has an API, so I'll need to expose it with a service. Is there an expected host/port we want to run it on? (Ideally we would use port 80.)
Correct, it has an http API, the port can be configured here: https://github.com/semi-technologies/janus-spark-analytics/blob/70a676c5d1f8412de9203b47bfeefdd2e99a70d0/analytics.yml#L4-L6
Port 80 should be fine (unless this Java stack somehow has an issue with privileged ports?).
Ok good. I propose to name the service `janus-spark-analytics` to be explicit; this would become the domain name for anyone wanting to connect to it. I could also use `analytics`, as this is what the test app seems to use. Any preference @etiennedi?
I think `janus-spark-analytics` is better, in case we want to introduce other analytics backends in the future.
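For illustration, the naming and port decisions above could translate into a Service roughly like the one below. This is only a sketch, not the chart's actual manifest: the `app` label and the named `http` target port are assumptions, and the port the app really listens on is whatever is configured in `analytics.yml`.

```yaml
# Sketch only: the selector label and targetPort name are assumptions,
# not taken from the existing chart.
apiVersion: v1
kind: Service
metadata:
  # The Service name becomes the in-cluster DNS name clients connect to.
  name: janus-spark-analytics
spec:
  selector:
    app: janus-spark-analytics   # assumed pod label
  ports:
    - name: http
      port: 80           # port exposed to clients inside the cluster
      targetPort: http   # assumed named container port on the analytics pod
```

Because `port` and `targetPort` are independent, the Service can expose port 80 while the Java process inside the container listens on an unprivileged port, which would sidestep the privileged-port question. Clients in the same namespace would then reach the API at `http://janus-spark-analytics`.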
We're pushing docker images as part of https://github.com/SeMI-network/janus-spark-analytics/issues/3
We want to have them included in the chart if a `spark` flag is set somewhere.

The app depends on two configuration files, which should be mounted from a config map:
The app is mostly stateless, a simple `Deployment` should be sufficient. It'll receive very low traffic, so a single replica should be fine.
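As a rough sketch of how these requirements might look as a Helm template: a single-replica Deployment, rendered only when a flag is set, with the configuration files mounted from a config map. Everything not stated above is an assumption made for illustration: the value names `spark.enabled` and `spark.image`, the ConfigMap name, the mount path, the labels, and the container port (the real port and config path are whatever `analytics.yml` specifies).

```yaml
# Sketch only. The values keys (spark.enabled, spark.image), ConfigMap name,
# mount path, labels and container port below are hypothetical.
{{- if .Values.spark.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: janus-spark-analytics
spec:
  replicas: 1   # very low traffic expected, so one replica
  selector:
    matchLabels:
      app: janus-spark-analytics
  template:
    metadata:
      labels:
        app: janus-spark-analytics
    spec:
      containers:
        - name: analytics
          image: "{{ .Values.spark.image }}"   # hypothetical value key
          ports:
            - name: http
              containerPort: 8080   # assumed; must match the port in analytics.yml
          volumeMounts:
            # Each key in the ConfigMap appears as a file under mountPath.
            - name: config
              mountPath: /etc/analytics   # assumed path
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: janus-spark-analytics-config   # hypothetical ConfigMap name
{{- end }}
```

The ConfigMap referenced here would hold the two configuration files (for example `spark-cassandra.properties`, discussed in the comments). Since the app loads that file from the file system, the mount path has to line up with the path configured in `analytics.yml`.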