Closed agrajm closed 4 years ago
Hello, nice picture. Unfortunately, out-of-the-box Atlas support was dropped as of Spline 0.4. Also, we switched to ArangoDB in 0.4, so it won't work with MongoDB anymore.
You can read more about Atlas support here: #279
@cerveada let's forget Atlas for now. I'm happy to replace MongoDB with ArangoDB; in fact, I have it installed on the same VM that runs the Spline REST Gateway and Spline UI for my use case. My question remains: how do I configure the Gateway with authentication when pushing the lineage captured by Spline running on Azure Databricks? I've already set up ArangoDB with authentication, but it would help to have some documentation on how to use ArangoDB as the persistence layer, such as which Spline properties to use.
Include the credentials in the database URL config, spline.database.connectionUrl, the same way as for the admin tool URL. #666
It's described in the documentation, but without the authentication.
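As a rough illustration of the suggestion above, here is a minimal Python sketch that assembles a connection URL with embedded credentials and prints the resulting property line. It assumes the usual `scheme://user:password@host:port/database` layout; the user name, password, host, and database name are hypothetical placeholders, and 8529 is ArangoDB's default port. Whether special characters in the password need escaping is not covered in this thread, so the sketch does not attempt it.

```python
def arango_connection_url(user, password, host, database, port=8529):
    """Build an ArangoDB-style connection URL with embedded credentials.

    Assumes the layout arangodb://user:password@host:port/database.
    All values passed in below are hypothetical placeholders.
    """
    return f"arangodb://{user}:{password}@{host}:{port}/{database}"


# Hypothetical values for illustration only.
url = arango_connection_url("spline_user", "s3cret", "localhost", "spline")

# The property name comes from the reply above; the value is the sketch's output.
property_line = f"spline.database.connectionUrl={url}"
print(property_line)
```

The same URL string would then be passed to the admin tool when initializing the database, so both sides use identical credentials.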
Background
I'm trying to achieve the following setup: capture data lineage from Spark jobs running on Azure Databricks using Spline, store the lineage in MongoDB (using spline-persistence-mongo), and then visualize it using the Spline UI. Please see the attached high-level architecture diagram.
Please note that
Questions
I need to set up the REST Gateway in an authenticated mode so that communication between Databricks and the REST Gateway is secured. How do I configure spline.producer.url to include authentication (username/password), or is there some other Spline configuration to achieve that?
I've currently installed the following JARs on my cluster:
Do I need the agent-core jar as well, or is just the spline-agent-bundle jar for the specific Spark versions enough?
My goal is to eventually capture the lineage in Atlas, but I'm trying MongoDB first; if this works, I plan to capture it in Atlas instead. I saw somewhere that Spline 0.4 started supporting Atlas as persistence, but I have yet to figure out the details.
Regards, Agraj