AbsaOSS / spline

Data Lineage Tracking And Visualization Solution
https://absaoss.github.io/spline/
Apache License 2.0

Spline UI Crashes : Type error 'e' is not defined #461

Closed devshank closed 4 years ago

devshank commented 4 years ago

I am trying to fetch lineage data for Spark transformations executed on Databricks. The lineage data is pushed to MongoDB and visualized in a Spline UI set up on an Azure HDInsight cluster. I am using the JAR file 'spline-web-0.3.9-exec-war.jar', which was downloaded from the Spline docs page: 'https://absaoss.github.io/spline/'

spline-ui-download

I can see the lineage data of the Spark Jobs populate on the Spline UI but when I click it to view the lineage graph, it goes into a perpetual loading state with the following error in the UI console.

ERROR TypeError: "e is undefined"
buildAttrTree attribute-list.component.ts:76
attrTree attribute-list.component.ts:34
set attribute-list.component.ts:34
GC bundle.vendors~main.js:765
MM bundle.vendors~main.js:801
MM bundle.vendors~main.js:801
MM bundle.vendors~main.js:801
JM bundle.vendors~main.js:808
View_e1 e.ngfactory.js:45
updateDirectives bundle.vendors~main.js:808
CM bundle.vendors~main.js:801
DM bundle.vendors~main.js:801
mM bundle.vendors~main.js:801
CM bundle.vendors~main.js:801
DM bundle.vendors~main.js:801
NM bundle.vendors~main.js:801
CM bundle.vendors~main.js:801
DM bundle.vendors~main.js:801
mM bundle.vendors~main.js:801
CM bundle.vendors~main.js:801
DM bundle.vendors~main.js:801
NM bundle.vendors~main.js:801
CM bundle.vendors~main.js:801
DM bundle.vendors~main.js:801
mM bundle.vendors~main.js:801
CM bundle.vendors~main.js:801
DM bundle.vendors~main.js:801
NM bundle.vendors~main.js:801
CM bundle.vendors~main.js:801
detectChanges bundle.vendors~main.js:758
tick bundle.vendors~main.js:641
tick bundle.vendors~main.js:641
next bundle.vendors~main.js:641
invoke bundle.vendors~main.js:2373
onInvoke bundle.vendors~main.js:627
invoke bundle.vendors~main.js:2373
run bundle.vendors~main.js:2373
run bundle.vendors~main.js:627
next bundle.vendors~main.js:641
r bundle.vendors~main.js:436
__tryOrUnsub bundle.vendors~main.js:1070
next bundle.vendors~main.js:1070
_next bundle.vendors~main.js:1070
next bundle.vendors~main.js:1070
next bundle.vendors~main.js:1166
emit bundle.vendors~main.js:436
pd bundle.vendors~main.js:627
Md bundle.vendors~main.js:627
onInvokeTask bundle.vendors~main.js:627
invokeTask bundle.vendors~main.js:2373
runTask bundle.vendors~main.js:2373
invokeTask bundle.vendors~main.js:2373
invoke bundle.vendors~main.js:2373
0 bundle.vendors~main.js:2493

Dashboard spline-ui-dashboard

Inspecting console spline-ui-inspect

Inspecting CSS spline-ui-css

UI Response with lineage data spline-ui-response

wajda commented 4 years ago

Interesting. Can you share the full JSON that you get from the REST call? (the one pictured on the last screenshot). Thanks.

devshank commented 4 years ago

{
  "JSON": {
    "operations": [
      {
        "_typeHint": "za.co.absa.spline.model.op.Composite",
        "mainProps": {
          "id": "b8aefdc7-f050-4b0a-b21b-f928bbe0bb78",
          "name": "Databricks Shell",
          "inputs": [],
          "output": "b8aefdc7-f050-4b0a-b21b-f928bbe0bb78"
        },
        "sources": [
          {
            "type": "CSV",
            "path": "dbfs:/FileStore/tables/date_dim.csv",
            "datasetsIds": []
          },
          {
            "type": "CSV",
            "path": "dbfs:/FileStore/tables/pa0000.csv",
            "datasetsIds": []
          },
          {
            "type": "CSV",
            "path": "dbfs:/FileStore/tables/pa0001.csv",
            "datasetsIds": []
          },
          {
            "type": "CSV",
            "path": "dbfs:/FileStore/tables/hrp1001.csv",
            "datasetsIds": []
          }
        ],
        "destination": {
          "type": "Parquet",
          "path": "dbfs:/mnt/lineage_data",
          "datasetsIds": [
            "b8aefdc7-f050-4b0a-b21b-f928bbe0bb78"
          ]
        },
        "timestamp": 1573727527842,
        "appId": "app-20191114065932-0000",
        "appName": "Databricks Shell"
      }
    ],
    "datasets": [
      {
        "id": "b8aefdc7-f050-4b0a-b21b-f928bbe0bb78",
        "schema": {
          "attrs": [
            "9f7ea420-95bb-4d56-a948-3765f97af2df",
            "03e81a6a-6b3c-4005-849b-b4329b9518b6",
            "7013fbb3-f672-457e-8c18-e9cc48015372",
            "3580185c-b9cc-47a3-abab-324441494583",
            "528aa75a-9003-4f83-8576-1355f3575c7d",
            "4f347a7f-7cda-439e-9e84-f3df4cbdd89d"
          ]
        }
      }
    ],
    "attributes": [],
    "dataTypes": []
  },
  "Response payload": {
    "EDITOR_CONFIG": {
      "text": "{\"operations\":[{\"_typeHint\":\"za.co.absa.spline.model.op.Composite\",\"mainProps\":{\"id\":\"b8aefdc7-f050-4b0a-b21b-f928bbe0bb78\",\"name\":\"Databricks Shell\",\"inputs\":[],\"output\":\"b8aefdc7-f050-4b0a-b21b-f928bbe0bb78\"},\"sources\":[{\"type\":\"CSV\",\"path\":\"dbfs:/FileStore/tables/date_dim.csv\",\"datasetsIds\":[]},{\"type\":\"CSV\",\"path\":\"dbfs:/FileStore/tables/pa0000.csv\",\"datasetsIds\":[]},{\"type\":\"CSV\",\"path\":\"dbfs:/FileStore/tables/pa0001.csv\",\"datasetsIds\":[]},{\"type\":\"CSV\",\"path\":\"dbfs:/FileStore/tables/hrp1001.csv\",\"datasetsIds\":[]}],\"destination\":{\"type\":\"Parquet\",\"path\":\"dbfs:/mnt/lineage_data\",\"datasetsIds\":[\"b8aefdc7-f050-4b0a-b21b-f928bbe0bb78\"]},\"timestamp\":1573727527842,\"appId\":\"app-20191114065932-0000\",\"appName\":\"Databricks Shell\"}],\"datasets\":[{\"id\":\"b8aefdc7-f050-4b0a-b21b-f928bbe0bb78\",\"schema\":{\"attrs\":[\"9f7ea420-95bb-4d56-a948-3765f97af2df\",\"03e81a6a-6b3c-4005-849b-b4329b9518b6\",\"7013fbb3-f672-457e-8c18-e9cc48015372\",\"3580185c-b9cc-47a3-abab-324441494583\",\"528aa75a-9003-4f83-8576-1355f3575c7d\",\"4f347a7f-7cda-439e-9e84-f3df4cbdd89d\"]}}],\"attributes\":[],\"dataTypes\":[]}",
      "mode": "application/json"
    }
  }
}

wajda commented 4 years ago

Looks like an incomplete server response. That's obviously a bug. But since Spline 0.3 is EOL I'd encourage you to try Spline 0.4 (develop branch), and see if the issue is still there.
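One way to see why the response looks incomplete is to check whether every attribute ID listed under a dataset's schema.attrs has a matching entry in attributes. The following is only an illustrative Python sketch, not Spline code. It assumes that each entry in attributes would carry an "id" field, which the empty array in the posted payload cannot confirm, and the function name is made up for this example.

```python
import json

# Trimmed copy of the payload posted above (two of the six attribute IDs kept).
PAYLOAD = """
{
  "datasets": [
    {
      "id": "b8aefdc7-f050-4b0a-b21b-f928bbe0bb78",
      "schema": {
        "attrs": [
          "9f7ea420-95bb-4d56-a948-3765f97af2df",
          "03e81a6a-6b3c-4005-849b-b4329b9518b6"
        ]
      }
    }
  ],
  "attributes": [],
  "dataTypes": []
}
"""


def find_dangling_attr_ids(lineage):
    """Return {dataset_id: [attribute ids the schema references but `attributes` lacks]}."""
    # Assumption: attribute entries expose their identifier under "id".
    known_ids = {attr["id"] for attr in lineage.get("attributes", [])}
    dangling = {}
    for dataset in lineage.get("datasets", []):
        referenced = dataset.get("schema", {}).get("attrs", [])
        missing = [attr_id for attr_id in referenced if attr_id not in known_ids]
        if missing:
            dangling[dataset["id"]] = missing
    return dangling


dangling = find_dangling_attr_ids(json.loads(PAYLOAD))
for dataset_id, missing in dangling.items():
    print(f"dataset {dataset_id}: {len(missing)} attribute id(s) without a definition")
```

Run against the full payload, the same check would flag all six IDs. That would be consistent with the UI failing in buildAttrTree while resolving an attribute that is not there, although that link is inferred from the stack trace rather than confirmed in the UI source.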

devshank commented 4 years ago

However, I have been able to visualize lineage successfully for a simple Spark job that loads data from CSV files, does a simple inner join and writes the result to DBFS. image4

That led me to compare the two Spark scripts. What I found is that the Spark shell job whose lineage renders correctly loads data into Spark from a file, whereas the script whose lineage breaks the UI loads data into Spark from a table in DBFS.

Once I changed my Spark code to read from a file, I was able to generate the lineage in the UI as shown below: image

Could you help me understand the root cause? Does Spline not track lineage if the starting point is a table and not a file?

Thanks.

wajda commented 4 years ago

It should, but apparently it wasn't tested enough in Spline 0.3. But there was a major rewrite in Spline 0.4, including adapters for different source types in Spark. Could you please try it with Spline 0.4? If the issue is still there it might be related specifically to DBFS support. We'll address it separately then.

wajda commented 4 years ago

@cerveada

devshank commented 4 years ago

Hi Alex, I could not find a 0.4 version for spline libraries in maven central.

image

I guess I will have to build it.

wajda commented 4 years ago

No, it's not released yet. You need to build it from sources. You need to have Maven and NodeJS installed. Then check out the develop branch and run: mvn install -DskipTests

See the detailed instructions here - https://github.com/AbsaOSS/spline/blob/develop/examples/README.md (use 0.4.0-SNAPSHOT instead of 0.4.0 in the commands where appropriate until 0.4.0 is released)

devshank commented 4 years ago

I tried the steps described in the README.md: image

I am getting a 404 status for the URL. image

cerveada commented 4 years ago

Hello, sorry for the confusion. The README is prepared for version 0.4, which should be released soon but hasn't been yet.

I think Alex meant you should clone the repo and build directly from develop branch.

devshank commented 4 years ago

Hello Adam, I tried that too. The build still seems to fail: image

devshank commented 4 years ago

Here are the DEBUG logs:

[INFO] Executing tasks
Build sequence for target(s) `main' is [main]
Complete build sequence is [main, ]

main:
     [echo] installing redoc-cli
     [exec] Current OS is Linux
     [exec] Executing 'npm' with arguments:
     [exec] 'install'
     [exec] 'redoc-cli'
     [exec] '--no-color'
     [exec]
     [exec] The ' characters around the executable and arguments are
     [exec] not part of the command.
Execute:Java13CommandLauncher: Executing 'npm' with arguments:
'install'
'redoc-cli'
'--no-color'

The ' characters around the executable and arguments are
not part of the command.
     [exec] WARN engine redoc-cli@0.9.2: wanted: {"node":">= 8"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine redoc-cli@0.9.2: wanted: {"node":">= 8"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine chokidar@3.3.0: wanted: {"node":">= 8.10.0"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine redoc@2.0.0-rc.16: wanted: {"node":">=6.9","npm":">=3.0.0"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine glob-parent@5.1.0: wanted: {"node":">= 6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine anymatch@3.1.1: wanted: {"node":">= 8"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine readdirp@3.2.0: wanted: {"node":">= 8"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine braces@3.0.2: wanted: {"node":">=8"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine fsevents@2.1.2: wanted: {"node":"^8.16.0 || ^10.6.0 || >=11.0.0"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine is-binary-path@2.1.0: wanted: {"node":">=8"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine picomatch@2.1.1: wanted: {"node":">=8.6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine fill-range@7.0.1: wanted: {"node":">=8"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine to-regex-range@5.0.1: wanted: {"node":">=8.0"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine binary-extensions@2.0.0: wanted: {"node":">=8"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine react-hot-loader@4.12.17: wanted: {"node":">= 6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine source-map@0.7.3: wanted: {"node":">= 8"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine yaml@1.7.2: wanted: {"node":">= 6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine leven@3.1.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine find-up@3.0.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine os-locale@3.1.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine locate-path@3.0.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine p-locate@3.0.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine p-limit@2.2.1: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine p-try@2.2.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine execa@1.0.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine mem@4.3.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine lcid@2.0.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine get-stream@4.1.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine cross-spawn@6.0.5: wanted: {"node":">=4.8"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine p-is-promise@2.1.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine map-age-cleaner@0.1.3: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine mimic-fn@2.1.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine camelcase@5.3.1: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine @babel/parser@7.7.3: wanted: {"node":">=6.0.0"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine get-caller-file@2.0.5: wanted: {"node":"6.* || 8.* || >= 10.*"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine string-width@3.1.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine strip-ansi@5.2.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine wrap-ansi@5.1.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] WARN engine ansi-regex@4.1.0: wanted: {"node":">=6"} (current: {"node":"4.2.6","npm":"3.5.2"})
     [exec] /home/sshuser/spline/rest-gateway
     [exec] └── redoc-cli@0.9.2
     [exec]
     [exec] npm WARN optional Skipping failed optional dependency /chokidar/fsevents:
     [exec] npm WARN notsup Not compatible with your operating system or architecture: fsevents@2.1.2
     [exec] npm WARN enoent ENOENT: no such file or directory, open '/home/sshuser/spline/rest-gateway/package.json'
     [exec] npm WARN react-hot-loader@4.12.17 requires a peer of @types/react@^15.0.0 || ^16.0.0 but none was installed.
     [exec] npm WARN rest-gateway No description
     [exec] npm WARN rest-gateway No repository field.
     [exec] npm WARN rest-gateway No README data
     [exec] npm WARN rest-gateway No license field.
     [echo] generate consumer documentation
     [exec] Current OS is Linux
     [exec] Executing 'node_modules/.bin/redoc-cli' with arguments:
     [exec] 'bundle'
     [exec] '-o'
     [exec] '/home/sshuser/spline/rest-gateway/target/api/docs/consumer.html'
     [exec] '/home/sshuser/spline/rest-gateway/target/api/docs/consumerSwagger.json'
     [exec]
     [exec] The ' characters around the executable and arguments are
     [exec] not part of the command.
Execute:Java13CommandLauncher: Executing 'node_modules/.bin/redoc-cli' with arguments:
'bundle'
'-o'
'/home/sshuser/spline/rest-gateway/target/api/docs/consumer.html'
'/home/sshuser/spline/rest-gateway/target/api/docs/consumerSwagger.json'

The ' characters around the executable and arguments are
not part of the command.
     [exec] /home/sshuser/spline/rest-gateway/node_modules/redoc-cli/index.js:122
     [exec] function serve(port, pathToSpec, options = {}) {
     [exec]                                          ^
     [exec]
     [exec] SyntaxError: Unexpected token =
     [exec]     at exports.runInThisContext (vm.js:53:16)
     [exec]     at Module._compile (module.js:374:25)
     [exec]     at Object.Module._extensions..js (module.js:417:10)
     [exec]     at Module.load (module.js:344:32)
     [exec]     at Function.Module._load (module.js:301:12)
     [exec]     at Function.Module.runMain (module.js:442:10)
     [exec]     at startup (node.js:136:18)
     [exec]     at node.js:966:3
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Spline ............................................. SUCCESS [  1.299 s]
[INFO] commons ............................................ SUCCESS [  1.993 s]
[INFO] producer-rest-model ................................ SUCCESS [  0.042 s]
[INFO] spark-agent ........................................ SUCCESS [ 10.409 s]
[INFO] migrator-tool ...................................... SUCCESS [01:22 min]
[INFO] rest-api-doc-generator ............................. SUCCESS [01:03 min]
[INFO] persistence ........................................ SUCCESS [  3.639 s]
[INFO] producer-services .................................. SUCCESS [  0.078 s]
[INFO] consumer-services .................................. SUCCESS [  0.125 s]
[INFO] producer-rest-core ................................. SUCCESS [  4.212 s]
[INFO] consumer-rest-core ................................. SUCCESS [  0.081 s]
[INFO] rest-gateway ....................................... FAILURE [ 28.563 s]
[INFO] admin .............................................. SKIPPED
[INFO] client-ui .......................................... SKIPPED
[INFO] absaoss-spline-client .............................. SKIPPED
[INFO] client-web ......................................... SKIPPED
[INFO] spark-bundle-2.2 ................................... SKIPPED
[INFO] spark-bundle-2.3 ................................... SKIPPED
[INFO] spark-bundle-2.4 ................................... SKIPPED
[INFO] examples ........................................... SKIPPED
[INFO] integration-tests .................................. SKIPPED
[INFO] spline ............................................. SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:16 min
[INFO] Finished at: 2019-11-15T13:06:24+00:00
[INFO] Final Memory: 88M/1714M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.8:run (default) on project rest-gateway: An Ant BuildException has occured: exec returned: 1
[ERROR] around Ant part ...<exec failonerror="true" executable="node_modules/.bin/redoc-cli" osfamily="unix">... @ 15:85 in /home/sshuser/spline/rest-gateway/target/antrun/build-main.xml
[ERROR] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.8:run (default) on project rest-gateway: An Ant BuildException has occured: exec returned: 1
around Ant part ...<exec failonerror="true" executable="node_modules/.bin/redoc-cli" osfamily="unix">... @ 15:85 in /home/sshuser/spline/rest-gateway/target/antrun/build-main.xml
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
        at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
        at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
        at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
        at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
        at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
        at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
        at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
        at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
        at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
        at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
Caused by: org.apache.maven.plugin.MojoExecutionException: An Ant BuildException has occured: exec returned: 1
around Ant part ...<exec failonerror="true" executable="node_modules/.bin/redoc-cli" osfamily="unix">... @ 15:85 in /home/sshuser/spline/rest-gateway/target/antrun/build-main.xml
        at org.apache.maven.plugin.antrun.AntRunMojo.execute(AntRunMojo.java:342)
        at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
        ... 20 more
Caused by: /home/sshuser/spline/rest-gateway/target/antrun/build-main.xml:15: exec returned: 1
        at org.apache.tools.ant.taskdefs.ExecTask.runExecute(ExecTask.java:643)
        at org.apache.tools.ant.taskdefs.ExecTask.runExec(ExecTask.java:669)
        at org.apache.tools.ant.taskdefs.ExecTask.execute(ExecTask.java:495)
        at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
        at org.apache.tools.ant.Task.perform(Task.java:348)
        at org.apache.tools.ant.Target.execute(Target.java:435)
        at org.apache.tools.ant.Target.performTasks(Target.java:456)
        at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1393)
        at org.apache.tools.ant.Project.executeTarget(Project.java:1364)
        at org.apache.maven.plugin.antrun.AntRunMojo.execute(AntRunMojo.java:313)
        ... 22 more
[ERROR]
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :rest-gateway

cerveada commented 4 years ago

The build-main.xml:15 referenced in the log is running redoc-cli.

You seem to be using node 4, but redoc wants 8 or more:

[exec] WARN engine redoc-cli@0.9.2: wanted: {"node":">= 8"} (current: {"node":"4.2.6","npm":"3.5.2"})

I would try to update node to version 9 or higher (I'm using 12) and try to build it again.
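Before re-running the Maven build, the Node version can be checked against the minimum that redoc-cli reports in the log. This is a small Python sketch for illustration only. The helper names are made up, and the threshold of 8 is taken from the "wanted" field of the warning quoted above.

```python
import re
import shutil
import subprocess


def parse_node_version(text):
    """Turn 'v4.2.6' or '4.2.6' into the tuple (4, 2, 6)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised Node version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(version_text, minimum_major=8):
    """True when the major version is at least `minimum_major`."""
    return parse_node_version(version_text)[0] >= minimum_major


def installed_node_version():
    """Return the output of `node --version`, or None if node is not on PATH."""
    if shutil.which("node") is None:
        return None
    result = subprocess.run(
        ["node", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    version = installed_node_version()
    if version is None:
        print("node not found on PATH")
    elif meets_minimum(version):
        print(f"node {version} is new enough for redoc-cli")
    else:
        print(f"node {version} is too old; redoc-cli@0.9.2 wants >= 8")
```

With the 4.2.6 install from the log, meets_minimum returns False, which matches the SyntaxError on the default parameter syntax in redoc-cli/index.js.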

devshank commented 4 years ago

I tried building with Node v10. I noticed that it uses ArangoDB instead of MongoDB.

I am stuck at the point where we bring up spline UI using the command:

java -jar admin/target/admin-0.4.0.jar db-init arangodb://localhost/spline

(refer https://github.com/AbsaOSS/spline/blob/develop/examples/README.md)

I am getting a 401:Unauthorized status response. Please help.

cerveada commented 4 years ago

OK, if you built Spline successfully you should have three main artifacts: rest-gateway and client-web, which need to run on Tomcat, and spark-agent, which you need to add as a dependency of your Spark app, as shown in the examples.

You also need ArangoDB running. The command in your last comment is actually supposed to initialize tables in the database, so you need to provide the correct URL for ArangoDB.

It's important to provide properties for all the applications: the gateway needs to know the URL of the database, and the other two need to know the URLs of the REST APIs.

Spline Schema-1

wajda commented 4 years ago

"I am getting a 401:Unauthorized status response. Please help"

Do you have ArangoDB authentication enabled? If yes, you need to include the user and password in the connection URL, e.g. arangodb://user:password@localhost/spline. Or disable ArangoDB auth.
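As an illustration of the URL shape described here, the following Python sketch assembles the connection URL and the db-init argument list quoted earlier in the thread. The helper names are hypothetical. Percent-encoding the credentials is a defensive assumption for passwords that contain characters such as '@' or ':'. Whether Spline decodes such encoding is not confirmed in this thread, so plain credentials are the safer test.

```python
from urllib.parse import quote


def arango_url(host, database, user=None, password=None):
    """Build arangodb://[user[:password]@]host/database."""
    credentials = ""
    if user is not None:
        # Assumption: special characters in credentials are percent-encoded.
        credentials = quote(user, safe="")
        if password is not None:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    return f"arangodb://{credentials}{host}/{database}"


def db_init_command(url, admin_jar="admin/target/admin-0.4.0.jar"):
    """Argument list for the db-init call quoted earlier in the thread."""
    return ["java", "-jar", admin_jar, "db-init", url]


command = db_init_command(arango_url("localhost", "spline", "user", "password"))
print(" ".join(command))
```

Passing the arguments as a list (for example to subprocess.run) avoids shell quoting problems if the password contains spaces or shell metacharacters.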

devshank commented 4 years ago

The maven build is failing after it builds the admin-0.4.0-SNAPSHOT.jar. Also the other modules require Darwin whereas I am using an Ubuntu OS. So I decided to go with the docker approach. I have set up Spline UI and the rest server.

I already have lineage data recorded using Spline v0.3.9 as a MongoDB dump. Will I be able to import this dump into ArangoDB and visualise the data in the UI?

wajda commented 4 years ago

Absolutely:

  1. Have your Spline Gateway running
  2. Execute
    java -jar migrator-tool-0.4.0-SNAPSHOT.jar -s mongodb://...... -t http://localhost:8080/producer

    where 'http://localhost:8080/' is the root of Spline Gateway URL

For more info run:

java -jar migrator-tool-0.4.0-SNAPSHOT.jar --help

devshank commented 4 years ago

I am able to view the Spline UI as shown:

image

When I try to view the lineage graph by opening one of the jobs, I get an error like:

image

Also, I could see this appear on the ArangoDB console.


11:45:47.451 [ForkJoinPool-1-worker-7] WARN z.c.a.spline.persistence.Persister$ - Got an error, retrying... (4 attempts left): Response: 409, Error: 1210 - unique constraint violated - in index idx_1650630487667900416 of type hash over 'uri'; conflicting key: b94c8f00-24df-4cfc-9111-84abfd9186f0

devshank commented 4 years ago

Also, I could not find this dependency on Maven Central.

image

wajda commented 4 years ago

4.0.0 is a typo, should be 0.4.0. But as was mentioned a few times above, please do not take the README verbatim. It is prepared for the 0.4.0 release, which is not published yet. As for Docker, it only contains a few-week-old snapshot images for testing purposes. So please build Spline locally if you want fresh stuff. Ubuntu is perfectly fine for that, it should just work.

As for the WARN z.c.a.spline.persistence.Persister$ - Got an error, retrying... it's a warning, not an error. It's benign. It's basically an optimistic concurrency control in action.
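For readers unfamiliar with the pattern, a retry loop of this general kind can be sketched in Python as follows. This is not Spline's Persister implementation. ConflictError, with_retries and insert_data_source are hypothetical stand-ins, and the attempt count of 5 is only inferred from the "4 attempts left" message in the log.

```python
class ConflictError(Exception):
    """Stand-in for a 409 'unique constraint violated' response."""


def with_retries(operation, attempts=5):
    """Run `operation`; on a conflict, try again until the attempts run out."""
    for remaining in range(attempts - 1, -1, -1):
        try:
            return operation()
        except ConflictError as error:
            if remaining == 0:
                # Out of attempts: surface the conflict to the caller.
                raise
            print(f"Got an error, retrying... ({remaining} attempts left): {error}")


# Demo: the first two calls collide with a concurrent writer, the third succeeds.
calls = {"count": 0}


def insert_data_source():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ConflictError("409 unique constraint violated")
    return "inserted"


result = with_retries(insert_data_source)
```

The warning is logged on each collision, but the caller only sees a failure if every attempt conflicts, which is why an occasional line like the one above can be harmless.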

devshank commented 4 years ago

The Maven build does not complete successfully. It fails with the following error:

image

This happens while building client-ui

image

devshank commented 4 years ago

Also, once spline is set up, how do you configure a Spark application to record lineage to Spline?

Readme suggests to use:

spline.mode=REQUIRED
spline.producer.url=http://localhost:8080/spline

Older config:

spline.mode=REQUIRED
spline.persistence.factory=za.co.absa.spline.persistence.mongo.MongoPersistenceFactory
spline.mongodb.url=mongodb://...
spline.mongodb.name=dbname

I encountered an error while configuring this and running the Spark Job.

image

Sometimes this error does not occur. In that case I executed the Spark job, but the lineage was not getting populated: image

It feels like lineage tracking is not getting enabled. Is there any particular step we should follow to get this active?

Thanks.

wajda commented 4 years ago

Can you try to execute this from the client-ui directory?

npm install
npm run build-prod

wajda commented 4 years ago

spline.persistence.factory is a config property of the old Spline. Looks like it's still in your classpath. Please make sure you have no reference to the old binaries from anywhere. Check Spark libs/, the spark-submit command or your uber jar (if you have any).
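One way to hunt for leftover old binaries is to scan jar files for the pre-0.4 initializer class. The Python sketch below is an illustration under assumptions. It relies on the observation reported in this thread that SparkLineageInitializer lived in za.co.absa.spline.core before 0.4 and in za.co.absa.spline.harvester afterwards. The function names and any directory passed in are hypothetical.

```python
import zipfile
from pathlib import Path

# Location of SparkLineageInitializer before 0.4, as reported in this thread.
OLD_CLASS_PREFIX = "za/co/absa/spline/core/SparkLineageInitializer"


def jar_contains_old_spline(jar_path):
    """True if the jar has an entry for the pre-0.4 initializer class."""
    with zipfile.ZipFile(jar_path) as jar:
        return any(name.startswith(OLD_CLASS_PREFIX) for name in jar.namelist())


def find_old_spline_jars(directory):
    """List jars (searched recursively) that still carry the old class."""
    hits = []
    for jar_path in sorted(Path(directory).rglob("*.jar")):
        try:
            if jar_contains_old_spline(jar_path):
                hits.append(jar_path)
        except zipfile.BadZipFile:
            # Not a readable archive; skip it rather than abort the scan.
            continue
    return hits
```

Calling find_old_spline_jars on the Spark jars directory, or on the folder holding an uber jar, returns the paths that would need to be removed or rebuilt. An empty list does not prove the classpath is clean, since jars can also be pulled in through spark-submit options.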

devshank commented 4 years ago

Let's say I want to track lineage for a Spark job running on a Databricks cluster. How do I enable the Spline listener on the Databricks cluster? Which modules do I need to install to have the following configuration active:

spline.mode=REQUIRED
spline.producer.url=http://localhost:8080/spline

In the README, it talks about a 'spark-agent'. Is this the module to be installed on Databricks cluster to enable spline listener?

devshank commented 4 years ago

I tried

npm install

Looks like it is throwing the same error as that observed during the build failure.


npm install

@angular/cli@8.2.0 postinstall /root/spline/client-ui/node_modules/@angular/cli
node ./bin/postinstall/script.js

internal/modules/cjs/loader.js:638
throw err;
^

Error: Cannot find module '/root/spline/client-ui/node_modules/@angular/cli/bin/postinstall/script.js'
at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15)
at Function.Module._load (internal/modules/cjs/loader.js:562:25)
at Function.Module.runMain (internal/modules/cjs/loader.js:831:12)
at startup (internal/bootstrap/node.js:283:19)
at bootstrapNodeJSCore (internal/bootstrap/node.js:623:3)
npm WARN bootstrap@4.3.1 requires a peer of popper.js@^1.14.7 but none is installed. You must install peer dependencies yourself.
npm WARN cytoscape-ng-lib@2.0.1 requires a peer of @angular/common@^7.0.0 but none is installed. You must install peer dependencies yourself.
npm WARN cytoscape-ng-lib@2.0.1 requires a peer of @angular/core@^7.0.0 but none is installed. You must install peer dependencies yourself.
npm WARN ngx-bootstrap-switch@0.0.3 requires a peer of @angular/core@>=4.0.0 <8.0.0 but none is installed. You must install peer dependencies yourself.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@1.2.9 (node_modules/chokidar/node_modules/fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for fsevents@1.2.9: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@1.2.9 (node_modules/@angular/compiler-cli/node_modules/fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for fsevents@1.2.9: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})

npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! @angular/cli@8.2.0 postinstall: node ./bin/postinstall/script.js
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the @angular/cli@8.2.0 postinstall script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR! /root/.npm/_logs/2019-11-21T05_29_03_641Z-debug.log

wajda commented 4 years ago

Hi @devshank, yes I see that we definitely need to work more on the documentation, and we'll do it shortly.

As for how to use Spline Agent for Spark, there are several ways, choose what suits you the best:

See the PySpark example for reference - https://github.com/AbsaOSS/spline/blob/develop/examples/src/main/scala/za/co/absa/spline/example/batch/python_example.py

wajda commented 4 years ago

For the NPM issue, could you try to remove the node_modules directory, upgrade Node to the latest recommended version (12.13.1 at the moment) and then try again?

devshank commented 4 years ago

@wajda I have tried both Node v10 and v12. I think I have found the issue with enabling the lineage listener. The spark-agent.jar has a different package structure than in previous versions.

The SparkLineageInitializer class is present in the path za\co\absa\spline\harvester rather than za\co\absa\spline\core.

So the statement:

spark._jvm.za.co.absa.spline.core.SparkLineageInitializer.enableLineageTracking(spark._jsparkSession)

needs to be changed to:

spark._jvm.za.co.absa.spline.harvester.SparkLineageInitializer.enableLineageTracking(spark._jsparkSession)

Thanks,
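The package change described above can be wrapped in a small helper so that the same PySpark script works against either build. This is a minimal sketch, not code from the Spline examples. The function name and the package parameter are made up here, and it assumes the spark-agent jar is already on the Spark classpath.

```python
def enable_lineage_tracking(spark, package="harvester"):
    """Call SparkLineageInitializer.enableLineageTracking through the Py4J gateway.

    `spark` is an active pyspark.sql.SparkSession. `package` is "harvester"
    for 0.4 builds and "core" for 0.3.x, as reported in this thread.
    """
    spline = spark._jvm.za.co.absa.spline
    initializer = getattr(spline, package).SparkLineageInitializer
    return initializer.enableLineageTracking(spark._jsparkSession)
```

In a notebook this would be called once, right after the session is created and before any job whose lineage should be captured, for example enable_lineage_tracking(spark) for a 0.4 agent.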

wajda commented 4 years ago

Thanks, will fix.

Regarding NodeJS, yeah it looks like there is something wrong with your environment. We build Spline on different Linux distros (including Ubuntu), MacOS and Windows without any issues.

devshank commented 4 years ago

I have bypassed this issue by using the docker containers for REST server and Client UI.

devshank commented 4 years ago

Hi @wajda. I am using the latest version of Node, but I still cannot build this code with Maven. The build now seems to fail in the 'commons' folder. I have attached the logs below:

image

wajda commented 4 years ago

Hi @devshank, Back to the original issue, have you been able to try your case in Spline 0.4?

BTW Spline 0.4.0 is now released, and all artifacts are available in public Maven and Docker repos. So now it's easy to try it out :)