AbsaOSS / spline-spark-agent

Spline agent for Apache Spark
https://absaoss.github.io/spline/
Apache License 2.0

Spline doesn't support MaxCompute table storage? #820

Closed by XavierZYXue 2 months ago

XavierZYXue commented 3 months ago

When I try to use a MaxCompute table as Spark storage, the agent does not work properly. I found no plugin for MaxCompute (Alibaba Cloud) in this repo, so I assume it isn't supported, right? Is there any plan to add support?

wajda commented 3 months ago

We have never tried it on MaxCompute or any other Alibaba service, so I cannot say anything on that matter. There are two different APIs in Spark (called DataSource API V1 and V2 respectively) through which Spark interacts with 3rd-party data providers, and through which Spline can receive metadata and capture lineage. If the MaxCompute connector uses V2, then Spline should in theory support it out of the box. If it uses V1, then a separate Spline agent plugin would likely be required to capture lineage from it. It needs more investigation. But as I mentioned in the other ticket, if you could contribute code, we would try to provide maximum support.
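
As a starting point for that investigation, here is a minimal sketch of one way to guess which DataSource API a connector's entry class implements. It looks up Spark's public marker interfaces by name via reflection, so it compiles without Spark on the classpath. The class `DataSourceApiProbe` and its `classify` method are hypothetical helpers and not part of Spline or Spark. The result is only a heuristic: a "V2" answer does not by itself guarantee that Spline captures lineage for the connector.

```java
// Heuristic sketch (not part of Spline or Spark): guess which Spark DataSource API
// a connector's entry class implements, based on well-known marker interfaces.
final class DataSourceApiProbe {

    // Interfaces are referenced by name so this compiles without Spark on the classpath.
    private static final String[] V2_MARKERS = {
        "org.apache.spark.sql.connector.catalog.TableProvider", // Spark 3.x
        "org.apache.spark.sql.sources.v2.DataSourceV2"          // Spark 2.3 / 2.4
    };

    private static final String[] V1_MARKERS = {
        "org.apache.spark.sql.sources.RelationProvider",
        "org.apache.spark.sql.sources.SchemaRelationProvider",
        "org.apache.spark.sql.sources.CreatableRelationProvider"
    };

    // True if the candidate class implements at least one of the named interfaces.
    private static boolean implementsAny(Class<?> candidate, String[] markers) {
        ClassLoader loader = candidate.getClassLoader();
        for (String name : markers) {
            try {
                Class<?> marker = Class.forName(name, false, loader);
                if (marker.isAssignableFrom(candidate)) {
                    return true;
                }
            } catch (ClassNotFoundException | LinkageError e) {
                // This marker is not visible to the candidate's class loader; try the next one.
            }
        }
        return false;
    }

    // Returns "V2", "V1", "unknown" (class loads but matches no marker) or "not-found".
    // V2 is checked first because a connector may implement both APIs.
    static String classify(String className) {
        Class<?> candidate;
        try {
            candidate = Class.forName(className, false, DataSourceApiProbe.class.getClassLoader());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not-found";
        }
        if (implementsAny(candidate, V2_MARKERS)) {
            return "V2";
        }
        if (implementsAny(candidate, V1_MARKERS)) {
            return "V1";
        }
        return "unknown";
    }
}
```

To use it, put the connector jar and Spark on the classpath and call `DataSourceApiProbe.classify(...)` with the fully qualified name of the connector's data source class, which has to be taken from the connector's own documentation or jar. An "unknown" result means the class matched none of the listed interfaces and needs manual inspection.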