lakehq / sail

LakeSail's computation framework with a mission to unify stream processing, batch processing, and compute-intensive (AI) workloads.
https://lakesail.com
Apache License 2.0
520 stars 13 forks

Feature request: add support for Spark Connect without pyspark dependencies #198

Open djouallah opened 2 months ago

djouallah commented 2 months ago

It would be really useful to be able to connect to the Sail server without having to install pyspark first.

shehabgamin commented 2 months ago

@djouallah Would the Standalone Server be what you're looking for? You can check it out here: https://docs.lakesail.com/sail/latest/guide/installation/#standalone-server

If not, we'd love to learn more about your needs. And if you're interested, we'd be excited to have your contributions!

djouallah commented 2 months ago

I meant it would be nice to have a Spark Connect Python client that does not depend on pyspark. In an ideal scenario, Java would not be required at all.

shehabgamin commented 2 months ago

Ah, I see. We're actively working on a Python client using Ibis (https://ibis-project.org/) to eliminate the need for PySpark.

linhr commented 2 months ago

Currently, to use the Python Spark Connect client, the pyspark Python package is needed but Java doesn't need to be installed since the jar files in the Python package will not be used. Spark 4.0 will also support a pure Python Spark Connect client via pyspark-connect. This is a lightweight package without jars. Would that be suitable for your needs?
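
For illustration, here is a minimal sketch of what that looks like today, assuming pyspark was installed with its Connect extras (`pip install "pyspark[connect]"`) and that a Sail Spark Connect server is already listening on `localhost:50051`. The host, the port, and the helper names `spark_connect_url` and `connect` are assumptions made for this example, not part of Sail or PySpark.

```python
def spark_connect_url(host: str = "localhost", port: int = 50051) -> str:
    """Build a Spark Connect URL (hypothetical helper; host and port are assumptions)."""
    return f"sc://{host}:{port}"


def connect(url: str):
    """Open a Spark Connect session against an already running server."""
    # Imported lazily so this module still loads where pyspark is not installed.
    from pyspark.sql import SparkSession

    # builder.remote() selects the Spark Connect client, which talks to the
    # server over gRPC instead of launching a local JVM.
    return SparkSession.builder.remote(url).getOrCreate()
```

With the server running, `spark = connect(spark_connect_url())` followed by `spark.sql("SELECT 1 + 1").show()` should run the query on Sail without Java installed on the client machine.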

djouallah commented 2 months ago

> Currently, to use the Python Spark Connect client, the pyspark Python package is needed but Java doesn't need to be installed since the jar files in the Python package will not be used. Spark 4.0 will also support a pure Python Spark Connect client via pyspark-connect. This is a lightweight package without jars. Would that be suitable for your needs?

That's exactly what I am asking for!