Closed: wey-gu closed this issue 6 months ago.
It depends on the Spark Connector's support for Spark 3.0, which is not yet implemented but is scheduled.
@Nicole00 I noticed there's already a pull request adding Spark 3.0 support to the connector. Once that one is merged, is any further work needed here to run the algorithms on Spark 3?
Any update on this?
We cannot use nebula-algorithm since our Spark-Operator framework runs Spark 3.0.
> Any update on this?
> We cannot use nebula-algorithm since our Spark-Operator framework runs Spark 3.0.
At present, you can work around it temporarily: pull the branch, run maven install for nebula-spark-connector_3.0, and update the Spark Connector version referenced by nebula-algorithm.
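For illustration only (not from the original reply), here is a sketch of the dependency change in nebula-algorithm's pom.xml after installing the connector branch locally; the groupId and version shown are assumptions and should match whatever the branch actually installs into your local Maven repository:

<!-- Hypothetical dependency swap after running mvn install on nebula-spark-connector_3.0.
     The version below is a placeholder; use the one produced by the branch build. -->
<dependency>
  <groupId>com.vesoft</groupId>
  <artifactId>nebula-spark-connector_3.0</artifactId>
  <version>3.0-SNAPSHOT</version>
</dependency>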
> Any update on this?
> We cannot use nebula-algorithm since our Spark-Operator framework runs Spark 3.0.
I also encountered the same problem. The Spark version used by our platform is 3.x (3.2.2). My algorithm does not read Nebula data directly; instead it uses the results of business queries against Nebula, which can be thought of as data containing only source vertices, destination vertices, and weights. I extracted the nebula and nebula-spark-connector source code used by the algorithm into my own project, managed the dependencies myself, and can run algorithms like PageRank normally.
My main modifications are as follows:
1. Source files extracted into my project:
├── base
│   └── client
│       ├── meta_data
│       │   ├── FieldMetaData.java
│       │   └── FieldValueMetaData.java
│       ├── protocol
│       │   ├── ShortStack.java
│       │   ├── TCompactProtocol.java
│       │   ├── TException.java
│       │   ├── TField.java
│       │   ├── TList.java
│       │   ├── TMap.java
│       │   ├── TMessage.java
│       │   ├── TProtocol.java
│       │   ├── TProtocolException.java
│       │   ├── TProtocolFactory.java
│       │   ├── TSet.java
│       │   ├── TStruct.java
│       │   └── TTransportException.java
│       ├── schema
│       │   ├── IScheme.java
│       │   ├── SchemeFactory.java
│       │   └── StandardScheme.java
│       ├── thrift
│       │   └── TBase.java
│       └── transport
│           ├── TException.java
│           ├── TTransport.java
│           └── TTransportException.java
├── config
│   ├── AlgoConfig.scala
│   └── SparkConfigEntry.scala
├── examples
│   └── PageRankExample.scala
├── lib
│   └── PageRankAlgo.scala
├── reader
│   └── ReadData.scala
└── utils
    ├── DecodeUtil.scala
    └── NebulaUtil.scala
13 directories, 29 files
2. pom.xml
<properties>
  <maven.compiler.source>8</maven.compiler.source>
  <maven.compiler.target>8</maven.compiler.target>
  <scala.version>2.12</scala.version>
  <spark.version>3.2.2</spark.version>
  <lombok.version>1.18.28</lombok.version>
  <config.version>1.4.0</config.version>
  <scopt.version>3.7.1</scopt.version>
</properties>

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_${scala.version}</artifactId>
  <version>${spark.version}</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql-kafka-0-10_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-graphx_${scala.version}</artifactId>
  <version>${spark.version}</version>
</dependency>
<dependency>
  <groupId>com.typesafe</groupId>
  <artifactId>config</artifactId>
  <version>${config.version}</version>
</dependency>
<dependency>
  <groupId>com.github.scopt</groupId>
  <artifactId>scopt_${scala.version}</artifactId>
  <version>${scopt.version}</version>
</dependency>
Just follow this approach to add your own algorithm source code and manage the dependencies yourself; a minimal standalone sketch follows below.
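For completeness, here is a minimal sketch of the kind of standalone run this approach enables: PageRank over an edge list of (source, destination, weight) using plain Spark GraphX 3.x, bypassing the Nebula reader entirely. The object name, column names, and the tiny inline sample data are illustrative assumptions, not code from this project:

import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.sql.SparkSession

// Hypothetical standalone PageRank over (src, dst, weight) edges with plain GraphX.
object StandalonePageRank {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("pagerank-without-nebula-reader")
      .getOrCreate()
    import spark.implicits._

    // Assume the business query against Nebula already produced rows of
    // (srcId: Long, dstId: Long, weight: Double); a tiny inline sample stands in here.
    val edgesDF = spark.createDataFrame(Seq(
      (1L, 2L, 1.0),
      (2L, 3L, 0.5),
      (3L, 1L, 2.0)
    )).toDF("src", "dst", "weight")

    // Build a GraphX graph from the edge triples; vertices get a default attribute of 1.0.
    val edgeRDD = edgesDF.rdd.map(r =>
      Edge(r.getAs[Long]("src"), r.getAs[Long]("dst"), r.getAs[Double]("weight")))
    val graph = Graph.fromEdges(edgeRDD, 1.0)

    // Run PageRank until convergence (tolerance 0.0001) and print the ranks.
    graph.pageRank(0.0001).vertices.toDF("vertexId", "rank").show()

    spark.stop()
  }
}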
Hope it helps you
Can we take this as a higher priority?
like https://github.com/vesoft-inc/nebula-exchange/pull/41
Update (April 2023): the Spark Connector now supports Spark 3.0.