Closed SemyonSinchenko closed 8 months ago
I think we can discard the Option in the VertexWriter constructor and just pass a Long, like:
class VertexWriter(
    prefix: String,
    vertexInfo: VertexInfo,
    vertexDf: DataFrame,
    numVertices: Long = -1
) {
  // A negative value means "not provided": fall back to counting the rows.
  private val vertexNum: Long = numVertices match {
    case x if x < 0 => vertexDf.count()
    case _          => numVertices
  }
  // ...
}
This solution makes sense because we never want a user to pass a negative value as the vertex count, so a negative default can safely mean "not provided". What do you think?
I like it! I can implement it; could you assign this task to me? I will update the code and the docstring, and I will check whether it is used in the documentation. Thank you!
Is your feature request related to a problem? Please describe.
Currently, scala.Option[Long] is used in the constructor of VertexWriter. That is good from the Scala point of view, but there is a problem when you try to call this constructor from Python with py4j. The reason is that py4j automatically boxes and unboxes Java primitives. Let's imagine we have a py4j.java_gateway.JVMView object in Python named jvm. Even if you try to pass something like jvm.scala.Some(jvm.java.lang.Long.valueOf(100)) into the constructor, py4j runs the expression in parentheses first, gets a java.lang.Long as the result, unboxes it into a Python int, and passes that int into Some, boxing it to java.lang.Integer. That produces Some[Int], not Some[Long]. And because the constructor takes an Option, py4j cannot tell which type is expected and cannot box the int into jvm.java.lang.Long. I tried different tricks, but each time I got java.lang.Integer cannot be cast to java.lang.Long. There is a similar discussion in the py4j issues, which has been open since 2019, and it looks like there is no solution. So currently the only way to create a VertexWriter from Python is to pass jvm.scala.Option.empty(); there is no way to pass the optional parameter numVertices.

Describe the solution you'd like
Make two constructors for VertexWriter: the first like VertexWriter(prefix: String, vertexInfo: VertexInfo, vertexDf: DataFrame, numVertices: Long) and the second like VertexWriter(prefix: String, vertexInfo: VertexInfo, vertexDf: DataFrame).

Describe alternatives you've considered
An alternative solution is to add an additional method to the VertexInfo companion object, like createFromPy4j(prefix: String, vertexInfo: VertexInfo, vertexDf: DataFrame, numVertices: Long): VertexInfo. This lets the Python side decide which one to call: if the Python user provides numVertices, call the factory method from the companion object; otherwise, call the constructor with jvm.scala.Option.empty().

Additional context
I can do it myself. I'm sure that no additional changes are required and that the whole Spark Scala API will stay the same.
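As a rough sketch of the two options described above, the snippet below puts two auxiliary constructors next to an Option[Long] primary constructor and adds a companion-object factory. So that it runs without Spark or GraphAr, VertexInfo and DataFrame are replaced by hypothetical stand-ins (VertexInfoStub, DataFrameStub), and the class is named VertexWriterSketch. Placing createFromPy4j on the writer's companion and returning a writer is an assumption made for this illustration; the issue text names the VertexInfo companion.

```scala
// Hypothetical stand-ins so the sketch runs without Spark or GraphAr.
final case class VertexInfoStub(label: String)
final case class DataFrameStub(rows: Seq[String]) {
  def count(): Long = rows.size.toLong
}

class VertexWriterSketch(
    prefix: String,
    vertexInfo: VertexInfoStub,
    vertexDf: DataFrameStub,
    numVertices: Option[Long]
) {
  // Proposed overload: a plain Long parameter, as requested in the issue.
  def this(prefix: String, vertexInfo: VertexInfoStub, vertexDf: DataFrameStub, numVertices: Long) =
    this(prefix, vertexInfo, vertexDf, Some(numVertices))

  // Proposed overload: no count given, so it is computed from the data.
  def this(prefix: String, vertexInfo: VertexInfoStub, vertexDf: DataFrameStub) =
    this(prefix, vertexInfo, vertexDf, None: Option[Long])

  // Use the supplied count if present, otherwise count the rows.
  val vertexNum: Long = numVertices.getOrElse(vertexDf.count())
}

object VertexWriterSketch {
  // Alternative from the issue: a factory method with no Option in its signature.
  def createFromPy4j(
      prefix: String,
      vertexInfo: VertexInfoStub,
      vertexDf: DataFrameStub,
      numVertices: Long
  ): VertexWriterSketch =
    new VertexWriterSketch(prefix, vertexInfo, vertexDf, Some(numVertices))
}

object SketchDemo {
  def main(args: Array[String]): Unit = {
    val info = VertexInfoStub("person")
    val df = DataFrameStub(Seq("a", "b", "c"))
    println(new VertexWriterSketch("/tmp/graph", info, df).vertexNum)
    println(new VertexWriterSketch("/tmp/graph", info, df, 100L).vertexNum)
    println(VertexWriterSketch.createFromPy4j("/tmp/graph", info, df, 100L).vertexNum)
  }
}
```

In both variants the Scala-facing Option[Long] constructor stays available, which matches the claim that the existing Scala API would not change.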
P.S. I understand that this is not the "Scala way", but in my experience every Spark library sooner or later faces the question of creating PySpark bindings. It looks OK to me to sacrifice some Scala code beauty to get better compatibility with PySpark...
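The cast failure described in the problem statement can be reproduced in plain Scala, without py4j. The sketch below is only an illustration of the mechanism as I understand it, not py4j's actual code path: because of type erasure, an Option holding a java.lang.Integer can be treated as Option[Long] without complaint, and the error appears only when the value is unboxed. ErasureDemo and incrementAsLong are hypothetical names used for this example.

```scala
import scala.util.Try

object ErasureDemo {
  // Treats an untyped Option as Option[Long] and uses the value as a primitive.
  def incrementAsLong(value: Option[Any]): Try[Long] = {
    // Erasure means this cast is not checked against the element type.
    val typed = value.asInstanceOf[Option[Long]]
    // The arithmetic forces unboxing to a primitive long, which is where
    // a boxed Integer is rejected with a ClassCastException.
    Try(typed.get + 1L)
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for what the issue says arrives from Python: Some(Integer).
    val fromInteger = incrementAsLong(Some(java.lang.Integer.valueOf(100)))
    // A correctly boxed java.lang.Long for comparison.
    val fromLong = incrementAsLong(Some(java.lang.Long.valueOf(100L)))

    println(s"Integer inside Some: $fromInteger")
    println(s"Long inside Some:    $fromLong")
  }
}
```

A plain Long parameter avoids this situation because the expected type is visible in the method signature instead of being hidden behind the erased type parameter of Option.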