Closed razou closed 5 years ago
Is there a full stack trace?
There might be an issue with the Spark version. We have tested it with Spark 2.3.2.
Yes, that is the full stack trace. Sorry for the mistake: I'm using Spark 2.3.2. Maybe it's something related to Zeppelin. I'll continue to test and investigate.
I would need more information to help you. Try accessing the TextTokenizer object directly, as follows:
com.salesforce.op.stages.impl.feature.TextTokenizer.tokenize(Text("hello world"))
Thanks @tovbinm. It works outside of Zeppelin.
Is there a mailing list where we can ask these kinds of questions instead of opening an issue ticket?
Hey @tovbinm, have you tested the following with TransmogrifAI 0.5.2?
Seq(A, B, C).transmogrify()
where A, ..., C are of type com.salesforce.op.features.Feature
at com.salesforce.op.utils.text.LuceneTextAnalyzer$.<init>(LuceneTextAnalyzer.scala:133)
at com.salesforce.op.utils.text.LuceneTextAnalyzer$.<clinit>(LuceneTextAnalyzer.scala)
at com.salesforce.op.stages.impl.feature.TextTokenizer$.<init>(TextTokenizer.scala:126)
at com.salesforce.op.stages.impl.feature.TextTokenizer$.<clinit>(TextTokenizer.scala)
at com.salesforce.op.stages.impl.feature.TransmogrifierDefaults$class.$init$(Transmogrifier.scala:85)
at com.salesforce.op.stages.impl.feature.TransmogrifierDefaults$.<init>(Transmogrifier.scala:90)
at com.salesforce.op.stages.impl.feature.TransmogrifierDefaults$.<clinit>(Transmogrifier.scala)
at com.salesforce.op.dsl.RichFeaturesCollection$RichAnyFeaturesCollection.transmogrify(RichFeaturesCollection.scala:70)
... 60 elided
NB: It works when I use 0.5.1. Thanks.
Works just fine for me from spark-shell:
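The trace above ends in `... 60 elided` and shows only static-initializer frames (`<clinit>`), so the underlying cause is not visible. Below is a minimal sketch for printing the whole cause chain from a notebook or REPL. `TraceDiagnostics`, `causeChain`, `fullTrace`, and `diagnose` are hypothetical helper names for illustration, not part of TransmogrifAI or Spark.

```scala
import java.io.{PrintWriter, StringWriter}
import scala.annotation.tailrec

// Hypothetical helpers (not a TransmogrifAI or Spark API).
object TraceDiagnostics {

  /** Follows getCause links from the outermost throwable down to the root cause. */
  def causeChain(t: Throwable): List[Throwable] = {
    @tailrec
    def loop(cur: Throwable, acc: List[Throwable]): List[Throwable] =
      // Stop at the end of the chain, or if a cause refers back to an earlier throwable.
      if (cur == null || acc.exists(_ eq cur)) acc.reverse
      else loop(cur.getCause, cur :: acc)
    loop(t, Nil)
  }

  /** Renders the complete stack trace, including "Caused by" sections, as a string. */
  def fullTrace(t: Throwable): String = {
    val sw = new StringWriter()
    t.printStackTrace(new PrintWriter(sw, true))
    sw.toString
  }

  /**
   * Runs a block and returns either its result or the full trace of the failure.
   * Throwable is caught on purpose: static-initializer failures surface as
   * java.lang.Error subclasses, which scala.util.Try would not capture.
   */
  def diagnose[A](body: => A): Either[String, A] =
    try Right(body)
    catch { case t: Throwable => Left(fullTrace(t)) }
}
```

Assuming TransmogrifAI is on the classpath, wrapping the failing call, for example `TraceDiagnostics.diagnose { Seq(A, B, C).transmogrify() }`, should give a `Left` containing the unelided trace when initialization fails.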
$SPARK_HOME/bin/spark-shell --packages com.salesforce.transmogrifai:transmogrifai-core_2.11:0.5.2
...
Spark context available as 'sc' (master = local[*], app id = local-1555341642349).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.3
      /_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_202)
Type in expressions to have them evaluated.
Type :help for more information.
scala> import com.salesforce.op.features.types._
import com.salesforce.op.features.types._
scala> com.salesforce.op.stages.impl.feature.TextTokenizer.tokenize(Text("hello world"))
res0: com.salesforce.op.stages.impl.feature.TextTokenizer.TextTokenizerResult = TextTokenizerResult(Unknown,List(TextList(hello, world)))
I think my problem is Zeppelin. Thanks.
Has anybody experienced the following issue with the text tokenizer transformer?
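Since the same call succeeds in spark-shell but fails in Zeppelin, one thing worth comparing is which jar each class is loaded from in the two environments. Whether a classpath difference is actually the cause is not established in this thread, so treat the following as a sketch only. `ClasspathDiagnostics` and `originOf` are hypothetical names.

```scala
// Hypothetical helper (not a TransmogrifAI, Spark, or Zeppelin API).
object ClasspathDiagnostics {

  /**
   * Returns the location (usually a jar URL) a class was loaded from,
   * or None if the class cannot be loaded or the JVM reports no code source
   * (as is typical for JDK bootstrap classes).
   */
  def originOf(className: String): Option[String] =
    try {
      val loader = Option(Thread.currentThread.getContextClassLoader)
        .getOrElse(getClass.getClassLoader)
      // initialize = false: load the class without running its static initializer,
      // so a failing initializer does not break the lookup itself.
      val cls = Class.forName(className, false, loader)
      for {
        domain   <- Option(cls.getProtectionDomain)
        source   <- Option(domain.getCodeSource)
        location <- Option(source.getLocation)
      } yield location.toString
    } catch {
      case _: ClassNotFoundException | _: LinkageError => None
    }
}
```

Running `ClasspathDiagnostics.originOf("com.salesforce.op.utils.text.LuceneTextAnalyzer")` in both spark-shell and a Zeppelin paragraph, and comparing the reported locations, would show whether the two environments resolve the class from the same artifact.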
Environment: Spark 2.3.3 with Zeppelin 0.8.0 on AWS EMR.
Edit: I'm using Spark 2.3.2 instead of 2.3.3, and com.salesforce.transmogrifai:transmogrifai-core_2.11:0.5.1.
The error I'm getting
Thanks