Open hhbyyh opened 7 years ago
Hi @yiheng, the complaint actually comes from the customer Gigaspace. I guess that when they could not find the prediction part of the text classification example, they tried to implement something themselves and ran into problems.
IMO, we could either link the udfPredictor example to the text classification example, or implement an RDD-based prediction part for text classification.
I see. There's a UDF example in BigDL-Tutorials. See this PR: https://github.com/intel-analytics/BigDL-Tutorials/pull/20. But I feel it's too complex. Does it meet your requirement?
@yangw1234 Can we make the UDF example simpler, for example as a Scala notebook?
@yiheng I'll try and do that.
Great, we're trying to add a Scala notebook (the backend is Toree) to BigDL-Tutorials. It will be much friendlier for users to learn how to use BigDL through a notebook. One thing: I'm not sure how to handle Java package dependencies in a Toree-based Scala notebook. Is there something like pip for the JVM?
Or we could use a Maven repository to supply the Java classpath.
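For reference, Toree has cell magics that cover this case: `%AddDeps` resolves an artifact from a Maven repository and adds it to the notebook's classpath, and `%AddJar` adds a single jar by URL. A minimal sketch of such a notebook cell follows. The BigDL coordinates are an assumption and should be checked against the release actually in use, and `<version>` is a placeholder that is deliberately left unfilled.

```scala
// Toree notebook cell (not plain Scala): pull a dependency from Maven.
// Group and artifact IDs below are assumed; verify them for your BigDL build.
%AddDeps com.intel.analytics.bigdl bigdl-SPARK_2.1 <version> --transitive
```

The `--transitive` flag asks Toree to also fetch the artifact's own dependencies, which is roughly the closest JVM analogue to what pip does.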
@hhbyyh Does this meet your requirement? Basically, it trains a text classifier model and uses it in a DataFrame query.
https://github.com/intel-analytics/BigDL/blob/master/spark/dl/src/main/scala/com/intel/analytics/bigdl/example/udfpredictor/DataframePredictor.scala
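The core idea of using a classifier inside a DataFrame query can be sketched without any BigDL code. This is only an illustration under assumptions, not the linked example: it assumes Spark 2.x or later (`SparkSession`), and `classify`, `textClassifier`, and the `docs` view are hypothetical names, with `classify` standing in for a real trained model's predict call.

```scala
import org.apache.spark.sql.SparkSession

object UdfPredictSketch {
  // Hypothetical stand-in for a trained text classifier:
  // returns label 1 if the text mentions "spark", otherwise 0.
  def classify(text: String): Int =
    if (text.toLowerCase.contains("spark")) 1 else 0

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("udf-predict-sketch")
      .getOrCreate()
    import spark.implicits._

    // Register the prediction function so SQL queries can call it by name.
    spark.udf.register("textClassifier", classify _)

    // A tiny in-memory DataFrame exposed as a temporary view.
    val df = Seq(
      (1, "Spark makes big data simple"),
      (2, "The weather is nice today")
    ).toDF("id", "text")
    df.createOrReplaceTempView("docs")

    // Apply the classifier row by row inside an ordinary SQL query.
    spark.sql("SELECT id, textClassifier(text) AS label FROM docs").show()

    spark.stop()
  }
}
```

In a real setup the body of `classify` would tokenize the text, convert it to the model's input tensor, and run the trained model, which is where most of the complexity in the full example comes from.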