Open michaltrmac opened 7 years ago
Hi,
Can you tell us the command you are using to train the model?
Hi,
I edited part of the seldon.conf file to look like this:
"cluster-by-dimension": {
"config": {
"inputPath": "%SELDON_MODELS%",
"outputPath": "%SELDON_MODELS%",
"activate":true,
"startDay" : 1,
"days" : 1,
"activate" : true,
"jdbc" : "jdbc:mysql://mysql:3306/client?user=root&characterEncoding=utf8",
"minActionsPerUser" : 0,
"delta" : 0.1,
"minClusterSize" : 200
},
"training": {
"job_info": {
"cmd": "%SPARK_HOME%/bin/spark-submit",
"cmd_args": [
"--class",
"io.seldon.spark.cluster.ClusterUsersByDimension",
"--master",
"spark://spark-master:7077",
"--driver-memory",
"8g",
"--executor-memory",
"8g",
"--total-executor-cores",
"12",
"%SELDON_SPARK_HOME%/seldon-spark-%SELDON_VERSION%-jar-with-dependencies.jar",
"--client",
"%CLIENT_NAME%",
"--zookeeper",
"%ZK_HOSTS%"
]
},
"job_type": "spark"
}
},
Then I run:
seldon-cli client --action processactions --client-name test2 --input-date-string 20161214
followed by:
seldon-cli model --action add --client-name test2 --model-name cluster-by-dimension --startDay 17149 --days 30
which produces this output:
connecting to zookeeper-1:2181,zookeeper-2:2181 [SUCCEEDED]
Model [cluster-by-dimension] already added
adding config startDay : 17149
adding config days : 30
Writing data to file[/seldon-data/conf/zkroot/all_clients/test2/offline/cluster-by-dimension/_data_]
updated zk node[/all_clients/test2/offline/cluster-by-dimension]
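As a side note, the --startDay value appears to be a day number counted from the Unix epoch: 17149 corresponds to 2016-12-14, which matches the --input-date-string above. A quick sanity check (GNU date assumed):

```shell
# Day number since the Unix epoch for 2016-12-14.
# date -ud gives seconds since epoch at UTC midnight; divide by 86400 seconds/day.
echo $(( $(date -ud '2016-12-14' +%s) / 86400 ))
# prints 17149
```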
and finally I run:
seldon-cli model --action train --client-name test2 --model-name cluster-by-dimension
Part of train output:
log4j:WARN No appenders could be found for logger (org.apache.curator.retry.ExponentialBackoffRetry).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Confguration from zookeeper -> {"activate":true,"days":30,"delta":0.1,"inputPath":"/seldon-data/seldon-models","jdbc":"jdbc:mysql://mysql:3306/client?user=root&password=mypass&characterEncoding=utf8","minActionsPerUser":0,"minClusterSize":200,"outputPath":"/seldon-data$
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/12/15 09:54:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/12/15 09:54:45 INFO Slf4jLogger: Slf4jLogger started
16/12/15 09:54:45 INFO Remoting: Starting remoting
16/12/15 09:54:45 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.37.186:37775]
16/12/15 09:54:46 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
ClusterConfig(test2,/seldon-data/seldon-models,/seldon-data/seldon-models,17149,30,,,false,zookeeper-1:2181,zookeeper-2:2181,true,jdbc:mysql://mysql:3306/client?user=root&password=mypass&characterEncoding=utf8,0,0.1,200)
16/12/15 09:54:50 WARN ThrowableSerializationWrapper: Task exception could not be deserialized
java.lang.ClassNotFoundException: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:278)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1779)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:167)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1907)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1806)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2016)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1940)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1806)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2016)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1940)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1806)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
16/12/15 09:54:50 ERROR TaskResultGetter: Could not deserialize TaskEndReason: ClassNotFound with classloader org.apache.spark.util.MutableURLClassLoader@51bc1897
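Reading the trace: the task itself failed on an executor with a MySQLSyntaxErrorException, and the driver then cannot even deserialize that exception class, which suggests the MySQL Connector/J jar is missing from the driver's classpath (and the real SQL error is being hidden as a result). One hedged starting point is to pass the connector jar explicitly via --jars in the cmd_args; the jar path below is an assumption, so adjust it to wherever Connector/J actually lives on your nodes:

```shell
# Sketch only: same spark-submit invocation as in the config above,
# with --jars added so Connector/J is shipped to driver and executors.
# /path/to/mysql-connector-java.jar is a placeholder, not a real path.
%SPARK_HOME%/bin/spark-submit \
  --class io.seldon.spark.cluster.ClusterUsersByDimension \
  --master spark://spark-master:7077 \
  --jars /path/to/mysql-connector-java.jar \
  --driver-memory 8g \
  --executor-memory 8g \
  --total-executor-cores 12 \
  %SELDON_SPARK_HOME%/seldon-spark-%SELDON_VERSION%-jar-with-dependencies.jar \
  --client %CLIENT_NAME% \
  --zookeeper %ZK_HOSTS%
```

Once the driver can deserialize the exception, the underlying MySQLSyntaxErrorException message should surface and point at the actual query or schema problem.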
Maybe I'm missing some settings, but the Seldon docs are pretty confusing and there is not much about the "most popular" recommender.
I also found the file https://github.com/SeldonIO/seldon-server/blob/d1ec05a6f59b152eca438f1e67e2dc73a3879483/offline-jobs/spark/src/main/scala/io/seldon/spark/recommend/MostPopularJob.scala but I don't know how to use it.
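For what it's worth, that job could presumably be wired up the same way as cluster-by-dimension, by pointing job_info at the MostPopularJob class. This is only a guess from the linked source file: the class name comes from its package path, the model name "most-popular" is hypothetical, and the remaining keys are copied from the config above:

```json
"most-popular": {
  "training": {
    "job_type": "spark",
    "job_info": {
      "cmd": "%SPARK_HOME%/bin/spark-submit",
      "cmd_args": [
        "--class", "io.seldon.spark.recommend.MostPopularJob",
        "--master", "spark://spark-master:7077",
        "%SELDON_SPARK_HOME%/seldon-spark-%SELDON_VERSION%-jar-with-dependencies.jar",
        "--client", "%CLIENT_NAME%",
        "--zookeeper", "%ZK_HOSTS%"
      ]
    }
  }
}
```

Whatever "config" keys that job expects would still need to be confirmed against the Scala source.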
Thanks m.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Hi,
how can I set up seldon-server to recommend the most popular items? I tried to set this up with the "cluster-by-dimension" model, but it always fails during the model train action with java.lang.ClassNotFoundException: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException.
Thanks mt.