actionml / universal-recommender

Highly configurable recommender based on PredictionIO and Mahout's Correlated Cross-Occurrence algorithm
http://actionml.com/universal-recommender
Apache License 2.0

UR is using elastic even when all the data sources are MYSQL #60

Open. yogeshExplore opened this issue 6 years ago

yogeshExplore commented 6 years ago

This is my pio-env.sh file:

#!/usr/bin/env bash

# BASIC

SPARK_HOME=/usr/lib/spark

MYSQL_JDBC_DRIVER=/usr/share/java/mysql-connector-java.jar

PIO_FS_BASEDIR=$_LINIO_HOME/.pio_store
PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp

PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=MYSQL

PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=MYSQL

PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=MYSQL

# Storage Data Sources

PIO_STORAGE_SOURCES_MYSQL_TYPE=jdbc
PIO_STORAGE_SOURCES_MYSQL_URL=jdbc:mysql://10.60../pio_models
PIO_STORAGE_SOURCES_MYSQL_USERNAME=
PIO_STORAGE_SOURCES_MYSQL_PASSWORD=

# VERSIONS

PIO_VERSION=0.12.1
PIO_SPARK_VERSION=2.3.1
PIO_ELASTICSEARCH_VERSION=5.6
HBASE_VERSION=1.4.6
PIO_HADOOP_VERSION=2.8.4
ZOOKEEPER=3.4.12
PYTHON_VERSION=3.6.3

ERROR WHILE RUNNING ./examples/integration-test

[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: com.actionml.DataSource@175ac243
[INFO] [Engine$] Preparator: com.actionml.Preparator@1073c664
[INFO] [Engine$] AlgorithmList: List(com.actionml.URAlgorithm@152e7703)
[INFO] [Engine$] Data sanity check is on.
[INFO] [DataSource] Received events List(purchase, view, category-pref)
[INFO] [Engine$] com.actionml.TrainingData does not support data sanity check. Skipping check.
[INFO] [Preparator] EventName: purchase
[INFO] [Preparator] Downsampled users for minEventsPerUser: Some(3), eventName: purchase number of passing user-ids: 3
[INFO] [Preparator] Dimensions rows : 4 columns: 7
[INFO] [Preparator] Downsampled columns for users who pass minEventPerUser: Some(3), eventName: purchase number of user-ids: 3
[INFO] [Preparator] Dimensions rows : 3 columns: 6
[INFO] [Preparator] EventName: view
[INFO] [Preparator] Dimensions rows : 3 columns: 4
[INFO] [Preparator] Number of user-ids after creation: 3
[INFO] [Preparator] EventName: category-pref
[INFO] [Preparator] Dimensions rows : 3 columns: 2
[INFO] [Preparator] Number of user-ids after creation: 3
[INFO] [Engine$] com.actionml.PreparedData does not support data sanity check. Skipping check.
[INFO] [URAlgorithm] Actions read now creating correlators
[INFO] [PopModel] PopModel popular using end: 2018-09-18T18:54:23.466Z, and duration: 315360000, interval: 2008-09-20T18:54:23.466Z/2018-09-18T18:54:23.466Z
[INFO] [PopModel] PopModel getting eventsRDD for startTime: 2008-09-20T18:54:23.466Z and endTime 2018-09-18T18:54:23.466Z
[INFO] [URAlgorithm] Correlators created now putting into URModel
[INFO] [URAlgorithm] Index mappings for the Elasticsearch URModel: Map(expires -> (date,false), date -> (date,false), category-pref -> (keyword,true), available -> (date,false), purchase -> (keyword,true), popRank -> (float,false), view -> (keyword,true))
[INFO] [URModel] Converting cooccurrence matrices into correlators
[INFO] [URModel] Group all properties RDD
[INFO] [URModel] ES fields[11]: List(categories, countries, date, id, expires, category-pref, available, purchase, popRank, defaultRank, view)
[INFO] [EsClient$] Create new index: urindex_1537296874822, items, List(categories, countries, date, id, expires, category-pref, available, purchase, popRank, defaultRank, view), Map(expires -> (date,false), date -> (date,false), category-pref -> (keyword,true), available -> (date,false), purchase -> (keyword,true), popRank -> (float,false), view -> (keyword,true))
[INFO] [AbstractConnector] Stopped Spark@20177486{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
Exception in thread "main" java.lang.IllegalStateException: No Elasticsearch client configuration detected, check your pio-env.sh for proper configuration settings
    at com.actionml.EsClient$$anonfun$client$2.apply(EsClient.scala:86)
    at com.actionml.EsClient$$anonfun$client$2.apply(EsClient.scala:86)
    at scala.Option.getOrElse(Option.scala:121)
    at com.actionml.EsClient$.client$lzycompute(EsClient.scala:85)
    at com.actionml.EsClient$.client(EsClient.scala:85)
    at com.actionml.EsClient$.createIndex(EsClient.scala:174)
    at com.actionml.EsClient$.hotSwap(EsClient.scala:271)
    at com.actionml.URModel.save(URModel.scala:82)
    at com.actionml.URAlgorithm.calcAll(URAlgorithm.scala:367)
    at com.actionml.URAlgorithm.train(URAlgorithm.scala:295)
    at com.actionml.URAlgorithm.train(URAlgorithm.scala:180)
    at org.apache.predictionio.controller.P2LAlgorithm.trainBase(P2LAlgorithm.scala:49)
    at org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:690)
    at org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:690)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.immutable.List.map(List.scala:285)
    at org.apache.predictionio.controller.Engine$.train(Engine.scala:690)
    at org.apache.predictionio.controller.Engine.train(Engine.scala:176)
    at org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
    at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:251)
    at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
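The stack trace shows training failing inside com.actionml.EsClient while it tries to create the model index: the UR writes its trained model to an Elasticsearch index, so EsClient looks for an ELASTICSEARCH storage source in pio-env.sh even when the metadata, event and model repositories point at MYSQL. Below is a minimal sketch of such a source block, assuming a single Elasticsearch 5.x node on localhost:9200; the host, port and ELASTICSEARCH_HOME path are placeholders to adapt, not values from this setup.

```bash
# Sketch only: declare an Elasticsearch storage source so EsClient can find a
# client configuration. Host, port and home directory are assumptions.
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/usr/local/elasticsearch
```

With PIO 0.12.x and Elasticsearch 5.x the client talks to the REST port (9200); older Elasticsearch 1.x setups used the transport port 9300 instead.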

musicformellons commented 6 years ago

On the actionml webpage, Elasticsearch is mentioned as a requirement. I also quote from the actionml-user mailing list, from an answer to a similar question:

Again I caution you that Elasticsearch is required for the Universal Recommender because of a special type of query that no other type of DB supports, this query is part of the math that defines the algorithm for the UR. So if you use Cassandra with the UR you will only be able to replace HBase in the tech stack.
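For context, the "special type of query" referred to above is, roughly, a multi-field OR of terms clauses over the indicator fields that training wrote to the index (purchase, view, category-pref in the mappings logged earlier), with Elasticsearch combining the per-field relevance scores. The following is a purely illustrative sketch against the index name from the log; the item ids, field values and result size are made up:

```bash
# Illustrative only: the general shape of the query the UR sends to Elasticsearch,
# not the exact body it generates. Index name comes from the log; values are examples.
curl -s -X POST 'http://localhost:9200/urindex_1537296874822/_search' \
  -H 'Content-Type: application/json' \
  -d '{
  "size": 10,
  "query": {
    "bool": {
      "should": [
        { "terms": { "purchase":      ["iPhone", "iPad"] } },
        { "terms": { "view":          ["iPhone", "Galaxy"] } },
        { "terms": { "category-pref": ["electronics"] } }
      ]
    }
  }
}'
```

Because ranking here depends on scoring and combining matches across several indicator fields at once, a plain SQL store cannot serve this step, which is why Elasticsearch stays in the stack even when MySQL holds the event and metadata.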