baldandbrave / RecSysCOEN6313

Recommendation system

Import dataset to db and set compute workflows #12

Open baldandbrave opened 6 years ago

baldandbrave commented 6 years ago

After #10, and once the Spark-MongoDB system is built, we need to import the CSV dataset into the database. @smcheraghi Please look into the docs for Spark, the Spark-MongoDB connector, and MongoDB. Spark has APIs in several programming languages, including Java and Python; choose the one you are most familiar with. @Dvangelion While @smcheraghi and I collaborate on optimizing the compute workflow in Spark, please check which machine learning tools you might use in Spark.

Useful links:

- Spark-MongoDB connector doc
- Spark doc
- MongoDB doc
- Spark ML libs doc
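For reference, the import step described above could look roughly like the PySpark sketch below. It assumes the MongoDB Spark connector 10.x, which registers the `mongodb` data source and reads `spark.mongodb.write.connection.uri`; the older 2.x/3.x connectors use the `mongo` format and `spark.mongodb.output.uri` instead. The function names, the CSV path, and the database and collection names are all hypothetical, and the connector package still has to be on the Spark classpath (for example via `spark-submit --packages`).

```python
def mongo_write_options(database, collection):
    """Build the writer options naming the target database and collection."""
    return {"database": database, "collection": collection}


def import_csv_to_mongo(csv_path, mongo_uri, database, collection):
    """Read a CSV file with Spark and append its rows to a MongoDB collection."""
    # Imported lazily so the helper above can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("import-dataset")
        # Assumption: connector 10.x configuration key.
        .config("spark.mongodb.write.connection.uri", mongo_uri)
        .getOrCreate()
    )

    # Treat the first row as a header and let Spark guess the column types.
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    (
        df.write.format("mongodb")  # assumption: connector 10.x source name
        .mode("append")
        .options(**mongo_write_options(database, collection))
        .save()
    )
    spark.stop()
```

A call such as `import_csv_to_mongo("reviews.csv", "mongodb://localhost:27017", "recsys", "reviews")` would then load the file in one pass; all four arguments here are placeholders.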

Dvangelion commented 6 years ago

The Amazon dataset is actually in JSON format. We also need to come up with other algorithms that run on Spark, since ATRank only supports TensorFlow.
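One Spark-native candidate is the ALS collaborative filtering estimator in Spark ML. It is a different technique from ATRank and is only one option, not something this thread has settled on. The sketch below assumes the reviews are strict JSON with one record per line and that each record has `reviewerID`, `asin`, and `overall` fields; the function names and column names are hypothetical.

```python
import json


def parse_review(line):
    """Turn one JSON-lines review record into a (user, item, rating) tuple."""
    record = json.loads(line)
    return record["reviewerID"], record["asin"], float(record["overall"])


def train_als(spark, json_path):
    """Fit an ALS model on the reviews found at json_path."""
    # Imported lazily so parse_review can be used without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import StringIndexer
    from pyspark.ml.recommendation import ALS

    rows = spark.sparkContext.textFile(json_path).map(parse_review)
    ratings = spark.createDataFrame(rows, ["user", "item", "rating"])

    # ALS needs numeric IDs, so index the string user and item IDs first.
    indexers = [
        StringIndexer(inputCol="user", outputCol="user_idx"),
        StringIndexer(inputCol="item", outputCol="item_idx"),
    ]
    als = ALS(
        userCol="user_idx",
        itemCol="item_idx",
        ratingCol="rating",
        coldStartStrategy="drop",  # skip users/items unseen during training
    )
    return Pipeline(stages=indexers + [als]).fit(ratings)
```

Given an existing `SparkSession`, `train_als(spark, "reviews.json")` returns a fitted pipeline whose `transform` method adds a `prediction` column to a DataFrame of user, item, and rating rows.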

baldandbrave commented 6 years ago

> The Amazon dataset is actually in JSON format. We also need to come up with other algorithms that run on Spark, since ATRank only supports TensorFlow.

I think this connector may help.