guoguibing / librec

LibRec: A Leading Java Library for Recommender Systems, see
https://www.librec.net/
Other
3.24k stars 1.03k forks

[+] Implement JDBC read data module, including JDBCDataModel JDBCData… #333

Closed AntiTopQuark closed 4 years ago

AntiTopQuark commented 4 years ago

The main work:

  1. Fix the delimiter problem in librec/core/src/main/java/net/librec/data/convertor/TextDataConvertor.java. The default delimiter for CSV files is a comma, but one constructor sets it to a space:

https://github.com/guoguibing/librec/blob/84fc31b4abd5597112cf904235df5d7816438c64/core/src/main/java/net/librec/data/convertor/TextDataConvertor.java#L101-L102
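
To illustrate why the delimiter matters, here is a minimal sketch (not the LibRec code itself) that splits one CSV line with a comma and with a space. The class `DelimiterSketch`, its `split` helper, and the sample line are hypothetical and exist only for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class; it is not part of LibRec.
class DelimiterSketch {

    // Split a line on a literal delimiter (quoted so regex metacharacters are not interpreted).
    static List<String> split(String line, String delimiter) {
        return Arrays.asList(line.split(Pattern.quote(delimiter)));
    }

    public static void main(String[] args) {
        String line = "1,Toy Story,5"; // user, item, rating

        // Comma: three fields, as intended for CSV.
        System.out.println(split(line, ",").size());

        // Space: the line breaks inside the item field and yields only two pieces.
        System.out.println(split(line, " ").size());
    }
}
```

With a space as the delimiter, a CSV line is either left unsplit or split in the wrong place, so a converter that expects three or four columns would reject or misread it.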

  2. Implement JDBCDataModel and JDBCDataConvertor, together with a simple test. The database password has been redacted. The table creation statement used for the test follows:
SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

DROP TABLE IF EXISTS `librec_jdbc_test`;
CREATE EXTERNAL TABLE `librec_jdbc_test` (
  `usercol` string,
  `itemcol` string,
  `ratingcol` int,
  `datacol` string
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

INSERT INTO `librec_jdbc_test`(`usercol`, `itemcol`, `ratingcol`, `datacol`) VALUES ('1', '1', 1, '1');
INSERT INTO `librec_jdbc_test`(`usercol`, `itemcol`, `ratingcol`, `datacol`) VALUES ('adas', 'cv', 2, '2');
INSERT INTO `librec_jdbc_test`(`usercol`, `itemcol`, `ratingcol`, `datacol`) VALUES ('asda', '1', 3, '3');
INSERT INTO `librec_jdbc_test`(`usercol`, `itemcol`, `ratingcol`, `datacol`) VALUES ('2', '2', 4, '4');
INSERT INTO `librec_jdbc_test`(`usercol`, `itemcol`, `ratingcol`, `datacol`) VALUES ('3', '3', 3, '5');
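
For readers unfamiliar with the idea, the following is a rough sketch of how rows from a table like the one above could be read over JDBC and turned into (user, item, rating) triples with dense inner indices. It is not the JDBCDataConvertor from this pull request. The class `JdbcRatingReaderSketch` and its methods are hypothetical, and the sketch assumes the caller supplies an open `java.sql.Connection` and trusted table and column names.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and structure are assumptions, not LibRec's API.
class JdbcRatingReaderSketch {

    // One rating, with raw string IDs already mapped to inner indices.
    static final class Rating {
        final int user;
        final int item;
        final double value;

        Rating(int user, int item, double value) {
            this.user = user;
            this.item = item;
            this.value = value;
        }
    }

    final Map<String, Integer> userIds = new LinkedHashMap<>();
    final Map<String, Integer> itemIds = new LinkedHashMap<>();
    final List<Rating> ratings = new ArrayList<>();

    // Identifiers cannot be bound as JDBC parameters, so they must come from trusted configuration.
    static String buildQuery(String table, String userCol, String itemCol, String ratingCol) {
        return "SELECT " + userCol + ", " + itemCol + ", " + ratingCol + " FROM " + table;
    }

    // Return the inner index of a raw ID, assigning the next free index on first sight.
    private static int indexOf(Map<String, Integer> ids, String raw) {
        Integer index = ids.get(raw);
        if (index == null) {
            index = ids.size();
            ids.put(raw, index);
        }
        return index;
    }

    // Record one row.
    void add(String user, String item, double rating) {
        ratings.add(new Rating(indexOf(userIds, user), indexOf(itemIds, item), rating));
    }

    // Stream every row of the query result into the in-memory structures.
    void load(Connection conn, String query) throws SQLException {
        try (PreparedStatement st = conn.prepareStatement(query);
             ResultSet rs = st.executeQuery()) {
            while (rs.next()) {
                add(rs.getString(1), rs.getString(2), rs.getDouble(3));
            }
        }
    }

    public static void main(String[] args) {
        // No database is contacted here; the five sample rows are replayed by hand.
        JdbcRatingReaderSketch reader = new JdbcRatingReaderSketch();
        reader.add("1", "1", 1);
        reader.add("adas", "cv", 2);
        reader.add("asda", "1", 3);
        reader.add("2", "2", 4);
        reader.add("3", "3", 3);
        System.out.println(buildQuery("librec_jdbc_test", "usercol", "itemcol", "ratingcol"));
        System.out.println(reader.userIds.size() + " users, " + reader.itemIds.size() + " items");
    }
}
```

Against a real database, the caller would obtain a connection with `DriverManager.getConnection` and pass it to `load`. The JDBC URL, credentials, and driver are deployment-specific and deliberately left out.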