Open emilyjeng opened 1 year ago
@emilyjeng Hello, could you share the configuration file you used for this run?
The files are available here: https://drive.google.com/drive/folders/10-2Q8zW3FC_hylvKXorfohL8w0sw1Dm-?usp=sharing
The YAML settings are as follows:
field_separator: "\t"
seq_separator: " "
USER_ID_FIELD: user_id
ITEM_ID_FIELD: item_id
RATING_FIELD: rating
TIME_FIELD: timestamp
NEG_PREFIX: neg_
LABEL_FIELD: label
normalize_all: True  # normalization
threshold:
  rating: 4
load_col:
  inter: [user_id, item_id, rating]
  kg: [head_id, relation_id, tail_id]
  link: [item_id, entity_id]
val_interval:
  rating: "[4,inf)"
unused_col:
  inter: [rating]
user_inter_num_interval: "[10,inf)"
item_inter_num_interval: "[10,inf)"
embedding_size: 64
kg_embedding_size: 64  # (int) The embedding size of relations in knowledge graph.
reg_weights: [1e-2,1e-2]  # (list of float) The L2 regularization weights.
kg_reverse_r: True
entity_kg_num_interval: "[5,inf)"
relation_kg_num_interval: "[5,inf)"
epochs: 500
train_batch_size: 4096
eval_batch_size: 40960000
metrics: ['Recall', 'MRR', 'NDCG', 'Hit', 'Precision']
valid_metric: Hit@10
train_neg_sample_args:
  distribution: uniform
  sample_num: 1
  dynamic: False
Command: python run_recbole.py --model=CKE --dataset=yelp22_us10shop --config_files=test.yaml
@emilyjeng Hello, please try setting normalize_all to False.
@chenyuwuxin Thanks for the answer! After setting normalize_all to False, I get the following error:
02 Jan 08:41 INFO yelp22_us10shop
The number of users: 1
Average actions of users: nan
The number of items: 1
Average actions of items: nan
The number of inters: 0
The sparsity of the dataset: 100.0%
Remain Fields: ['entity_id', 'user_id', 'item_id', 'head_id', 'relation_id', 'tail_id', 'label']
The number of entities: 1
The number of relations: 2
The number of triples: 0
The number of items that have been linked to KG: 0
02 Jan 08:41 WARNING Field [rating] is not in [inter_feat], which can not be set in unused_col.
Traceback (most recent call last):
File "run_recbole.py", line 48, in user_inter_num_interval
to filter those users.,
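However, as the settings above show, I did set user_inter_num_interval. I also noticed that the numbers of users and items are far too small. Is my approach to linking the kg and link files in my dataset wrong, or is there some other problem? See the screenshot below: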
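@emilyjeng Hello, this happens because some user or item in the dataset has too few interactions, so everything it interacted with gets filtered out. You can try lowering user_inter_num_interval and item_inter_num_interval to resolve this.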
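The cascading effect described in this reply can be sketched on toy data. This is a simplified illustration with pandas, not RecBole's own filtering code; the function name `filter_by_min_count` and the toy IDs are assumptions made up for the example.

```python
import pandas as pd


def filter_by_min_count(inter: pd.DataFrame, min_user: int, min_item: int) -> pd.DataFrame:
    """Repeatedly drop rows whose user or item has too few interactions.

    Removing one sparse user can push an item below its threshold, so the
    filter is re-applied until nothing changes (or nothing is left).
    """
    while not inter.empty:
        # Per-row interaction counts of the row's user and item.
        user_counts = inter["user_id"].map(inter["user_id"].value_counts())
        item_counts = inter["item_id"].map(inter["item_id"].value_counts())
        keep = (user_counts >= min_user) & (item_counts >= min_item)
        if keep.all():
            break
        inter = inter[keep]
    return inter


# Toy interactions (hypothetical IDs).
inter = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2", "u2", "u3"],
    "item_id": ["i1", "i2", "i3", "i1", "i2", "i1"],
})

loose = filter_by_min_count(inter, min_user=2, min_item=2)
strict = filter_by_min_count(inter, min_user=10, min_item=10)
print(len(loose), len(strict))
```

With the loose thresholds four interactions survive; with thresholds of 10 the toy dataset is emptied entirely, which is the kind of outcome that produces "The number of inters: 0" style statistics.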
@chenyuwuxin Hello! I later found that the problem was the values of entity_kg_num_interval and relation_kg_num_interval. After I lowered them, a different error appeared:
Traceback (most recent call last):
File "run_recbole.py", line 48, in
Is it the case that entity_id cannot be a string? Here is the header of my link file: item_id:token entity_id:token
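Whether string IDs are the cause is not settled in this thread. One check that may help narrow it down is to see whether the entity_id values in the link file actually occur as head_id or tail_id values in the kg file. The sketch below does that with pandas on made-up data; `link_coverage` and all IDs are hypothetical, and real files would first need to be loaded (for example with `pd.read_csv(path, sep="\t")`) and their `:token` header suffixes stripped.

```python
import pandas as pd


def link_coverage(kg: pd.DataFrame, link: pd.DataFrame) -> dict:
    """Report which linked entity IDs never appear in the KG triples."""
    kg_entities = set(kg["head_id"]) | set(kg["tail_id"])
    linked = set(link["entity_id"])
    return {
        "linked_entities": len(linked),
        "found_in_kg": len(linked & kg_entities),
        "missing": sorted(linked - kg_entities),
    }


# Toy triples and links (hypothetical values).
kg = pd.DataFrame({
    "head_id": ["shop1", "shop2"],
    "relation_id": ["location.shop.location", "location.shop.location"],
    "tail_id": ["Cafe", "Bar"],
})
link = pd.DataFrame({
    "item_id": ["shop1", "shop2"],
    # The second entity is a multi-token string that matches no KG node.
    "entity_id": ["shop1", "Cafe Bar"],
})

report = link_coverage(kg, link)
print(report)
```

A non-empty "missing" list would suggest that some items are linked to entities the knowledge graph does not contain, so those links cannot contribute any triples.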
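Hello! I tried building a simple knowledge graph for Yelp. In the .kg file, I set head_id:token to item_id:token, relation_id:token to location.shop.location, and tail_id:token to categories:token_seq, as shown below:
I also added another relation_id:token
In the .link file, item_id:token is unchanged and entity_id:token is set to categories:token_seq, as shown below:
But when I run it, I get the following error: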
Traceback (most recent call last):
  File "run_recbole.py", line 48, in
    run_recbole(
  File "/Emily/RecBole-master/recbole/quick_start/quick_start.py", line 69, in run_recbole
    dataset = create_dataset(config)
  File "/Emily/RecBole-master/recbole/data/utils.py", line 70, in create_dataset
    dataset = dataset_class(config)
  File "/Emily/RecBole-master/recbole/data/dataset/kg_dataset.py", line 68, in __init__
    super().__init__(config)
  File "/Emily/RecBole-master/recbole/data/dataset/dataset.py", line 108, in __init__
    self._from_scratch()
  File "/Emily/RecBole-master/recbole/data/dataset/dataset.py", line 120, in _from_scratch
    self._data_processing()
  File "/Emily/RecBole-master/recbole/data/dataset/dataset.py", line 168, in _data_processing
    self._normalize()
  File "/Emily/RecBole-master/recbole/data/dataset/dataset.py", line 710, in _normalize
    feat[field] = norm(feat[field].values)
  File "/Emily/RecBole-master/recbole/data/dataset/dataset.py", line 698, in norm
    mx, mn = max(arr), min(arr)
ValueError: max() arg is an empty sequence
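I don't understand how to deal with this problem. Or is my approach to building the knowledge graph wrong? I can provide the data if you need to reproduce it.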
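The last frame of the traceback shows `max(arr)` being called on an empty array, which suggests that the feature column being normalized had no rows left by that point. As a rough illustration only (this is not RecBole's `norm` function; `min_max_norm` and its handling of empty and constant inputs are assumptions for this sketch), a min-max normalization that tolerates empty input could look like this:

```python
import numpy as np


def min_max_norm(values):
    """Scale values to [0, 1]; pass empty input through unchanged."""
    arr = np.asarray(values, dtype=float)
    if arr.size == 0:
        # Nothing to normalize; calling max()/min() here would raise ValueError.
        return arr
    mn, mx = arr.min(), arr.max()
    if mx == mn:
        # Constant column: avoid division by zero (illustrative choice).
        return np.zeros_like(arr)
    return (arr - mn) / (mx - mn)


empty_result = min_max_norm([])
scaled = min_max_norm([2, 4, 6])
constant = min_max_norm([5, 5])
print(empty_result, scaled, constant)
```

A guard like this only hides the symptom. If a table is empty before normalization, the filtering settings or the kg and link files are the more likely place to look.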