nzc / dnn_ctr

A framework for the CTR-prediction problem. The project contains FNN, PNN, DeepFM, NFM, etc.
756 stars 285 forks

Has anyone tested on the full dataset? Can you reach the loss reported in the paper? #15

Closed EnnengYang closed 3 years ago

EnnengYang commented 5 years ago

I processed the data following another approach, using the full Criteo dataset, but I cannot reach the accuracy reported in the paper. Has anyone run the full dataset?

ganyuqi commented 5 years ago

Hi, how did you process the dataset? Could you share it as a reference?

EnnengYang commented 5 years ago

I followed the Criteo dataset preprocessing from Baidu's PaddlePaddle deep learning framework (https://github.com/PaddlePaddle/models/blob/develop/PaddleRec/ctr/preprocess.py).
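For context, that preprocessing script builds a per-field dictionary for the categorical features, keeping only values seen at least a cutoff number of times and mapping everything rarer to a shared index. A simplified sketch of the idea (the function names and the exact cutoff are illustrative, not the script's real API):

```python
from collections import Counter

def build_category_dict(values, cutoff=200):
    """Map each category seen at least `cutoff` times to an index
    starting at 1; everything rarer shares the fallback index 0."""
    counts = Counter(v for v in values if v != '')
    vocab = sorted(v for v, c in counts.items() if c >= cutoff)
    return {v: i + 1 for i, v in enumerate(vocab)}

def encode(value, cat_dict):
    # Unseen or rare categories all map to the shared bucket 0.
    return cat_dict.get(value, 0)
```

The number of embeddings each model field needs is then `len(cat_dict) + 1`, which is why the vocabulary sizes depend on which data the dictionaries were built from.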

ganyuqi commented 5 years ago

What accuracy did you get? I only reach about 0.781.

EnnengYang commented 5 years ago

> What accuracy did you get? I only reach about 0.781.

Yes, mine is about the same; it will not reach 0.8.

cici-tan commented 4 years ago

@YEN-GitHub Hi, my data has 50 features; will that be a problem? The author seems to assume 39.

    The model is deepfm(fm+deep layers)
    Init fm part
    Init fm part succeed
    Init deep part
    Init deep part succeed
    Init succeed
    pre_process data ing...
    pre_process data finished
    [1, 100] loss: 16.579094 metric: 0.992473 time: 5.7 s
    [1, 200] loss: 0.261742 metric: 0.993243 time: 5.6 s
    [1] loss: 0.054187 metric: 0.998984 time: 16.1 s

Traceback (most recent call last):
  File "train.py", line 46, in <module>
    val_dict['index'], val_dict['value'], val_dict['label'], ealry_stopping=True, refit=True, save_path='./checkpoints/')
  File "/Users/cicitan/Documents/dataservice_recommendation/model/DeepFM.py", line 400, in fit
    valid_loss, valid_eval = self.eval_by_batch(Xi_valid, Xv_valid, y_valid, x_valid_size)
  File "/Users/cicitan/Documents/dataservice_recommendation/model/DeepFM.py", line 490, in eval_by_batch
    outputs = model(batch_xi, batch_xv)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/cicitan/Documents/dataservice_recommendation/model/DeepFM.py", line 204, in forward
    fm_first_order_emb_arr = [(torch.sum(emb(Xi[:,i,:]),1).t()*Xv[:,i]).t() for i, emb in enumerate(self.fm_first_order_embeddings)]
  File "/Users/cicitan/Documents/dataservice_recommendation/model/DeepFM.py", line 204, in <listcomp>
    fm_first_order_emb_arr = [(torch.sum(emb(Xi[:,i,:]),1).t()*Xv[:,i]).t() for i, emb in enumerate(self.fm_first_order_embeddings)]
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/sparse.py", line 118, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/functional.py", line 1454, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: index out of range at /Users/administrator/nightlies/pytorch-1.0.0/wheel_build_dirs/wheel_3.7/pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:191
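The `RuntimeError: index out of range` is raised by `torch.embedding` whenever an input index is greater than or equal to that field's `num_embeddings`. With 50 fields but feature sizes set up for 39, some indices inevitably overflow their embedding tables. A minimal pre-flight check (a hypothetical helper, pure Python) can locate the offending field before training:

```python
def check_embedding_indices(batch_xi, feature_sizes):
    """Return (field, index) pairs that would overflow their embedding
    table, i.e. violate 0 <= index < feature_sizes[field]. A non-empty
    result predicts exactly the "index out of range" error above."""
    bad = []
    for row in batch_xi:
        for field, idx in enumerate(row):
            if not 0 <= idx < feature_sizes[field]:
                bad.append((field, idx))
    return bad
```

Running this on a sample batch before calling `fit` makes the mismatch between the data's vocabulary and the model's `feature_sizes` visible immediately, instead of deep inside the forward pass.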

EnnengYang commented 4 years ago

> @YEN-GitHub Hi, my data has 50 features; will that be a problem? The author seems to assume 39. […]

Changing these lines in preprocess.py fixes it:

    # There are 13 integer features and 26 categorical features
    continous_features = range(1, 14)
    categorial_features = range(14, 40)

    # Clip integer features. The clip point for each integer feature
    # is derived from the 95% quantile of the total values in each feature
    continous_clip = [20, 600, 100, 50, 64000, 500, 100, 50, 500, 10, 10, 10, 50]
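To make the effect of those settings concrete, here is a rough sketch of how the clip list is applied to one row's integer features (the helper name is illustrative; the real script wraps this in its own generator class):

```python
# Clip points, roughly the 95th percentile of each of the 13 integer features.
continous_clip = [20, 600, 100, 50, 64000, 500, 100, 50, 500, 10, 10, 10, 50]

def clip_row(values):
    """Cap each integer feature at its clip point, treating missing
    values as 0; this bounds the range fed to the model."""
    out = []
    for v, cap in zip(values, continous_clip):
        x = int(v) if v not in ('', None) else 0
        out.append(min(x, cap))
    return out
```

With 50 features instead of 39, the two `range(...)` values and the length of `continous_clip` must all be adjusted to match your own split of continuous vs. categorical columns.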

cici-tan commented 4 years ago

@YEN-GitHub Thanks, I fixed it. But another problem is bothering me: after training a new model, loading it fails:

  File "train.py", line 138, in <module>
    deepfm.load_state_dict(torch.load(model_path))
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DeepFM:
    size mismatch for fm_first_order_embeddings.0.weight: copying a param with shape torch.Size([16354, 1]) from checkpoint, the shape in current model is torch.Size([15080, 1]).
    size mismatch for fm_first_order_embeddings.5.weight: copying a param with shape torch.Size([743, 1]) from checkpoint, the shape in current model is torch.Size([503, 1]).
    size mismatch for fm_first_order_embeddings.6.weight: copying a param with shape torch.Size([5, 1]) from checkpoint, the shape in current model is torch.Size([4, 1]).
    size mismatch for fm_first_order_embeddings.8.weight: copying a param with shape torch.Size([35, 1]) from checkpoint, the shape in current model is torch.Size([33, 1]).
    size mismatch for fm_first_order_embeddings.9.weight: copying a param with shape torch.Size([113, 1]) from checkpoint, the shape in current model is torch.Size([83, 1]).
    size mismatch for fm_second_order_embeddings.0.weight: copying a param with shape torch.Size([16354, 4]) from checkpoint, the shape in current model is torch.Size([15080, 4]).
    size mismatch for fm_second_order_embeddings.5.weight: copying a param with shape torch.Size([743, 4]) from checkpoint, the shape in current model is torch.Size([503, 4]).
    size mismatch for fm_second_order_embeddings.6.weight: copying a param with shape torch.Size([5, 4]) from checkpoint, the shape in current model is torch.Size([4, 4]).
    size mismatch for fm_second_order_embeddings.8.weight: copying a param with shape torch.Size([35, 4]) from checkpoint, the shape in current model is torch.Size([33, 4]).
    size mismatch for fm_second_order_embeddings.9.weight: copying a param with shape torch.Size([113, 4]) from checkpoint, the shape in current model is torch.Size([83, 4]).
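These size mismatches happen because the per-field vocabulary sizes (`feature_sizes`) were recomputed on different data than the checkpoint was trained with, so the embedding tables no longer line up. One way to catch this early is to persist the sizes next to the checkpoint and compare before calling `load_state_dict`; a sketch with hypothetical helper names:

```python
import json

def save_feature_sizes(feature_sizes, path):
    """Persist the per-field vocabulary sizes next to the checkpoint so
    the model can later be rebuilt with matching embedding shapes."""
    with open(path, 'w') as f:
        json.dump({'feature_sizes': feature_sizes}, f)

def check_compat(current_sizes, path):
    """Return (field, saved, current) triples where sizes differ; a
    non-empty result predicts exactly the size-mismatch errors above."""
    with open(path) as f:
        saved = json.load(f)['feature_sizes']
    return [(i, s, c)
            for i, (s, c) in enumerate(zip(saved, current_sizes))
            if s != c]
```

The robust fix is to build the preprocessing dictionaries once, save them, and reuse them for every later run, rather than regenerating them from whatever data happens to be on hand at load time.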

1980695671 commented 3 years ago

Does this load the entire dataset into memory? My machine's memory can't handle that.

EnnengYang commented 3 years ago

> Does this load the entire dataset into memory? My machine's memory can't handle that.

My experimental setup is a 32-core CPU and a GTX 1080 Ti (11 GB) GPU. With the embedding dimension set to 10, it runs fine in that environment.
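If the full Criteo file does not fit in memory, an alternative to loading it whole is to stream it in fixed-size batches and convert each batch to tensors just before the forward pass. A minimal sketch (the function name and tab-separated format are assumptions about the preprocessed file):

```python
def iter_batches(path, batch_size=4096):
    """Yield lists of parsed rows instead of materializing the whole
    file, keeping peak memory proportional to one batch."""
    batch = []
    with open(path) as f:
        for line in f:
            batch.append(line.rstrip('\n').split('\t'))
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:  # flush the final partial batch
        yield batch
```

Each yielded batch can then be split into index/value/label arrays and fed to the model in turn, so memory use no longer depends on the total dataset size.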