shenweichen / DeepCTR-Torch

【PyTorch】Easy-to-use,Modular and Extendible package of deep-learning based CTR models.
https://deepctr-torch.readthedocs.io/en/latest/index.html
Apache License 2.0

Issue #201: deepctr_torch model requires at least 1 sparse feature

Closed: johanneskruse closed this issue 2 years ago

johanneskruse commented 3 years ago

The issue occurs when there are no sparse features in the input, e.g.:

import pandas as pd 
from deepctr_torch.models import NFM
from deepctr_torch.inputs import SparseFeat, DenseFeat

====================================================================
WITH SPARSE FEATURES: 

data = {"sparse1" : [1,2], "sparse2" : [2,3], "dense1" : [0.2,0.5], "dense2" : [0.1, 3.2]}
df = pd.DataFrame(data)

sparse_embeddings = [SparseFeat(feat, vocabulary_size=3, embedding_dim=5) for feat in ["sparse1", "sparse2"]]  # most models need a fixed embedding size
dense_embeddings  = [DenseFeat(feat, 1,) for feat in ["dense1", "dense2"]]

wide_input = sparse_embeddings + dense_embeddings
deep_input = sparse_embeddings + dense_embeddings

NFM(linear_feature_columns=wide_input, dnn_feature_columns=deep_input)

NFM(
  (embedding_dict): ModuleDict(
    (sparse1): Embedding(3, 5)
    (sparse2): Embedding(3, 5)
  )
  (linear_model): Linear(
    (embedding_dict): ModuleDict(
      (sparse1): Embedding(3, 1)
      (sparse2): Embedding(3, 1)
    )
  )
  (out): PredictionLayer()
  (dnn): DNN(
    (dropout): Dropout(p=0, inplace=False)
    (linears): ModuleList(
      (0): Linear(in_features=7, out_features=128, bias=True)
      (1): Linear(in_features=128, out_features=128, bias=True)
    )
    (activation_layers): ModuleList(
      (0): ReLU(inplace=True)
      (1): ReLU(inplace=True)
    )
  )
  (dnn_linear): Linear(in_features=128, out_features=1, bias=False)
  (bi_pooling): BiInteractionPooling()
)

====================================================================

WITHOUT SPARSE FEATURES:

sparse_embeddings = [SparseFeat(feat, vocabulary_size=3, embedding_dim=5) for feat in []]  # no sparse features this time
dense_embeddings  = [DenseFeat(feat, 1,) for feat in ["dense1", "dense2"]]

wide_input = sparse_embeddings + dense_embeddings
deep_input = sparse_embeddings + dense_embeddings

NFM(linear_feature_columns=wide_input, dnn_feature_columns=deep_input)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/zhome/63/4/108196/miniconda3/envs/rec/lib/python3.7/site-packages/deepctr_torch/models/nfm.py", line 45, in __init__
    self.dnn = DNN(self.compute_input_dim(dnn_feature_columns, include_sparse=False) + self.embedding_size,
  File "/zhome/63/4/108196/miniconda3/envs/rec/lib/python3.7/site-packages/deepctr_torch/models/basemodel.py", line 511, in embedding_size
    return list(embedding_size_set)[0]
IndexError: list index out of range

====================================================================

I have looked into modifying the "return list(embedding_size_set)[0]" in line 511 to handle the case of no sparse features; however, if sparse_features = [], then embedding_size_set is just an empty set().

I am not sure what the consequences would be, or how forcing the model to continue would affect it later on. The model does work without dense features:

====================================================================

sparse_embeddings = [SparseFeat(feat, vocabulary_size=3, embedding_dim=5) for feat in ["sparse1", "sparse2"]]
dense_embeddings  = [DenseFeat(feat, 1,) for feat in []]

wide_input = sparse_embeddings + dense_embeddings
deep_input = sparse_embeddings + dense_embeddings

NFM(linear_feature_columns=wide_input, dnn_feature_columns=deep_input)

NFM(
  (embedding_dict): ModuleDict(
    (sparse1): Embedding(3, 5)
    (sparse2): Embedding(3, 5)
  )
  (linear_model): Linear(
    (embedding_dict): ModuleDict(
      (sparse1): Embedding(3, 1)
      (sparse2): Embedding(3, 1)
    )
  )
  (out): PredictionLayer()
  (dnn): DNN(
    (dropout): Dropout(p=0, inplace=False)
    (linears): ModuleList(
      (0): Linear(in_features=5, out_features=128, bias=True)
      (1): Linear(in_features=128, out_features=128, bias=True)
    )
    (activation_layers): ModuleList(
      (0): ReLU(inplace=True)
      (1): ReLU(inplace=True)
    )
  )
  (dnn_linear): Linear(in_features=128, out_features=1, bias=False)
  (bi_pooling): BiInteractionPooling()
)

====================================================================

Alternatively, one can generate a dummy column (of ones) for the model; however, this is a hack and I don't think it is a good solution. Is it possible to provide a proper solution to the "must-have-sparse-feature" problem?
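For illustration, that dummy-column hack could look roughly like this (a sketch using the imports from the first snippet; the feature name "dummy" and its sizes are placeholders):

data = {"dense1": [0.2, 0.5], "dense2": [0.1, 3.2], "dummy": [1, 1]}  # constant "ones" column
df = pd.DataFrame(data)

sparse_embeddings = [SparseFeat("dummy", vocabulary_size=2, embedding_dim=5)]  # dummy sparse feature
dense_embeddings  = [DenseFeat(feat, 1) for feat in ["dense1", "dense2"]]

wide_input = sparse_embeddings + dense_embeddings
deep_input = sparse_embeddings + dense_embeddings

NFM(linear_feature_columns=wide_input, dnn_feature_columns=deep_input)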

Thank you for your time and awesome work!!

Best regards, Johannes

zanshuxun commented 3 years ago

Hi Johannes, if you don't have a sparse feature, why do you use NFM?

johanneskruse commented 3 years ago

Hi,

I was experimenting with the impact of the feature types (sparse/dense) on performance. The issue is part of the base model, and thus affects all models that rely on it, e.g. NFM, DeepFM, DCN, etc.

To my understanding, the models mentioned do not require feature engineering, i.e. they share the same inputs "linear_feature_columns" and "dnn_feature_columns". Thus, shouldn't the models be able to run without sparse features, just like they can without dense features?

Furthermore, to avoid the issue, I tried to insert:

        if embedding_size_set == set():
            return 0
        else:
            return list(embedding_size_set)[0]

This then creates the following issue:

linear_logit = torch.zeros([X.shape[0], 1]).to(sparse_embedding_list[0].device)
IndexError: list index out of range

However, looking at the code in basemodel.py:

        linear_logit = torch.zeros([X.shape[0], 1]).to(sparse_embedding_list[0].device)
        if len(sparse_embedding_list) > 0:
            sparse_embedding_cat = torch.cat(sparse_embedding_list, dim=-1)
            if sparse_feat_refine_weight is not None:
                # w_{x,i}=m_{x,i} * w_i (in IFM and DIFM)
                sparse_embedding_cat = sparse_embedding_cat * sparse_feat_refine_weight.unsqueeze(1)
            sparse_feat_logit = torch.sum(sparse_embedding_cat, dim=-1, keepdim=False)
            linear_logit += sparse_feat_logit
        if len(dense_value_list) > 0:
            dense_value_logit = torch.cat(
                dense_value_list, dim=-1).matmul(self.weight)
            linear_logit += dense_value_logit

It seems that the case of no sparse features is intended to be handled, since the check

if len(sparse_embedding_list) > 0

is allowed to be False.
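
For illustration, one way the first line could avoid indexing a possibly empty list would be to take the device from X instead (just a sketch of my assumption, not the library's current code):

        # sketch: derive the device from the input tensor X rather than from
        # sparse_embedding_list[0], so the dense-only case does not crash
        linear_logit = torch.zeros([X.shape[0], 1], device=X.device)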

I hope my questions are clear.

zanshuxun commented 3 years ago

> Thus, shouldn't the models be able to run without sparse features, just like they can without dense features?

Take DeepFM and NFM for example: if you don't have any sparse features, how do you derive the embedding vectors that are used in the FM part?
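
For reference, in the standard FM formulation the second-order term is roughly

    \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j

and in DeepCTR-Torch the embedding vectors \mathbf{v}_i come from the SparseFeat columns (the embedding_dict shown in the printouts above), so with no sparse features the FM / Bi-Interaction part has nothing to operate on.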