shenweichen / DeepCTR

Easy-to-use, Modular and Extendible package of deep-learning based CTR models.
https://deepctr-doc.readthedocs.io/en/latest/index.html
Apache License 2.0
7.57k stars 2.21k forks

can't concat when embedding_size is set to "auto" #46

Closed dev-wei closed 4 years ago

dev-wei commented 5 years ago

Describe the bug: When the embedding size is set to "auto", the Concatenate layer cannot merge the input embeddings, because their sizes differ on axis=2.

    def concat_fun(inputs, axis=-1):
        if len(inputs) == 1:
            return inputs[0]
        else:
            return Concatenate(axis=axis)(inputs)

To Reproduce: Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 1, 36), (None, 1, 30), (None, 1, 6), (None, 1, 12), (None, 1, 12), (None, 1, 30), (None, 1, 12)]

Operating environment:

Additional context Add any other context about the problem here.

shenweichen commented 5 years ago

Can you concatenate the embedding list with axis=-1?

dev-wei commented 5 years ago

@shenweichen thanks for the quick response.

Do you mean changing "fm_input = concat_fun(deep_emb_list, axis=1)" to "fm_input = concat_fun(deep_emb_list, axis=-1)"?
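The shape rule behind the error can be checked without building the model. The sketch below uses NumPy's `concatenate` as a stand-in for the Keras `Concatenate` layer (an assumption made to keep the example light; both require every axis other than the concat axis to match). The batch size of 2 is arbitrary, and the embedding sizes are copied from the error message in the report.

```python
import numpy as np

# Embedding sizes copied from the ValueError in the report.
EMBEDDING_SIZES = [36, 30, 6, 12, 12, 30, 12]
BATCH = 2  # arbitrary batch size, for illustration only

# One (batch, 1, size) array per sparse field, mirroring the reported shapes.
embeddings = [np.zeros((BATCH, 1, size)) for size in EMBEDDING_SIZES]


def try_concat(arrays, axis):
    """Return the concatenated shape, or None if the shapes are incompatible."""
    try:
        return np.concatenate(arrays, axis=axis).shape
    except ValueError:
        return None


# axis=1 stacks the fields, so the last axis would have to match; it does not.
shape_axis1 = try_concat(embeddings, axis=1)
# axis=-1 joins along the embedding axis, so differing sizes are accepted.
shape_last = try_concat(embeddings, axis=-1)

print(shape_axis1)
print(shape_last)
```

Concatenating on the last axis runs, but it produces one flat vector per sample rather than a (fields, size) matrix, so on its own it does not give a layer that expects one row per field what it needs.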

shenweichen commented 5 years ago

The auto embedding mode is not supported when you use an FM-based model.
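One way to see why: an FM-style second-order term sums element-wise products over pairs of field embeddings, which is only defined when every embedding has the same length. The following is a minimal NumPy sketch of the usual "square of sum minus sum of squares" identity. The function name `fm_second_order` is hypothetical, and this is not DeepCTR's own layer.

```python
import numpy as np


def fm_second_order(field_embeddings):
    """Sum of pairwise dot products <e_i, e_j> over i < j for one sample.

    field_embeddings: array of shape (num_fields, embedding_size).
    All fields must share embedding_size, otherwise they cannot be
    stacked into a single 2-D array in the first place.
    """
    e = np.asarray(field_embeddings, dtype=float)
    if e.ndim != 2:
        raise ValueError("expected a (num_fields, embedding_size) array")
    square_of_sum = np.sum(e, axis=0) ** 2
    sum_of_square = np.sum(e ** 2, axis=0)
    return 0.5 * float(np.sum(square_of_sum - sum_of_square))


# Three fields sharing embedding_size=2, so the term is well defined.
same_size = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
result = fm_second_order(same_size)
print(result)
```

With embeddings of sizes such as 36 and 30, the element-wise product of a pair has no meaning, which is consistent with the maintainer's statement above.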

dev-wei commented 5 years ago

That's clear. Perhaps we should add a validation error there.
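A rough sketch of what such a check could look like is below. Everything in it is hypothetical: the function name `check_uniform_embedding_size`, the dict of per-feature sizes, and the feature names are illustrations, not part of DeepCTR's API.

```python
def check_uniform_embedding_size(embedding_sizes, model_name):
    """Raise early if the sparse fields do not share one embedding size.

    embedding_sizes: dict mapping feature name -> embedding dimension
                     (hypothetical structure, for illustration only).
    model_name: label used in the error message.
    """
    distinct = sorted(set(embedding_sizes.values()))
    if len(distinct) > 1:
        raise ValueError(
            f"{model_name} needs one embedding size for all sparse fields, "
            f"got sizes {distinct}; use an integer embedding_size "
            f'instead of "auto".'
        )
    return distinct[0] if distinct else None


# Hypothetical feature names with sizes like those in the reported error.
auto_sizes = {"user_id": 36, "item_id": 30, "gender": 6}
try:
    check_uniform_embedding_size(auto_sizes, "NFM")
    message = None
except ValueError as err:
    message = str(err)
print(message)
```

Failing at model construction with a message like this would surface the cause directly, instead of the shape mismatch raised later by the Concatenate layer.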

Also, I'd like your opinion on what the right embedding size is when the dataset has a number of sparse fields.

shenweichen commented 5 years ago

Thanks for your advice

Usually I search over sizes such as 4, 8, 12, 16, 32, 64, etc.
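Such a search could be organised roughly as follows. This is only a sketch: `search_embedding_size` is a hypothetical helper, and `toy_evaluate` is a synthetic placeholder that returns made-up scores, not measurements. In practice it would be replaced by a function that trains the model with the given size and returns a validation loss.

```python
def search_embedding_size(candidates, evaluate):
    """Try each candidate size and return (best_size, scores).

    evaluate: callable taking an embedding size and returning a
              validation loss, where lower is better.
    """
    scores = {size: evaluate(size) for size in candidates}
    best_size = min(scores, key=scores.get)
    return best_size, scores


# Candidate sizes taken from the comment above.
CANDIDATE_SIZES = [4, 8, 12, 16, 32, 64]


def toy_evaluate(size):
    # Synthetic placeholder so the sketch runs; NOT a real result.
    # Replace with: build the model with this size, train, return val loss.
    return abs(size - 16) / 16.0


best, scores = search_embedding_size(CANDIDATE_SIZES, toy_evaluate)
print(best, scores)
```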

dev-wei commented 5 years ago

Thanks for sharing. I will do some extra tuning there.

Currently I am evaluating on MovieLens 100K with LR; the MSE I got with epoch=10 and embedding size=8 is not amazing.

The train MSE is 0.7317 and the test MSE is 0.9476 for the NFM model.

Do you have any metrics you could share?

I'll keep experimenting.

dev-wei commented 5 years ago

By the way, I love this repo and how quickly you respond. Good work.

shenweichen commented 5 years ago

Sorry, I don't have any metrics on MovieLens.