shenweichen / DSIN

Code for the IJCAI'19 paper "Deep Session Interest Network for Click-Through Rate Prediction"
https://arxiv.org/abs/1905.06482
Apache License 2.0

train_dsin error #8

Closed blldd closed 4 years ago

blldd commented 5 years ago

Hi, I got an error while running train_dsin.py. The info is as follows:

Caused by op 'sparse_emb_14-brand/Gather_6', defined at:
  File "train_dsin.py", line 52, in <module>
    att_embedding_size=1, bias_encoding=False)
  File "/home/dedong/pycharmProjects/Emb4RS/models/DSIN/code/_models/dsin.py", line 85, in DSIN
    sess_feature_list, sess_max_count, bias_encoding=bias_encoding)
  File "/home/dedong/pycharmProjects/Emb4RS/models/DSIN/code/_models/dsin.py", line 154, in sess_interest_division
    sparse_fg_list, sess_feture_list, sess_feture_list)
  File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/deepctr/input_embedding.py", line 145, in get_embedding_vec_list
    embedding_vec_list.append(embedding_dict[feat_name])
  File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/engine/topology.py", line 252, in __call__
    output = super(Layer, self).__call__(inputs, **kwargs)
  File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 575, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/layers/embeddings.py", line 158, in call
    out = K.gather(self.embeddings, inputs)
  File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/backend.py", line 1351, in gather
    return array_ops.gather(reference, indices)
  File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 2486, in gather
    params, indices, validate_indices=validate_indices, name=name)
  File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1834, in gather
    validate_indices=validate_indices, name=name)
  File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): indices[0,0] = 136739 is not in [0, 79963) [[Node: sparse_emb_14-brand/Gather_6 = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](sparse_emb_14-brand/embeddings/read, sparse_emb_14-brand/Cast_6)]]

Do you know how to fix this? Thanks!

shenweichen commented 5 years ago

Please update your code to the latest version and run it in the environment described at https://github.com/shenweichen/DSIN#operating-environment

blldd commented 5 years ago

I ran this code on tensorflow-cpu 1.4.0, because my CUDA version is 10.0 and I cannot run it on GPU. Do you know what this error means?

shenweichen commented 5 years ago

Have you run your code on Python 3.6?

blldd commented 5 years ago

right

shenweichen commented 5 years ago

Check that your code is up to date with the latest commit.

blldd commented 5 years ago

It is the latest commit with deepctr==0.4.1

shenweichen commented 5 years ago

Yes, I suggest you re-clone the whole repo and run it again.

blldd commented 5 years ago

OK, thank you for your suggestion.

blldd commented 5 years ago

Hi my friend, thank you for your work. I tried to debug this and found a small bug:

In file 0_gen_sampled_data.py:

unique_cate_id = np.concatenate(
    (ad['cate_id'].unique(), log['cate'].unique()))

lbe.fit(unique_cate_id)

In file 2_gen_dsin_input.py:

data = pd.merge(sample_sub, user, how='left', on='userid', )
data = pd.merge(data, ad, how='left', on='adgroup_id')

Here the merge drops some cate_id and brand values: only the values that appear in the merged rows survive into data.

sparse_feature_list = [SingleFeat(feat, data[feat].nunique() + 1)
                       for feat in sparse_features + ['cate_id', 'brand']]

So here data['brand'].nunique() + 1 is smaller than the largest encoded brand index in the model input.

I saved the full number of unique cate_id/brand values, updated the feature dimensions (fd), and then the code ran without error.
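To make the mismatch concrete, here is a toy illustration (made-up values, not the real dataset): the encoder in 0_gen_sampled_data.py is fitted on the union of ad and log values, while 2_gen_dsin_input.py sizes the embedding from the merged frame only, so nunique() + 1 can undershoot the indices that actually appear:

    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import LabelEncoder

    # Toy data (made-up values, not the real dataset).
    ad = pd.DataFrame({'adgroup_id': [1, 2, 3], 'brand': [100, 200, 300]})
    log = pd.DataFrame({'brand': [300, 400, 500]})

    # 0_gen_sampled_data.py fits the encoder on the union of ad and log brands,
    # so encoded indices range over the full vocabulary (0..4 here).
    lbe = LabelEncoder()
    lbe.fit(np.concatenate((ad['brand'].unique(), log['brand'].unique())))
    ad['brand'] = lbe.transform(ad['brand'])

    # 2_gen_dsin_input.py sizes the embedding from the merged frame, which
    # only contains the sampled ads and hence a subset of the encoded brands.
    sample_sub = pd.DataFrame({'adgroup_id': [1, 2]})
    data = pd.merge(sample_sub, ad, how='left', on='adgroup_id')

    print(data['brand'].nunique() + 1)  # 3 -> embedding size used by the code
    print(len(lbe.classes_))            # 5 -> session brands can be encoded as 0..4
    # A session brand encoded as 3 or 4 then fails the lookup with
    # "indices ... is not in [0, 3)", just like the error above.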

MrDadiao commented 5 years ago

(quoting blldd's comment above)

I have also encountered this problem. Could you please explain in detail how to fix this bug?

jellchou commented 5 years ago

(quoting blldd's comment above)

Hi, I met the same problem. Could you tell us how to fix the bug?

blldd commented 5 years ago

(quoting the exchange above)

Sorry for the late reply. I am not sure whether this is the proper fix, but it works for me:

  1. Log the dimensions in 0_gen_sampled_data.py:

        pd.to_pickle({
            'cate_id': SingleFeat('cate_id', len(np.unique(unique_cate_id)) + 1),
            'brand': SingleFeat('brand', len(np.unique(unique_brand)) + 1),
        }, '../model_input/dsin_fd_catebrand' + str(FRAC) + '.pkl')

  2. Update the input fd in train_dsin.py:

        cate_brand_fd = pd.read_pickle('../model_input/dsin_fd_catebrand' + str(FRAC) + '.pkl')
        fd['sparse'][13] = cate_brand_fd['cate_id']
        fd['sparse'][14] = cate_brand_fd['brand']

  3. Rerun the script.
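For reference, a small sanity check along the same lines could be run in train_dsin.py before training starts. This is only a sketch under some assumptions of mine (that each entry of fd['sparse'] exposes .name and .dimension, and that the encoded index arrays are gathered in a dict I call feature_arrays here; these names are not from the repo). It flags any embedding table that is smaller than the largest index it will receive:

    import numpy as np

    def check_embedding_dims(fd, feature_arrays):
        """Warn about every sparse feature whose embedding table is too small
        for the encoded indices that will be fed into it."""
        for feat in fd['sparse']:
            if feat.name not in feature_arrays:
                continue
            max_idx = int(np.max(feature_arrays[feat.name]))
            if max_idx >= feat.dimension:
                print('%s: max index %d >= embedding dim %d'
                      % (feat.name, max_idx, feat.dimension))

    # Example call (hypothetical variable names): pass both the query-side and the
    # session-side index arrays, so brands that only appear in sessions are covered.
    # check_embedding_dims(fd, {'brand': np.concatenate([brand_query.ravel(),
    #                                                    brand_sessions.ravel()])})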

jellchou commented 5 years ago

(quoting blldd's fix above)

Thank you so much, let me try it.

shenweichen commented 4 years ago

Sorry for this mistake; we are planning to refactor our code in the future. I think this error can be fixed by using

sparse_feature_list = [SingleFeat(feat, data[feat].max() + 1)
                       for feat in sparse_features + ['cate_id', 'brand']]

instead of

https://github.com/shenweichen/DSIN/blob/3aed7819e47f0463f12ab78cc2589cacf1081745/code/2_gen_dsin_input.py#L141-L142