ghost opened this issue 3 years ago
Hi,
Thanks for your question. These are simply the hardcoded lengths of the training, validation, and testing sets, i.e., how many samples each of them contains. (Note that the test count differs slightly from train/val, since the test set contains each user and item only once; nothing more is needed for computing the hash codes used in evaluation.) If you use new datasets, or change the existing ones, these should be updated. One way to get the counts (train, val, test lengths) is:
print(sum(1 for _ in tf.python_io.tf_record_iterator(trainfiles[0])))
print(sum(1 for _ in tf.python_io.tf_record_iterator(valfiles[0])))
print(sum(1 for _ in tf.python_io.tf_record_iterator(testfiles[0])))
I hope this answers your question.
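If TensorFlow 1.x (and its `tf.python_io.tf_record_iterator`) is not available, the records can also be counted without TensorFlow: the on-disk TFRecord format is just a sequence of length-prefixed records (8-byte little-endian payload length, 4-byte length CRC, payload, 4-byte payload CRC). A minimal sketch of such a counter, which skips CRC verification entirely:

```python
import struct

def count_tfrecords(path):
    """Count the records in a TFRecord file by walking its framing:
    uint64 length, 4-byte length CRC, payload, 4-byte payload CRC.
    CRCs are skipped, not verified."""
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)  # uint64 little-endian payload length
            if len(header) < 8:
                break           # end of file (or truncated record)
            (length,) = struct.unpack("<Q", header)
            f.seek(4 + length + 4, 1)  # skip length CRC, payload, payload CRC
            count += 1
    return count
```

Used as `count_tfrecords(trainfiles[0])`, this should give the same number as the iterator-based one-liner above.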
Thank you for your reply. It's very helpful, and the code now works as expected. But I have another question, about preloaded_testsamples = pickle.load(open(args["dataset"] + "_testdata.pkl", "rb")). I can't find the code that generates this test file. I noticed that testdata.pkl looks like this:
preloaded_testsamples[user]:
[[2782, 4],
[3836, 5],
[7798, 5],
[8689, 5],
[9133, 5],
[11613, 5]]
so I guess it's composed of preloaded_testsamples[user] = [[item, rating], ...], and I generate preloaded_testsamples on the whole dataset like this:
import pickle

from scipy.io import loadmat
from tqdm import tqdm

datamatlab = loadmat('../ratings_contentaware_full.mat')
full_matrix = datamatlab["full_matrix"]
full_matrix = full_matrix.todense()

final_test_data = []
for i in tqdm(range(full_matrix.shape[0])):      # loop over users
    preload_test_data = []
    for j in range(full_matrix[i].shape[1]):     # loop over items
        if full_matrix[i, j] != 0:               # keep only observed ratings
            preload_test_data.append([j, full_matrix[i, j]])
    final_test_data.append(preload_test_data)

pickle.dump(final_test_data, open('./test_data.pkl', "wb"))
Is this right? Or is there another way to get this file? Thanks!
Thanks for reaching out. That seems to be one way to do it, yes; alternatively, you can use the provided .tfrecord files (or extract another format based on those).
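For reference, the per-user [item, rating] lists from the question above can also be built without the explicit double loop, using np.nonzero to pick out the observed ratings per row. A vectorized sketch, assuming full_matrix is (convertible to) a dense NumPy array:

```python
import numpy as np

def build_test_samples(full_matrix):
    """Build, for each user (row), the list of [item, rating] pairs
    for all nonzero (i.e., observed) entries of a dense user-item matrix."""
    samples = []
    for row in np.asarray(full_matrix):
        items = np.nonzero(row)[0]  # indices of rated items in this row
        samples.append([[int(j), float(row[j])] for j in items])
    return samples
```

The result has the same structure as the pickled test data shown earlier: samples[user] is a list of [item, rating] pairs.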
Thanks for the wonderful work. I downloaded the code and data of your paper through the link you provided. But what should train_samples be?
For example, in this code:

elif args["dataset"].lower() == "amacold":  # 50p
    train_samples = 831866
    val_samples = 148062
    test_samples = 73857
    max_rating = 5.0
How should I set these parameters? If I set them incorrectly, I get the following error:
Caused by op 'add_3', defined at:
  File "main.py", line 536, in <module>
    main()
  File "main.py", line 414, in main
    is_training, args, max_rating, anneal_val, anneal_val_vae, batch_placeholder)
  File "/mnt/data0/home/rocket_diggers_2/tuijian/acm_mm/NeuHash-CF/code/model.py", line 270, in make_network
    total_loss, ham_dist_i1, reconloss = make_total_loss(i1_org_m, i1r, i1_sampling, sigma_anneal)
  File "/mnt/data0/home/rocket_diggers_2/tuijian/acm_mm/NeuHash-CF/code/model.py", line 221, in make_total_loss
    i1r = i1r + e0anneal
  File "/mnt/data0/home/rocket_diggers_1/anaconda3/envs/tuijian/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 866, in binary_op_wrapper
    return func(x, y, name=name)
  File "/mnt/data0/home/rocket_diggers_1/anaconda3/envs/tuijian/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 301, in add
    "Add", x=x, y=y, name=name)
  File "/mnt/data0/home/rocket_diggers_1/anaconda3/envs/tuijian/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/mnt/data0/home/rocket_diggers_1/anaconda3/envs/tuijian/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/mnt/data0/home/rocket_diggers_1/anaconda3/envs/tuijian/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/mnt/data0/home/rocket_diggers_1/anaconda3/envs/tuijian/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Incompatible shapes: [1000] vs. [82000]
  [[node add_3 (defined at /mnt/data0/home/rocket_diggers_2/tuijian/acm_mm/NeuHash-CF/code/model.py:221) = Add[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](IteratorGetNext/_103, mul_9)]]
  [[{{node add_8/_113}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_222_add_8", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
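The "Incompatible shapes: [1000] vs. [82000]" message indicates that the two operands of the add in make_total_loss disagree in their leading dimension: one side comes from the dataset iterator (IteratorGetNext), the other is sized according to the sample-count parameters, so wrong counts produce exactly this kind of mismatch. This is not the TensorFlow graph itself, just a minimal NumPy sketch of the same broadcasting rule, with the shapes taken from the error message:

```python
import numpy as np

# Left operand: shape coming from the input pipeline.
i1r = np.zeros(1000, dtype=np.float32)
# Right operand: shape implied by the (wrong) sample counts.
e0anneal = np.zeros(82000, dtype=np.float32)

try:
    _ = i1r + e0anneal  # elementwise add needs broadcast-compatible shapes
except ValueError as err:
    print("Incompatible shapes:", err)
```

Setting train_samples / val_samples / test_samples to the actual record counts of your .tfrecord files (e.g., via the counting snippets earlier in this thread) makes the two shapes agree.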