Open JKZuo opened 3 years ago
Hello, thanks for this question! In each data folder, we provide three data files:
- tensor.mat is an M-by-I-by-J observation tensor;
- random_tensor.mat is an M-by-I-by-J uniformly distributed random tensor with values in [0, 1];
- random_matrix.mat is an M-by-I uniformly distributed random matrix with values in [0, 1].
Of course, you can remove both random_tensor.mat and random_matrix.mat and use the following code instead:
import numpy as np
# Specify tensor size
M = 214 # Suppose 214 road segments
I = 61 # Suppose 61 days
J = 144 # Suppose 144 time slots per day
# Generate random matrix of size M-by-I
np.random.seed(1000) # Set random seed
random_matrix = np.random.rand(M, I)
# Or generate random tensor of size M-by-I-by-J
np.random.seed(1000) # Set random seed
random_tensor = np.random.rand(M, I, J)
Hope this helps!
Best, Xinyu
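For context, random arrays like these are typically turned into a binary mask that decides which entries are treated as missing. The sketch below assumes the rounding trick that appears in the code posted later in this thread (`np.round(rand + 0.5 - missing_rate)`). Using one draw per (road segment, day) pair to drop whole days is an assumption about how `random_matrix` is meant to be used, and `binary_tensor` and `binary_days` are illustrative names.

```python
import numpy as np

M, I, J = 214, 61, 144   # road segments, days, time slots per day
missing_rate = 0.2

# Element-wise missing: one uniform draw per entry.
np.random.seed(1000)
random_tensor = np.random.rand(M, I, J)
# round(u + 0.5 - r) gives 0 for u < r (entry dropped) and 1 for u > r (entry kept).
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)

# Day-level missing (assumption): one draw per (segment, day),
# repeated so that all J time slots of that day share the same 0/1 value.
np.random.seed(1000)
random_matrix = np.random.rand(M, I)
day_mask = np.round(random_matrix + 0.5 - missing_rate)
binary_days = np.repeat(day_mask[:, :, np.newaxis], J, axis=2)

print(binary_tensor.shape, binary_days.shape)
print(binary_tensor.mean())  # fraction of entries kept, close to 1 - missing_rate
```

Multiplying the observation tensor by either mask zeroes out the "missing" entries while leaving the rest untouched.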
Hello, I have a dataset on hand (collected by sensors, with missing values) and wanted to try LRTC-TNN to see how well it imputes the missing values, but the results seem off.

```python
import numpy as np
import pandas as pd
from tqdm import tqdm
import time

r = 0.2
print('Missing rate = {}'.format(r))
missing_rate = r

file_path = ''
data_19111201984 = pd.read_csv(file_path, encoding='gbk')
data_19111201984 = data_19111201984[data_19111201984.day.isin([9, 10, 11, 12, 13, 14])]
data_list = []

# '温度' is the temperature column of the CSV file
for day, day_df in tqdm(data_19111201984.groupby('day')):
    data_list.append([day_df['温度'].values.tolist()])

# ten2mat and LRTC are the functions defined in the LRTC-TNN notebook
dense_tensor = np.array([ten2mat(np.array(data_list), 2)])
print(dense_tensor.shape)  # (1, 1440, 6): (sensor_id, number of readings per day, 6 days)
dim1, dim2, dim3 = dense_tensor.shape
np.random.seed(1000)
sparse_tensor = dense_tensor * np.round(np.random.rand(dim1, dim2, dim3) + 0.5 - missing_rate)
print(sparse_tensor.shape)

start = time.time()
alpha = np.ones(3) / 3
rho = 1e-4
theta = 30
epsilon = 1e-4
maxiter = 100
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds' % (end - start))
print()
```

The output is:

```
Missing rate = 0.2
100%|██████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 3009.55it/s]
(1, 1440, 6)
(1, 1440, 6)
Total iteration: 2
Tolerance: 0.0
Imputation MAPE: 1.0
Imputation RMSE: 5.55775

Running time: 0 seconds
```
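A side note on reading this output: assuming MAPE and RMSE are computed in the usual way over the held-out entries, a MAPE of exactly 1.0 is what results when every imputed value is 0, and the RMSE then equals the root mean square of the true values. The helpers `mape` and `rmse` and the sample readings below are illustrative only and are not taken from the repository.

```python
import numpy as np

def mape(actual, estimate):
    """Mean absolute percentage error, returned as a fraction rather than in percent."""
    return np.mean(np.abs(actual - estimate) / np.abs(actual))

def rmse(actual, estimate):
    """Root mean squared error."""
    return np.sqrt(np.mean((actual - estimate) ** 2))

# Hypothetical held-out readings and an all-zero "imputation".
actual = np.array([5.2, 5.6, 6.1, 5.4])
estimate = np.zeros_like(actual)

print(mape(actual, estimate))  # 1.0
print(rmse(actual, estimate))  # same as np.sqrt(np.mean(actual ** 2))
```

So the reported numbers are consistent with the model having returned zeros at the missing positions, although the output alone does not prove it.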
Hello, thank you for this question! If your tensor data is of size 1-by-1440-by-6, this is really a matrix. Please consider a matrix completion model rather than tensor completion models.
Best regards, Xinyu
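To make the suggestion concrete, here is a minimal sketch of one simple matrix completion baseline: iterative truncated SVD, sometimes called hard-impute. This is not a method the author names in this thread. The function `hard_impute`, its `rank` parameter, and the synthetic rank-1 data standing in for a 1440-by-6 sensor matrix are all assumptions for illustration.

```python
import numpy as np

def hard_impute(sparse_mat, mask, rank=1, maxiter=200, tol=1e-6):
    """Fill entries where mask == 0 by repeatedly projecting onto rank-`rank` matrices."""
    X = np.where(mask == 1, sparse_mat, 0.0)          # start with zeros at missing entries
    for _ in range(maxiter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        Z = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]   # best rank-`rank` approximation of X
        X_new = np.where(mask == 1, sparse_mat, Z)    # keep observed entries fixed
        converged = np.linalg.norm(X_new - X) <= tol * np.linalg.norm(X)
        X = X_new
        if converged:
            break
    return X

# Synthetic stand-in for a 1440-by-6 (minutes-by-days) sensor matrix.
rng = np.random.default_rng(0)
profile = 20 + 5 * np.sin(np.linspace(0, 2 * np.pi, 1440))  # shared daily curve
scale = 1 + 0.1 * rng.standard_normal(6)                     # per-day scaling
dense_mat = np.outer(profile, scale)

mask = (rng.random(dense_mat.shape) > 0.2).astype(float)     # about 20% missing
sparse_mat = dense_mat * mask

mat_hat = hard_impute(sparse_mat, mask, rank=1)
missing = mask == 0
print(np.sqrt(np.mean((dense_mat[missing] - mat_hat[missing]) ** 2)))  # RMSE on missing entries
```

Real sensor data will not be exactly low-rank, so the rank (or a soft-thresholding variant) would need tuning on held-out entries.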
Thank you for your answer. For now I only use the data collected by a single sensor, so my tensor data is of size 1-by-1440-by-6. Does that mean that if I use data collected by n sensors and get tensor data of size n-by-1440-by-6, I can consider a tensor completion model? By the way, is there a matrix completion model you would recommend?
Yes, in that case you can consider a tensor completion model, but in LRTC-TNN, theta should be smaller than min{n, 1440, 6}.
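The constraint on theta can be sanity-checked before running the model. The sketch below uses a generic mode-k unfolding (the column ordering via `order='F'` is one common convention, assumed here) and a hypothetical helper `theta_is_valid`; neither is taken from the repository. It shows why a 1-by-1440-by-6 tensor behaves like a matrix and why theta = 30 from the snippet above violates the bound.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: move that axis to the front and flatten the others."""
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order='F')

def theta_is_valid(shape, theta):
    """Hypothetical check for the rule of thumb theta < min(tensor dimensions)."""
    return theta < min(shape)

tensor = np.zeros((1, 1440, 6))
for mode in range(3):
    print(unfold(tensor, mode).shape)
# (1, 8640)
# (1440, 6)
# (6, 1440)

print(theta_is_valid((1, 1440, 6), 30))   # False
print(theta_is_valid((10, 1440, 6), 5))   # True
```

With a single sensor, the first unfolding is one long row and the other two are the same 1440-by-6 matrix and its transpose, so there is no third mode for a tensor model to exploit.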
About the data set: each data folder contains three named data files: tensor, random_tensor, and random_matrix. What do these three stand for, and is there any difference between them?