sbailey / empca

Principal Component Analysis (PCA) for Missing and/or Noisy Data

missing data imputation #3

Open shyamkkhadka opened 7 years ago

shyamkkhadka commented 7 years ago

Hi, can you please tell me how I can use this program for missing data imputation? It is written that:

Missing data is simply the limit of weight=0.

But I don't understand where to set weight = 0. When I ran m0 = empca(noisy_data, weights=0, niter=20), it gave this error:

File "empca.py", line 290, in empca
    assert data.shape == weights.shape

Can you please help me? I want to use your code for an imputation problem. Thank you.

sbailey commented 7 years ago

weights and data should both be arrays with shape (num_observations, num_variables). weights[i,j] indicates what weight should be applied to data[i,j] when calculating the PCA. If observation i variable j is missing, then set weights[i,j] = 0 and data[i,j] will be ignored. For each variable j, there must be some observations i that have a non-zero weight.
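
As a rough illustration of the above, here is a minimal sketch of building such a weights array with NumPy. It assumes missing entries are marked as NaN in the data, which is not stated in this thread. The helper name make_weights and the choice of weight 1.0 for observed entries are hypothetical, not part of empca. NaN entries are replaced with 0 so that a zero weight multiplied by NaN cannot leak NaN into any sums.

```python
import numpy as np

def make_weights(data):
    """Hypothetical helper: build a per-element weights array for `data`.

    Assumes missing values are encoded as NaN. Returns (clean, weights),
    both with shape (num_observations, num_variables).
    """
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)

    # weight 0 where the value is missing, 1 where it is observed
    weights = np.where(missing, 0.0, 1.0)

    # replace NaN with a finite placeholder; its weight is 0 anyway
    clean = np.where(missing, 0.0, data)

    # each variable j needs at least one observation with non-zero weight
    if not (weights > 0).any(axis=0).all():
        raise ValueError("some variable has no observed values")

    return clean, weights

noisy_data = np.array([[1.0, 2.0, np.nan],
                       [0.5, np.nan, 3.0],
                       [1.5, 2.5, 3.5]])

clean, weights = make_weights(noisy_data)

# The shapes now match, which is what the failing assertion checks.
assert clean.shape == weights.shape

# With empca.py importable, the call from the question would then become:
# m0 = empca(clean, weights=weights, niter=20)
```

Passing the scalar weights=0 fails because a scalar does not have the same shape as data. A full array is needed even when most entries carry the same weight.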