ingenieroariel / thesis

My masters thesis
1 stars 2 forks source link

lasso_admm is too slow #1

Open ingenieroariel opened 10 years ago

ingenieroariel commented 10 years ago

Profile it and make it faster. Right now a simple example like te following takes more than 20 seconds.

import numpy as np
from lasso import lasso_admm, lasso_cost, init_dictionary
from lasso import dict_learning as admm_dict_learning

rng_global = np.random.RandomState(0)
n_samples, n_features = 10, 8
X = rng_global.randn(n_samples, n_features)

n_components = 6
rng = np.random.RandomState(0)

code2, dictionary2, errors2 = admm_dict_learning(X, method='admm',
                                              n_components=n_components,
                                              alpha=1, random_state=rng)
ingenieroariel commented 10 years ago

Here is more profiling info.

x@arielo ~/thesis/notes/admm $ time python profile.py 

real    0m19.845s
user    0m25.288s
sys 0m51.872s
x@arielo ~/thesis/notes/admm $ kernprof.py -l -v profile.py 
Wrote profile results to profile.py.lprof
Timer unit: 1e-06 s

File: lasso.py
Function: lasso_admm at line 27
Total time: 29.614 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    27                                           @profile
    28                                           def lasso_admm(X, D, gamma=1, C=None, double=False, max_rho=5.0, rho=1e-4, max_iter=500):
    29                                               """
    30                                               Finds the best sparse code for a given dictionary for
    31                                               approximating the data matrix X by solving::
    32                                           
    33                                                   (U^*,) = argmin 0.5 || X - U V ||_2^2 + gamma * || U ||_1
    34                                                                (U)
    35                                           
    36                                               where V is the dictionary and U is the sparse code.
    37                                               """
    38                                               # Cast to matrix if input is an array, for when A or X are simple arrays/lists/vectors
    39       120          194      1.6      0.0      for item in [D, X]:
    40                                           
    41        80          137      1.7      0.0          if len(item.shape) < 2:
    42                                                       raise ValueError("Expected a matrix, found an array instead with shape %s" % item.shape)
    43                                           
    44        40           62      1.6      0.0      c = X.shape[1]
    45        40           56      1.4      0.0      r = D.shape[1]
    46                                           
    47        40          137      3.4      0.0      L = np.zeros((r, c))
    48        40         4346    108.7      0.0      I = sp.eye(r)
    49                                           
    50                                               #TODO: Assert C has the right shape
    51                                           
    52                                               # Initialize C with zeros if it is not passed
    53        40           74      1.9      0.0      if C is None:
    54        40          121      3.0      0.0          C = np.zeros((r, c))
    55                                           
    56        40           66      1.6      0.0      fast_sthresh = lambda x, th: np.sign(x) * np.maximum(np.abs(x) - th, 0)
    57                                           
    58        40           60      1.5      0.0      cost = []
    59                                           
    60     20000        31020      1.6      0.1      for n in range(1, max_iter):
    61                                                   #import ipdb;ipdb.set_trace()
    62                                                   # Define terms for sub-problem
    63     19960     14308592    716.9     48.3          F = np.dot(D.T, D) + np.dot(rho, I)
    64     19960       381777     19.1      1.3          G = np.dot(D.T, X) + np.dot(rho,C) - L
    65                                           
    66     19960      4451202    223.0     15.0          B,resid,rank,s = np.linalg.lstsq(F,G)
    67                                           
    68                                                   # Verify B is a solution to: F dot B = G
    69     19960      8508443    426.3     28.7          np.testing.assert_array_almost_equal(np.dot(F, B), G)
    70                                           
    71                                                   # Solve sub-problem to solve C
    72     19960       579057     29.0      2.0          C = fast_sthresh(B + L/rho, gamma/rho)
    73                                               
    74                                                   # Update the Lagrangian
    75     19960       242701     12.2      0.8          L = L + np.dot(rho, B - C)
    76                                               
    77     19960        52642      2.6      0.2          rho = min(max_rho, rho*1.1)
    78                                               
    79                                                   # get the current cost
    80     19960      1010833     50.6      3.4          current_cost = lasso_cost(X, D, B, gamma)
    81     19960        42467      2.1      0.1          cost.append(current_cost)
    82                                               
    83        40           53      1.3      0.0      return B, cost