quinngroup / dr1dl-pyspark

Dictionary Learning in PySpark
Apache License 2.0

Convert `op_VCTbyMTX` and `op_MTXbyVCT` to `np.dot` operations. #5

Closed magsol closed 8 years ago

magsol commented 8 years ago

The NumPy library has built-in array-array multiplication operations; let's use it.

MOJTABAFA commented 8 years ago

Actually I had to remove my installation because of a system problem and installed Python and Anaconda again. When I check in the console with `conda list` I can see numpy installed, but when I try to run a program in Sublime Text 3 I get the following error:

```
Traceback (most recent call last):
  File "C:\Users\Mojtaba Fazli\Desktop\test1.py", line 1, in <module>
    import numpy as np
ImportError: No module named 'numpy'
[Finished in 0.4s]
```

magsol commented 8 years ago

My initial suspicion is that you've installed Python both through Anaconda and through Sublime Text, but only the one through Anaconda has numpy installed. When you run through the Python installed by Sublime, it doesn't find numpy.
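
A quick way to check which interpreter is actually running your script (a minimal sketch, nothing project-specific):

```python
import sys

# Prints the Python executable Sublime is invoking and its version.
# If this isn't your Anaconda python, point Sublime's build system at
# Anaconda (or install numpy into whatever interpreter this prints).
print(sys.executable)
print(sys.version)
```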


MOJTABAFA commented 8 years ago

Thanks! That was fantastic, you're really a Python magician! It's working now!

MOJTABAFA commented 8 years ago

By the way, in the program they initialize three vectors, U_old, U_new, and V, with zeros. I've done it with 1xT and 1xP matrices using the following instructions:

```python
U_old = np.zeros((1, T), dtype=np.float)
U_new = np.zeros((1, T), dtype=np.float)
V = np.zeros((1, P), dtype=np.float)
```

Is it correct?

magsol commented 8 years ago

LGTM.
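
One shape detail worth keeping in mind (just a sketch of NumPy behavior, not part of the algorithm): `np.zeros((1, T))` gives a 2-D row array, while `np.zeros(T)` gives a 1-D vector, and the two interact differently with `np.dot`:

```python
import numpy as np

T = 4
row = np.zeros((1, T))   # 2-D, shape (1, 4)
flat = np.zeros(T)       # 1-D, shape (4,)

S = np.ones((T, 3))
print(np.dot(row, S).shape)   # (1, 3) -- result stays 2-D
print(np.dot(flat, S).shape)  # (3,)   -- result stays 1-D
```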

MOJTABAFA commented 8 years ago

op_VCTbyMTX and op_MTXbyVCT are very easy, a piece of cake to implement in Python. We have two options:

1. Convert the function already written by Dr. Liu's team:

```python
def op_VCTbyMTX(mtx_input, vct_input, vct_result, I, J):
    for j in range(J):
        tmp1 = 0
        for i in range(I):
            tmp1 = vct_input[i] * mtx_input[i, j] + tmp1
        vct_result[j] = tmp1
```

2. Use np.dot in just one instruction:

```python
vct_result = np.dot(vct_input, mtx_input)
```

I already checked both of them. I think the second one would be more efficient, since the NumPy functions that operate on matrices are much faster than ordinary Python lists and arrays. Am I right?

magsol commented 8 years ago

Yes, definitely default to NumPy methods; they invoke compiled C code behind the scenes for significant performance benefits. Built-in Python lists are much less efficient than NumPy arrays.
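
A rough way to see the gap for yourself (a sketch with arbitrary sizes, not code from this repo):

```python
import time
import numpy as np

I, J = 500, 400
mtx_input = np.random.rand(I, J)
vct_input = np.random.rand(I)

# Explicit Python loops, as in the original op_VCTbyMTX.
start = time.time()
vct_loop = np.zeros(J)
for j in range(J):
    tmp = 0.0
    for i in range(I):
        tmp += vct_input[i] * mtx_input[i, j]
    vct_loop[j] = tmp
loop_time = time.time() - start

# One np.dot call.
start = time.time()
vct_dot = np.dot(vct_input, mtx_input)
dot_time = time.time() - start

print(np.allclose(vct_loop, vct_dot))  # True: identical results
print(loop_time, dot_time)             # np.dot is orders of magnitude faster
```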


MOJTABAFA commented 8 years ago

Thanks! May I use np.multiply for op_MTXbyVCT as follows?

```python
u_new = np.multiply(S, v)
```

magsol commented 8 years ago

No, still use np.dot. It will generalize depending on the dimensions of the input arrays: for two vectors it gives the inner product; for vector-matrix or matrix-matrix it gives the full multiply.
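
To see the difference concretely (a small sketch with made-up shapes): `np.multiply` is elementwise and just broadcasts, while `np.dot` performs the actual matrix-vector product.

```python
import numpy as np

S = np.arange(6.0).reshape(2, 3)   # 2x3 matrix
v = np.array([1.0, 2.0, 3.0])      # length-3 vector

print(np.dot(S, v))       # matrix-vector product -> [ 8. 26.]
print(np.multiply(S, v))  # elementwise with broadcasting -> 2x3 array
```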


MOJTABAFA commented 8 years ago

So both op_MTXbyVCT and op_VCTbyMTX can be coded with np.dot as follows?

```python
v = np.dot(S, u_old)
u_new = np.dot(S, v)
```

magsol commented 8 years ago

Yes.

I'd highly recommend opening up an IPython console and testing that the code is doing what you expect. Even better, write unit tests that lock in that behavior.


MOJTABAFA commented 8 years ago

Actually I did it, and both of them are working on different examples.

MOJTABAFA commented 8 years ago

Program for testing np.dot :

```python
import random
import numpy as np
import math

t = 20
p = 10
RAND_MAX = 2457458
mtx_input = np.zeros((t, p), dtype=np.float)
vct_input = np.zeros((1, t), dtype=np.float)
vct_result = np.zeros((1, p), dtype=np.float)
print('Analyzing component ')  # ,(m+1),'...')
count_row = 5

mtx_input = np.ones(shape=(3, 1))
vct_input = np.ones(shape=(3, 3))

x1 = np.arange(9.0).reshape((3, 3))
x2 = np.arange(3.0)
x3 = np.dot(x1, x2)
print('x1: ', x1)
print('x2: ', x2)
print('x3: ', x3)
```

Output:

```
Analyzing component 
x1:  [[ 0.  1.  2.]
 [ 3.  4.  5.]
 [ 6.  7.  8.]]
x2:  [ 0.  1.  2.]
x3:  [  5.  14.  23.]
[Finished in 0.3s]
```

magsol commented 8 years ago

Try using the built-in unit testing framework for Python: https://docs.python.org/2/library/unittest.html

You can commit the tests under a new "tests" folder in the main repository.
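
A minimal sketch of what such a test could look like (the file name and test cases here are just placeholders, assuming the np.dot versions discussed above):

```python
# tests/test_ops.py -- hypothetical layout
import unittest
import numpy as np


class TestDotProducts(unittest.TestCase):

    def test_vector_by_matrix(self):
        # np.dot should match an explicit loop over matrix columns.
        S = np.arange(12.0).reshape(4, 3)
        u = np.array([1.0, 2.0, 3.0, 4.0])
        expected = np.array([sum(u[i] * S[i, j] for i in range(4))
                             for j in range(3)])
        np.testing.assert_allclose(np.dot(u, S), expected)

    def test_matrix_by_vector(self):
        S = np.arange(12.0).reshape(4, 3)
        v = np.array([1.0, 2.0, 3.0])
        expected = np.array([sum(S[i, j] * v[j] for j in range(3))
                             for i in range(4)])
        np.testing.assert_allclose(np.dot(S, v), expected)


if __name__ == '__main__':
    unittest.main()
```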