quinngroup / dr1dl-pyspark

Dictionary Learning in PySpark
Apache License 2.0

Python coding convention improvements #19

Closed magsol closed 8 years ago

magsol commented 8 years ago

There are many small syntactical fixes that need to be made before the code is production-ready and can be open sourced. They include (but are not limited to):

import numpy as np

x = np.array([1, 2, 3,
              4, 5, 6, 7, 8, 9, 10],
             dtype=np.float64)  # Any continuation of one line should be indented
                                # (np.float was removed in NumPy 1.24)


def square(x):
    """
    Returns the square (x^2) of the input argument.

    Parameters
    ----------
    x : float
        The base value to be squared.

    Returns
    -------
    x * x : float
        The square of the input argument.
    """
    return x * x

magsol commented 8 years ago

In function docstrings, the primitive types (integer, float, string, etc.) are all being specified correctly. However, with the vector / matrix structures, we need to go a step further.

Rather than simply specifying vct_input : vector, we need to give the user the type array (since that can mean either a vector or a matrix), as well as the shape of the array. For example, a vector with length T would look like this:

vct_input : array, shape (T)

Similarly, a matrix with P rows and T columns would be documented like this:

mat_input : array, shape (P, T)

All parameters and return values that are arrays should have this form.
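
As a minimal sketch of this convention applied to a whole function, something like the following would work. The function name `project_rows` and the return name `vct_output` are hypothetical and not taken from the repository; only the `array, shape (...)` form comes from the description above.

```python
import numpy as np


def project_rows(mat_input, vct_input):
    """
    Multiplies a matrix by a vector (hypothetical example function).

    Parameters
    ----------
    mat_input : array, shape (P, T)
        Matrix with P rows and T columns.
    vct_input : array, shape (T)
        Vector of length T.

    Returns
    -------
    vct_output : array, shape (P)
        The matrix-vector product of mat_input and vct_input.
    """
    return np.dot(mat_input, vct_input)


# Usage: P = 2 rows, T = 3 columns.
mat = np.arange(6, dtype=np.float64).reshape(2, 3)
vct = np.ones(3)
result = project_rows(mat, vct)
print(result.shape)
```

Stating the shape of every array argument and return value lets a reader check that dimensions line up (here, the T in `mat_input` must match the T in `vct_input`) without reading the function body.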

MOJTABAFA commented 8 years ago

Thanks for such a good point. I'll apply it right now.