RoboticsClubIITJ / ML-DL-implementation

An implementation of ML and DL algorithms from scratch in python using nothing but NumPy and Matplotlib.
BSD 3-Clause "New" or "Revised" License

Implement One hot encoding #72

Closed Player0109 closed 3 years ago

Player0109 commented 3 years ago

Implemented the One hot encoding class.

Fixes #41

FUNCTIONS in OneHotEncoder class are:
(1) FIT(INPUT_X, THRESHOLD) --- Counts the unique values in each column and decides whether each column should be encoded.
(2) CHECK_TRANSFORM(INPUT_X) --- Checks whether the data being transformed has the same values as the data used to fit.
(3) TRANSFORM(INPUT_X) --- One-hot encodes the data, based on the data used to fit.
(4) FIT_TRANSFORM(INPUT_X, THRESHOLD) --- A combination of the fit and transform functions.

INPUTS in OneHotEncoder are:
X - A NumPy array of size n x m.
thresh - A threshold on the ratio THRESH = (NUMBER OF UNIQUE VALUES IN A COLUMN)/(LENGTH OF COLUMN). A column whose ratio is below the input threshold is encoded; otherwise it is not encoded.
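As a rough illustration of the threshold rule described above, the following sketch computes the ratio for each column and flags the columns to encode. The helper names `unique_ratio` and `columns_to_encode` are hypothetical and are not taken from the pull request; the strict "below" comparison follows the description.

```python
import numpy as np


def unique_ratio(column):
    """Return (number of unique values) / (length of column)."""
    return len(np.unique(column)) / len(column)


def columns_to_encode(X, thresh):
    """Return one 0/1 flag per column: 1 if its ratio is below thresh."""
    return [
        1 if unique_ratio(X[:, j]) < thresh else 0
        for j in range(X.shape[1])
    ]


# Column 0 has 2 unique values in 4 rows (ratio 0.5).
# Column 1 has 4 unique values in 4 rows (ratio 1.0).
X = np.array([
    ["red", "a1"],
    ["blue", "a2"],
    ["red", "a3"],
    ["blue", "a4"],
])
flags = columns_to_encode(X, thresh=0.75)
print(flags)  # [1, 0]
```

With a threshold of 0.75, only the first column is flagged, since an ID-like column with all-distinct values has a ratio of 1.0.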

VARIABLES in OneHotEncoder class are:
ncols - Stores the number of columns in the fit data.
arr_dic - An array of dictionaries, where each dictionary holds the LabelEncoded values of a particular column.
arr_nunique - An array that stores the number of unique values in each column.
encode - An array with one entry per column of the fit data. An entry is 1 if the column is to be encoded, otherwise 0.
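To make the description concrete, here is a minimal sketch of how such a class could be laid out. It is not the code from this pull request: the class name `OneHotEncoderSketch` is hypothetical, the lowercase method names are assumed, and details such as returning a boolean from `check_transform` and passing non-encoded columns through unchanged are assumptions. Only the attribute names (`ncols`, `arr_dic`, `arr_nunique`, `encode`) come from the description.

```python
import numpy as np


class OneHotEncoderSketch:
    """Illustrative one-hot encoder; not the code from the pull request."""

    def fit(self, X, thresh):
        """Record each column's unique values and whether to encode it."""
        X = np.asarray(X)
        self.ncols = X.shape[1]
        self.arr_dic = []      # one {value: index} dict per column
        self.arr_nunique = []  # number of unique values per column
        self.encode = []       # 1 if the column is encoded, else 0
        for j in range(self.ncols):
            uniques = np.unique(X[:, j])
            self.arr_dic.append({v: i for i, v in enumerate(uniques)})
            self.arr_nunique.append(len(uniques))
            self.encode.append(1 if len(uniques) / len(X) < thresh else 0)
        return self

    def check_transform(self, X):
        """Return True if encoded columns hold only values seen in fit."""
        X = np.asarray(X)
        if X.shape[1] != self.ncols:
            return False
        return all(
            v in self.arr_dic[j]
            for j in range(self.ncols) if self.encode[j]
            for v in X[:, j]
        )

    def transform(self, X):
        """One-hot encode flagged columns; keep the others unchanged."""
        X = np.asarray(X)
        if not self.check_transform(X):
            raise ValueError("data does not match the data used in fit")
        blocks = []
        for j in range(self.ncols):
            if self.encode[j]:
                onehot = np.zeros((len(X), self.arr_nunique[j]), dtype=int)
                idx = [self.arr_dic[j][v] for v in X[:, j]]
                onehot[np.arange(len(X)), idx] = 1
                blocks.append(onehot)
            else:
                blocks.append(X[:, [j]])
        return np.hstack(blocks)

    def fit_transform(self, X, thresh):
        """Fit on X, then transform the same X."""
        return self.fit(X, thresh).transform(X)


# Column 0 repeats two values (ratio 0.5); column 1 is all distinct (ratio 1.0).
X = np.array([[10, 1], [20, 2], [10, 3], [20, 4]])
enc = OneHotEncoderSketch()
encoded = enc.fit_transform(X, thresh=0.75)
```

In this sketch the first column expands into two indicator columns and the second is carried over as is, so `encoded` has three columns. A row containing a value not seen during fit, such as `[[30, 5]]`, makes `check_transform` return `False`.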

Player0109 commented 3 years ago

@rohansingh9001 Plz review.

rohansingh9001 commented 3 years ago

@Player0109 The code logic seems fine. However, the docstrings are written at the top of the class instead of inside the methods.

You can move those comments from the class section to their respective functions. For example:

Instead of

class SomeClass:
    # Some comments
    # Method 1 comments
    # Method 2 comments

    def method1(self):
        pass

    def method2(self):
        pass

You should write

class SomeClass:
    """
    Some comments about the class
    """

    def method1(self):
        """
        Method 1 comments
        """
        pass

    def method2(self):
        """
        Method 2 comments
        """
        pass

Notice the use of triple quotes """ Docstring Here """ for docstrings instead of simple Python comments # Some Comment
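One practical reason for the distinction, shown in the small sketch below: Python stores a docstring on the object's `__doc__` attribute (which `help()` reads), whereas a `#` comment is discarded. The class mirrors the `SomeClass` example above, with `method2` deliberately left with only a comment for contrast.

```python
class SomeClass:
    """Some comments about the class."""

    # A plain comment: ignored at runtime and invisible to help().
    def method1(self):
        """Method 1 comments."""
        pass

    def method2(self):
        # Only a comment here, so no docstring is recorded.
        pass


class_doc = SomeClass.__doc__
method1_doc = SomeClass.method1.__doc__
method2_doc = SomeClass.method2.__doc__

print(class_doc)    # Some comments about the class.
print(method1_doc)  # Method 1 comments.
print(method2_doc)  # None
```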