Closed Vrindagupta6828 closed 4 years ago
I want to work on it @HarshCasper
Hello @Vrindagupta6828
Thanks for raising this issue. We want to know specifically what sort of preprocessing you want to work on. Will you be writing scripts for it, or will you be showcasing a Jupyter Notebook that shows how preprocessing is done?
Hello @harshcasper I will be showcasing a Jupyter Notebook to show how preprocessing is done before applying any ML model.
I would like to have suggestions from @VijayaGB98 and @ricardoprins here about these Issues.
Well, this is such a complex topic, and so rich in possibilities, that I find it highly unlikely that it can be contained in a small file.
@Vrindagupta6828 how familiar are you with basic ML concepts? Would you be interested in helping us in another Tesseract Coding project related to this topic?
@Vrindagupta6828
Data preprocessing depends on the data and the type of data used. How will you incorporate all the possibilities in a single Jupyter Notebook? Data preprocessing is also domain dependent. For example, with histopathology images, stain normalization is applied depending on whether or not the dataset is evenly stained.
The preprocessing will also depend on which ML model you are using. Feature scaling is important in, say, K-NN and Neural Networks, but not required in, say, Decision Trees. Additionally, this may change depending on the type of task (Regression/Classification) and whether regularization is used.
There are too many variables to be accounted for if you want to make a single Jupyter Notebook. My suggestion would be to allow contributions to previously existing notebooks, where individuals can add sections for data preprocessing if it is not already available.
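To make the feature-scaling point above concrete, here is a minimal, standard-library-only sketch. It is not code from this repository: the toy data, the `standardize`, `nearest`, and `split` helpers, and the threshold of 40 are all hypothetical and chosen only for illustration. It suggests why a distance-based method such as K-NN can pick a different neighbor once features are standardized, while a single threshold split (the building block of a decision tree) partitions the rows the same way either way.

```python
import math
from statistics import mean, pstdev

# Hypothetical toy data: [age in years, income in dollars]
X = [[25, 50_000], [30, 52_000], [60, 51_000]]
query = [58, 50_000]

def standardize(rows, reference):
    """Z-score each column of `rows` using the mean/std of `reference`."""
    cols = list(zip(*reference))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]

def nearest(points, q):
    """Index of the point closest to q under Euclidean distance (1-NN)."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], q))

def split(rows, col, threshold):
    """Tree-style split: which rows satisfy rows[col] <= threshold."""
    return [row[col] <= threshold for row in rows]

# 1-NN on raw features: income dominates the distance because of its scale.
raw_nn = nearest(X, query)

# 1-NN after standardizing both features with the training statistics.
X_scaled = standardize(X, X)
query_scaled = standardize([query], X)[0]
scaled_nn = nearest(X_scaled, query_scaled)

# A threshold split on age gives the same partition before and after scaling,
# as long as the threshold is transformed the same way.
raw_split = split(X, 0, 40)
scaled_threshold = standardize([[40, 0]], X)[0][0]
scaled_split = split(X_scaled, 0, scaled_threshold)

print(raw_nn, scaled_nn)          # 0 2
print(raw_split == scaled_split)  # True
```

On the raw data the 58-year-old query is matched to the 25-year-old because the income column dominates the distance; after standardization it is matched to the 60-year-old. The split on age is unaffected, which is in line with the comment that trees do not need scaling.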
@ricardoprins I can.
@VijayaGB98 OK, I get it.
Algorithm: Data preprocessing in ML
DS:
Languages Supported: Python