kiranrawat / Detecting-Fake-News-On-Social-Media

Flask web application that aims to predict fake news over social media using NLP and Machine Learning.

Round 3 comments, suggestions and questions #4

Closed rfazeli closed 3 years ago

rfazeli commented 3 years ago

Great work, Kiran! You now have a solid project that you've completed from start to finish.

Here is my feedback so far:

General suggestions

  1. I suggest you look at the following style guide for improving the docstring of your functions and classes (if you have classes): https://google.github.io/styleguide/pyguide.html#s3.8.3-functions-and-methods. You have already included a description for each function which is great. You can also describe the function inputs and the outputs similar to the guide. This is just a suggestion. Do this only if you have extra time. Here is a simple example as well:

    def mul(x, y):
        """Multiplies two numbers together.

        Args:
            x (float): First number.
            y (float): Second number.

        Returns:
            float: The product of x and y.
        """
        return x * y

  2. It'd be good to add something like the snippet below at the end of training.py to replicate the training procedure in your Training.ipynb notebook, but using only the best-performing model. That way, running python training.py would load and process the data, train a new model, and save it automatically.

    if __name__ == "__main__":
        # Replicate the training procedure in your notebook
        pass
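
As a rough illustration of how that entry point might be fleshed out, here is a minimal sketch. Everything in it is an assumption for illustration: load_data, train_model, save_model and main are hypothetical names, the "model" is a trivial majority-label placeholder rather than your best performing model, and the output goes to a temporary directory only so the sketch runs anywhere.

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def load_data():
    """Hypothetical loader: returns (statement, label) pairs.

    Replace with the loading and cleaning steps from the notebook.
    """
    return [
        ("aliens landed in ohio", "fake"),
        ("council approves budget", "real"),
        ("moon is made of cheese", "fake"),
    ]


def train_model(rows):
    """Placeholder trainer: just remembers the most common label.

    Swap in the best performing model from the notebook here.
    """
    labels = Counter(label for _, label in rows)
    return {"majority_label": labels.most_common(1)[0][0]}


def save_model(model, path):
    """Pickle the trained model, creating the parent folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(model, fh)
    return path


def main(output_path):
    """Load the data, train a model, and save it in one go."""
    rows = load_data()
    model = train_model(rows)
    return save_model(model, output_path)


if __name__ == "__main__":
    # Assumption: a temp-dir path is used here; point this at your models folder.
    saved = main(Path(tempfile.gettempdir()) / "fake_news_model.pkl")
    print(f"Saved model to {saved}")
```

Keeping the steps in small functions with a single main() means the notebook and the script can share the same code path.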

Specific Comments

  1. The following line should be updated to from src.prediction import get_predictions; otherwise it will throw an error. https://github.com/kiranrawat/Detecting-Fake-News-On-Social-Media/blob/51884bbefe341ea773cf038db9437d34793119f2/app.py#L11

  2. You can remove some of the packages you're not using in app.py. These are

    import pandas as pd
    from sklearn.feature_extraction.text import CountVectorizer
    from flask import url_for
  3. Instead of this line you can use something more generic that would work on any machine (This comment applies to your other notebooks as well) https://github.com/kiranrawat/Detecting-Fake-News-On-Social-Media/blob/51884bbefe341ea773cf038db9437d34793119f2/notebooks/Training.py#L37

For example:

    import os
    import sys
    from pyprojroot import here
    sys.path.append(os.path.join(here(), 'src'))

You just need to install the package that I've used: pip install pyprojroot

  4. It's better to use import cleaning instead so you'd know which function you're using from that module https://github.com/kiranrawat/Detecting-Fake-News-On-Social-Media/blob/51884bbefe341ea773cf038db9437d34793119f2/notebooks/data_cleaning.py#L17
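
To illustrate the point about module-qualified names, here is a small sketch. The cleaning module isn't available in a standalone snippet, so the standard library's string module stands in for it, and strip_punctuation is a hypothetical helper.

```python
# Assumption: the stdlib `string` module stands in for the project's cleaning module.
import string


def strip_punctuation(text):
    """Remove ASCII punctuation from text (hypothetical helper)."""
    # The `string.` prefix shows at a glance which module `punctuation` comes from.
    return text.translate(str.maketrans("", "", string.punctuation))
```

With a plain import, every call site names its module, so a reader never has to guess where a function was defined.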

Questions

  1. Is there a reason you're returning the input query in uppercase? https://github.com/kiranrawat/Detecting-Fake-News-On-Social-Media/blob/51884bbefe341ea773cf038db9437d34793119f2/app.py#L35
  2. It seems like you're importing process_text() but you're not using it anywhere. Why is that? I think you should be processing/cleaning your statements before converting them to features. https://github.com/kiranrawat/Detecting-Fake-News-On-Social-Media/blob/51884bbefe341ea773cf038db9437d34793119f2/notebooks/Training.py#L38

Also, I think you need to use process_text() in app.py as well to process/clean the input text.
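
A minimal sketch of why the same cleaning step matters in both places. The process_text below is a hypothetical stand-in (lowercasing, stripping punctuation, collapsing whitespace), not your actual implementation, and to_features uses a plain word count in place of a fitted vectorizer.

```python
import re
from collections import Counter


def process_text(text):
    """Hypothetical stand-in for the project's process_text()."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # replace punctuation with spaces
    return re.sub(r"\s+", " ", text).strip()  # collapse repeated whitespace


def to_features(text):
    """Bag-of-words counts; a stand-in for a fitted vectorizer."""
    # Cleaning happens here, so training and prediction share one code path.
    return Counter(process_text(text).split())


# Training side: statements are cleaned before becoming features.
train_features = [
    to_features(s) for s in ["BREAKING: Aliens landed!!", "Budget approved."]
]

# App side: the user's query goes through the same cleaning.
query_features = to_features("breaking   aliens landed")
```

Because both sides call process_text(), a query that differs only in case, punctuation, or spacing maps to the same features the model saw during training.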

kiranrawat commented 3 years ago

Thanks, Reza, for the great feedback. Here are my answers.

Questions 1. Ans: No, there isn't any specific reason for it. I am just displaying this query on the results.html page. But it's better to remove this and keep the original form.

Questions 2. Ans: Thanks for the catch. Don't know why I forgot this mandatory step. I'll add this.

kiranrawat commented 3 years ago

Added all the modifications including docstring. Pushed the code.