Round 3 comments, suggestions and questions

rfazeli commented 3 years ago

Great work, Kiran! You now have a solid project that you've completed from start to finish.

Here is my feedback so far:

General suggestions

I suggest looking at the following style guide to improve the docstrings of your functions and classes (if you have any): https://google.github.io/styleguide/pyguide.html#s3.8.3-functions-and-methods. You've already included a description for each function, which is great. You could also describe each function's inputs and outputs the way the guide does. This is just a suggestion, so do it only if you have extra time. Here is a simple example:
```
def mul(x, y):
    """Multiplies two numbers together.

    Args:
        x (float): First number.
        y (float): Second number.

    Returns:
        float: The product of x and y.
    """
    return x * y
```
It'd be good to add something like the snippet below at the end of training.py to replicate the training procedure from your Training.ipynb notebook, using only the best-performing model. That way, running `python training.py` would load and process the data, train a new model, and save it automatically.
```
if __name__ == "__main__":
    # Replicate the training procedure from your notebook here:
    # load and process the data, train the best model, and save it.
    ...
```
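
To make the idea a bit more concrete, here is a minimal sketch of how such an entry point could be laid out. Everything in it is an assumption for illustration: the function names (`load_and_process`, `train`, `save_model`, `main`), the made-up demo rows, and the output path are hypothetical, and the "model" is only a majority-label placeholder standing in for the best-performing model from the notebook.

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def load_and_process(rows):
    """Placeholder for the notebook's data loading and cleaning steps."""
    return [(text.strip().lower(), label) for text, label in rows]


def train(dataset):
    """Placeholder 'model' that only remembers the most common label.

    Swap in the best-performing model from the notebook here.
    """
    labels = Counter(label for _, label in dataset)
    return {"majority_label": labels.most_common(1)[0][0]}


def save_model(model, path):
    """Pickle the trained model so it can be loaded again later."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(model, f)
    return path


def main(rows, model_path):
    """Run the whole procedure: process the data, train, and save."""
    dataset = load_and_process(rows)
    model = train(dataset)
    return save_model(model, model_path)


if __name__ == "__main__":
    # Tiny made-up dataset so the sketch runs end to end.
    demo_rows = [
        ("Some claim", "fake"),
        ("Another claim", "real"),
        ("A third claim", "fake"),
    ]
    # Hypothetical output location; a real script would use a project path.
    out_path = Path(tempfile.gettempdir()) / "fake_news_demo" / "model.pkl"
    print(f"Saved model to {main(demo_rows, out_path)}")
```

Keeping the steps in small functions and calling them from `main()` means the notebook can import and reuse the same functions instead of duplicating them.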

Specific Comments

The following line should be updated to `from src.prediction import get_predictions`; otherwise it will throw an error: https://github.com/kiranrawat/Detecting-Fake-News-On-Social-Media/blob/51884bbefe341ea773cf038db9437d34793119f2/app.py#L11

You can remove the imports you're not using in app.py. These are:

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from flask import url_for
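
If you want to double-check for leftovers like these yourself, a rough sketch using only the standard library's `ast` module is below. The function name `unused_imports` and the sample source are made up for illustration (the sample is not the real app.py), and the check is deliberately simple: it only looks at whether an imported name appears anywhere else in the file. A linter such as pyflakes does this more thoroughly.

```python
import ast


def unused_imports(source):
    """Return the sorted names that are imported but never referenced."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the name `a`; `import a as x` binds `x`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            # Covers plain uses and the base of attribute access (pd.DataFrame).
            used.add(node.id)
    return sorted(imported - used)


# Made-up sample source, loosely modelled on a small Flask app.
sample = """
import pandas as pd
from flask import Flask, url_for
from sklearn.feature_extraction.text import CountVectorizer

app = Flask(__name__)
"""

print(unused_imports(sample))
```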

Instead of this line, you can use something more generic that would work on any machine (this comment applies to your other notebooks as well): https://github.com/kiranrawat/Detecting-Fake-News-On-Social-Media/blob/51884bbefe341ea773cf038db9437d34793119f2/notebooks/Training.py#L37

For example:

import os
import sys
from pyprojroot import here
sys.path.append(os.path.join(here(), 'src'))

You just need to install the package that I've used: pip install pyprojroot
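
If you'd rather not add a dependency, roughly the same effect can be had with the standard library alone. The sketch below is an assumption about how that might look, not what pyprojroot does internally: the helper names `find_project_root` and `add_src_to_path` are hypothetical, and it treats the first ancestor directory that contains a `src` folder as the project root.

```python
import sys
from pathlib import Path


def find_project_root(start, marker="src"):
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(f"No directory containing {marker!r} at or above {start}")


def add_src_to_path(start):
    """Append <project root>/src to sys.path once and return it as a string."""
    src_dir = str(find_project_root(start) / "src")
    if src_dir not in sys.path:
        sys.path.append(src_dir)
    return src_dir
```

In a notebook you could call `add_src_to_path(Path.cwd())`, and in a script `add_src_to_path(__file__)`, so nothing depends on an absolute path from one machine.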

It's better to use `import cleaning` here instead, so it's clear which functions you're using from that module: https://github.com/kiranrawat/Detecting-Fake-News-On-Social-Media/blob/51884bbefe341ea773cf038db9437d34793119f2/notebooks/data_cleaning.py#L17

Questions

1. Is there a reason you're returning the input query in uppercase? https://github.com/kiranrawat/Detecting-Fake-News-On-Social-Media/blob/51884bbefe341ea773cf038db9437d34793119f2/app.py#L35
2. It seems like you're importing process_text() but not using it anywhere. Why is that? I think you should be processing/cleaning your statements before converting them to features. https://github.com/kiranrawat/Detecting-Fake-News-On-Social-Media/blob/51884bbefe341ea773cf038db9437d34793119f2/notebooks/Training.py#L38

Also, I think you need to use process_text() in app.py as well to process/clean the input text.
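
To illustrate why the same cleaning step should run in both places, here is a small sketch. The body of `process_text()` below is a guess for demonstration purposes only (lowercasing, stripping punctuation, collapsing whitespace), not the implementation in your repo, and `build_training_tokens` and `prepare_query` are hypothetical helper names.

```python
import re
import string


def process_text(text):
    """Hypothetical cleaner: lowercase, drop punctuation, collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def build_training_tokens(statements):
    """Training side: clean every statement before extracting features."""
    return [process_text(statement).split() for statement in statements]


def prepare_query(query):
    """App side: apply the exact same cleaning to the user's input."""
    return process_text(query).split()


raw = "  BREAKING: Aliens   landed!! "
# The same raw text yields the same tokens on both paths.
print(build_training_tokens([raw])[0] == prepare_query(raw))
```

If the app skips the cleaning, the model is asked about text that looks different from anything it saw during training, which can hurt predictions.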

kiranrawat commented 3 years ago

Thanks, Reza, for the great feedback. Here are my answers.

Question 1: No, there isn't any specific reason for it. I'm just displaying the query on the results.html page, but it's better to remove the uppercasing and keep the query in its original form.

Question 2: Thanks for the catch. I don't know why I missed this essential step. I'll add it.

kiranrawat commented 3 years ago

I've made all the modifications, including the docstrings, and pushed the code.
