miladrezazadeh / twitter_depression_detection

This app detects depressive characteristics in a given tweet and classifies it as depressive or non-depressive.
MIT License

Code Review - Round 1 #1

Closed rfazeli closed 2 years ago

rfazeli commented 2 years ago

Overview

Great work so far. The explanations in your Emotional_Intensity.ipynb and Depressive-tweet-search.ipynb notebooks are especially helpful. Generally, adding explanations for each section in a notebook makes it easy to follow and shows your knowledge of the theory. It will also make it easy to gather these notes later and turn them into an article (if we want to write about this project and publish it).

Feedback

  1. Usually, you don't add your data to git. Instead, you can add a link (e.g. Google Drive) for downloading the data or add a little script for downloading the data from its source. But in this case, I think we can skip this as the data is relatively small.
  2. You should put all your data in a data/ folder and your notebooks in a notebooks/ folder. Later we will refactor some of the code in your notebooks into Python scripts and you can add the scripts in a scripts/ or src/ folder.
  3. If there is any information on the external datasets, you can create a README.md file under that folder (e.g. Emotional_Intensity) and add the info to it.
  4. Use a virtual environment and create a requirements.txt file.
  5. Create a .gitignore file and add .DS_Store and other files or folders you don't want to add to git.
  6. Use a consistent naming convention for your file and folder names. Using all lowercase letters separated by _ is very common. Also, make sure your .ipynb files and their corresponding .py files have the same name.

So your folder structure would look like this:

├── data
│   ├── processed
│   │   └── ...
│   └── raw
│       ├── external
│       └── scraped
├── notebooks
│   ├── data_gathering_twint.ipynb
│   ├── data_gathering_twint.py
│   └── ...
├── src
│   └── ...
├── .gitignore
├── README.md
└── requirements.txt
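
If it helps, here is a minimal Python sketch that creates the layout above in one go. The function name `scaffold` and the extra .gitignore entries beyond .DS_Store are my own assumptions, not something the repo already has, so adjust them as needed.

```python
from pathlib import Path

# Folders taken from the layout above.
DIRS = [
    "data/processed",
    "data/raw/external",
    "data/raw/scraped",
    "notebooks",
    "src",
]

# .DS_Store comes from the feedback list; the other entries are
# common additions and only a suggestion (assumption).
GITIGNORE_LINES = [".DS_Store", ".ipynb_checkpoints/", "__pycache__/", "venv/"]


def scaffold(root: Path) -> None:
    """Create the folders and starter files under root, never overwriting."""
    for d in DIRS:
        (root / d).mkdir(parents=True, exist_ok=True)

    starter_files = {
        ".gitignore": "\n".join(GITIGNORE_LINES) + "\n",
        "README.md": "",
        "requirements.txt": "",
    }
    for name, text in starter_files.items():
        path = root / name
        if not path.exists():  # keep any file that is already there
            path.write_text(text)
```

Calling `scaffold(Path("."))` from the repo root would add the missing folders and empty starter files without touching anything that already exists. Note that git does not track empty folders, so they only show up in the repo once they contain a file.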

Questions

  1. The .csv files under Data_Scrapping_API/ seem to be empty. Why is that?
  2. The general_tweets.csv file seems to be an HTML file, not a CSV. Right?
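
A quick way to check both questions yourself is sketched below. It is only a rough heuristic that looks at the first bytes of a file; the helper name `describe_file` and the 512-byte window are assumptions of mine.

```python
from pathlib import Path


def describe_file(path: Path) -> str:
    """Return 'empty', 'html', or 'other' based on the file's first bytes."""
    if path.stat().st_size == 0:
        return "empty"
    with path.open("rb") as f:
        # Only the first 512 bytes are inspected; leading whitespace is ignored.
        head = f.read(512).lstrip().lower()
    if not head:
        return "empty"  # nothing but whitespace at the start of the file
    if head.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "other"
```

For example, looping over `Path("Data_Scrapping_API").glob("*.csv")` and printing `describe_file(p)` for each path would list which files are empty and which are really HTML pages saved with a .csv extension.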

rfazeli commented 2 years ago

You can also change the project name to twitter_depression_detection to stay consistent with naming conventions.

rfazeli commented 2 years ago

You can also clean up your notebooks a bit. For example, you can pass the -qqq flag to pip install to suppress its lengthy output.

miladrezazadeh commented 2 years ago

Thanks Reza for your comments! I have tried to implement them in the following order:

  1. I can link to the data on Google Drive if our dataset gets larger later on.
  2. I have changed the structure of the repo; it should look better now as we are adding more to it.
  3. The Emotional Intensity folder has been deleted since it is not the focus of our work.
  4. I will use a virtual env at the end, when creating the whole script as .py files, as we discussed in the meeting.
  5. .gitignore has been added to the repo.
  6. I have renamed the files that were not consistent.

rfazeli commented 2 years ago

Awesome! Great work!

rfazeli commented 2 years ago

Feel free to close this issue once you think you've addressed all comments.