Summary Data Environment-Setup
This repository contains the code for a machine learning project that aims to detect spam emails using various techniques. The project consists of four Jupyter notebooks, and they should be run in the following order:
The dataset originally comes from the website of the Programming Languages Group at the University of Waterloo (click here). It contains 75,419 emails delivered to a particular server between April 2007 and July 2007. The data is also available in CSV format from here.
To run the notebooks, you need to set up two different environments, one for the baseline model and the other for the neural networks. Here's how you can create the environments:
Create a new conda environment named 'ensemble' for the baseline model:
conda create -n ensemble python=3.8 numpy pandas matplotlib seaborn statsmodels scikit-learn=0.24.1 jupyter jupyterlab
conda activate ensemble
For Windows users:
pip install xgboost==1.1.1 mlxtend
For Mac users:
conda install -c conda-forge xgboost=1.1.1 mlxtend
Optional - Creating a Jupyter kernel for this environment:
ipython kernel install --name "ensemble" --user
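Once the 'ensemble' environment is active, a quick check that the pinned versions were actually installed can save debugging time later. The following is a minimal sketch, not part of the original project: the helper name `check_pins` is hypothetical, and it assumes each package exposes standard distribution metadata that `importlib.metadata` (standard library in Python 3.8) can read.

```python
from importlib import metadata

# Pins taken from the install commands above.
EXPECTED_PINS = {"scikit-learn": "0.24.1", "xgboost": "1.1.1"}


def check_pins(expected):
    """Return {package: (expected_version, installed_version_or_None, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the active environment.
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report


# Print one status line per pinned package.
for name, (wanted, installed, ok) in check_pins(EXPECTED_PINS).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name}: expected {wanted}, found {installed} [{status}]")
```

Run it from a notebook cell or a Python prompt inside the activated environment. A `MISMATCH` line means the package is missing or the solver picked a different version.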
Create a new, empty environment named 'deeplearning'.
conda create -n deeplearning python=3.8
Activate the new environment.
conda activate deeplearning
Install all the basic packages we'll need (including jupyter notebook and lab).
conda install numpy=1.19.2 pandas=1.3.5 matplotlib jupyter jupyterlab pydot pillow seaborn
Note: you may see an initial frozen solve warning. Wait it out, and the packages should be installed using a flexible solve.
Install TensorFlow in this environment. Mac users should install TensorFlow 2.7.0:
conda install -c conda-forge tensorflow=2.7.0
Windows users should install TensorFlow 2.3.0:
conda install -c conda-forge tensorflow=2.3.0
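The platform-specific TensorFlow pins can also be selected programmatically, for example in a setup script. This is an illustrative sketch only: `tensorflow_install_command` is a hypothetical helper, and it covers just the two platforms these instructions mention. Any other platform raises an error rather than guessing a version.

```python
import platform

# Pins stated in the instructions above; other platforms are not covered there.
TENSORFLOW_PINS = {"Darwin": "2.7.0", "Windows": "2.3.0"}


def tensorflow_install_command(system=None):
    """Build the conda command for the given platform.

    `system` uses platform.system() naming: "Darwin" for macOS, "Windows" for Windows.
    """
    system = system or platform.system()
    version = TENSORFLOW_PINS.get(system)
    if version is None:
        raise ValueError(f"No TensorFlow pin is documented for platform {system!r}")
    return f"conda install -c conda-forge tensorflow={version}"


# Show the command for each documented platform.
for system in TENSORFLOW_PINS:
    print(system, "->", tensorflow_install_command(system))
```

Calling `tensorflow_install_command()` with no argument uses the current machine's platform.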
Install some more packages that we'll need in the TensorFlow Lecture.
conda install scikit-learn=0.24.1 nltk
conda install -c conda-forge gensim=3.8.3
Install PyTorch and TorchVision.
conda install -c pytorch pytorch=1.5.1 torchvision=0.6.1
Optional. Creating a Jupyter kernel for this environment (needed if you don't have nb_conda_kernels installed in your base environment):
ipython kernel install --name "deeplearning" --user
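After both environments are set up, it can be useful to confirm that the optional Jupyter kernels were registered. The sketch below parses the output of `jupyter kernelspec list --json`. The helper names are hypothetical, and it assumes the kernels were installed with the `ipython kernel install` commands shown above. Kernels discovered through nb_conda_kernels may be listed under different names and would be reported as missing here.

```python
import json
import subprocess

# Kernel names used in the `ipython kernel install` commands above.
EXPECTED_KERNELS = ("ensemble", "deeplearning")


def missing_kernels(kernelspec_json, expected=EXPECTED_KERNELS):
    """Return the expected kernel names absent from the kernelspec listing."""
    registered = json.loads(kernelspec_json).get("kernelspecs", {})
    return [name for name in expected if name not in registered]


def registered_kernels_json():
    """Run `jupyter kernelspec list --json` and return its raw JSON output."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Usage (requires `jupyter` on PATH):
#     print(missing_kernels(registered_kernels_json()))
```

An empty list means both kernels are available in the Jupyter kernel picker.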