abhisheks008 / DL-Simplified

Deep Learning Simplified is an open-source repository containing beginner- to advanced-level deep learning projects for contributors who want to start their journey in deep learning. Devfolio URL: https://devfolio.co/projects/deep-learning-simplified-f013
https://quine.sh/repo/abhisheks008-DL-Simplified-499023976
MIT License
355 stars 298 forks

Website Classification #627

Closed: manishh12 closed 4 months ago

manishh12 commented 4 months ago

Pull Request for DL-Simplified

Issue Title : Website Classification

Closes: #606

Describe the add-ons or changes you've made 📃

The aim is to classify URLs into predefined categories such as adult content, arts, business, computers, games, health, home, kids, news, recreation, reference, science, shopping, society, or sports. The project involves preprocessing the dataset, visualizing the distribution of categories, training the models, and evaluating their performance.
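The preprocessing step described above could look roughly like the following. This is a minimal sketch, not the code from the PR: it assumes character-level encoding of each URL, and the helper names (`build_char_vocab`, `encode_url`, `encode_labels`), the category spellings, and the sample URLs are all hypothetical.

```python
from collections import Counter

# The 15 categories named in the description (exact spellings are assumed).
CATEGORIES = [
    "adult", "arts", "business", "computers", "games", "health", "home",
    "kids", "news", "recreation", "reference", "science", "shopping",
    "society", "sports",
]

PAD_ID, UNK_ID = 0, 1  # reserved ids for padding and unseen characters


def build_char_vocab(urls):
    """Map every character seen in the training URLs to an integer id >= 2."""
    chars = sorted({ch for url in urls for ch in url.lower()})
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode_url(url, vocab, max_len):
    """Turn a URL into a fixed-length list of ids (truncate, then right-pad)."""
    ids = [vocab.get(ch, UNK_ID) for ch in url.lower()[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


def encode_labels(labels):
    """Convert category names to integer class indices."""
    index = {name: i for i, name in enumerate(CATEGORIES)}
    return [index[name] for name in labels]


# Hypothetical rows standing in for the real dataset.
rows = [
    ("http://www.example-news.com", "news"),
    ("http://www.example-shop.com/cart", "shopping"),
    ("http://www.example-daily.com", "news"),
]
urls = [url for url, _ in rows]
labels = [label for _, label in rows]

vocab = build_char_vocab(urls)
X = [encode_url(url, vocab, max_len=40) for url in urls]
y = encode_labels(labels)

# Category distribution, the same counts a bar chart would visualize.
distribution = Counter(labels)
```

The integer sequences in `X` are the kind of input an embedding layer expects, and `distribution` gives the per-category counts to plot.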

  1. Implemented the CNN model architecture, including embedding, convolutional, max pooling, flatten, dropout, and dense layers.
  2. Trained the CNN model using the provided dataset and evaluated its performance.
  3. Plotted the training and validation loss values as well as the training and validation accuracy values for the CNN model.
  4. Defined the BiLSTM model architecture, consisting of embedding, bidirectional LSTM, and dense layers.
  5. Compiled and trained the BiLSTM model on the dataset.
  6. Generated plots showing the accuracy and loss curves for the BiLSTM model.
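The two architectures listed above can be sketched as follows. This assumes the Keras API bundled with TensorFlow, which the PR description does not name; the layer sizes, dropout rate, optimizer, and the function names `build_cnn` and `build_bilstm` are illustrative assumptions, not the values used in the PR.

```python
from tensorflow.keras import layers, models


def build_cnn(vocab_size, max_len, num_classes):
    """Embedding -> Conv1D -> MaxPooling -> Flatten -> Dropout -> Dense."""
    model = models.Sequential([
        layers.Input(shape=(max_len,)),
        layers.Embedding(vocab_size, 64),          # assumed embedding size
        layers.Conv1D(128, 5, activation="relu"),  # assumed filters and kernel
        layers.MaxPooling1D(2),
        layers.Flatten(),
        layers.Dropout(0.5),                       # assumed dropout rate
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # labels are integer class ids
        metrics=["accuracy"],
    )
    return model


def build_bilstm(vocab_size, max_len, num_classes):
    """Embedding -> Bidirectional LSTM -> Dense."""
    model = models.Sequential([
        layers.Input(shape=(max_len,)),
        layers.Embedding(vocab_size, 64),
        layers.Bidirectional(layers.LSTM(64)),     # assumed number of units
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# 15 output classes, matching the categories in the description.
cnn = build_cnn(vocab_size=100, max_len=50, num_classes=15)
bilstm = build_bilstm(vocab_size=100, max_len=50, num_classes=15)
```

Calling `model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=...)` on either model returns a `History` object whose `history` dict holds the per-epoch loss and accuracy values that the training and validation curves are plotted from.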

The changes add detailed descriptions of the CNN and BiLSTM architectures, train and evaluate both models, and visualize training progress with loss and accuracy plots. They also explain why the models were trained for fewer epochs (resource constraints), which may lower the achieved accuracy.

Due to resource constraints, including limited GPU allocation, frequent runtime disconnects across multiple accounts, and the substantial size of the dataset, I could only implement two models: CNN and BiLSTM. Additionally, I was only able to train these models for a limited number of epochs.

Type of change ☑️

What sort of change have you made:

Checklist: ☑️

github-actions[bot] commented 4 months ago

Our team will soon review your PR. Thanks @manishh12 :)