Refactor Project Structure for Model Modularity

Overview

Restructure the project to better modularize the different machine learning models (Random Forest, SVM, Decision Tree) within the Titanic Survival Prediction project. The goal is a clear, organized, and scalable codebase that is easy to navigate, makes adding new models straightforward, and allows flexible execution.

Current Structure

The current structure has a single entry point (main.py) with all model-related code within the src/ directory. This setup, while functional, can become cluttered as I add more models or functionalities.

Proposed Structure

The proposed structure introduces subdirectories within src/ for each model and uses command-line arguments in main.py to run different models. This enhances clarity and maintains a single entry point for the project.

New Project Tree

kaggle_titanic/
│
├── main.py                  # Entry point, handles command-line arguments
├── README.md
├── poetry.lock
├── pyproject.toml
│
├── data/                    # Data files
│
├── models/                  # Saved model files
│
├── outputs/                 # Output visualizations, results
│
├── src/                     # Source code
│   ├── __init__.py
│   ├── preprocess.py        # Common preprocessing code
│   ├── evaluate_model.py    # Model evaluation code
│   ├── features.py          # Feature engineering
│   │
│   ├── random_forest/       # RandomForest-specific code
│   │   ├── __init__.py
│   │   └── train.py
│   │
│   ├── decision_tree/       # DecisionTree-specific code
│   │   ├── __init__.py
│   │   ├── decision_tree.py
│   │   └── train.py
│   │
│   └── svm/                 # SVM-specific code
│       ├── __init__.py
│       └── train.py
├── submission.csv
├── tests/
└── user_passenger.py

main.py Modification

main.py will use Python's argparse library to handle command-line arguments, allowing users to specify which model to run.
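
A minimal sketch of how this dispatch could look. The `--model` flag name, the `RUNNERS` mapping, and the three `run_*` functions are assumptions for illustration, not the project's actual code; in the real main.py each runner would presumably call into the matching package under src/ (for example src/random_forest/train.py) rather than return a label.

```python
import argparse


# Hypothetical runner functions. Each one stands in for a call into the
# corresponding model package under src/; here they only return a label
# so the dispatch logic can be exercised on its own.
def run_random_forest():
    return "random_forest"


def run_decision_tree():
    return "decision_tree"


def run_svm():
    return "svm"


# Maps the model name given on the command line to its runner.
RUNNERS = {
    "random_forest": run_random_forest,
    "decision_tree": run_decision_tree,
    "svm": run_svm,
}


def build_parser():
    """Build the command-line parser for the single entry point."""
    parser = argparse.ArgumentParser(
        description="Titanic Survival Prediction"
    )
    # The flag name --model is an assumption; argparse rejects any value
    # that is not one of the registered runner names.
    parser.add_argument(
        "--model",
        choices=sorted(RUNNERS),
        required=True,
        help="which model to run",
    )
    return parser


def main(argv=None):
    """Parse arguments and run the selected model."""
    args = build_parser().parse_args(argv)
    return RUNNERS[args.model]()
```

Under these assumptions, the script would end by calling `main()`, and a run would look like `python main.py --model svm`. Adding a new model then means adding one package under src/ and one entry in `RUNNERS`.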

Benefits

Improved Clarity and Organization
Enhanced Scalability
User-friendly Execution
Clear Separation of Concerns

Tasks

[x] Create subdirectories within src for each model.
[x] Refactor main.py to include runner functions and command-line argument handling.
[ ] Update README.md and documentation to reflect the new project structure.
[x] Ensure existing functionalities are preserved and compatible with the new structure.

DeepBlockDeepak / kaggle_titanic