Task 13 - Create a 2,500 word report on how we collaborated #15

Open chrismintz1 opened 4 weeks ago

chrismintz1 commented 3 weeks ago

Here are some notes to get this kicked off.

chrismintz1 commented 3 weeks ago

@antoniaagunbiade, I put some initial comments here about how we collaborated.

chrismintz1 commented 3 weeks ago

Hey @antoniaagunbiade, I noticed the section called "Model Comparison and Selection". I take that to mean we compare what we chose against what we could have used?

Could you do some research on the strengths and weaknesses of the models we could have chosen, and then highlight why we went with XGBoost for Task 1? Compare it to logistic regression, random forests, or artificial neural networks.
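
If it helps, here's a rough sketch of the kind of side-by-side comparison I mean (the digits dataset is just a stand-in for our Task 1 data, and it assumes scikit-learn and xgboost are installed):

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from xgboost import XGBClassifier

# Placeholder dataset; for the report this would be our Task 1 data.
X, y = load_digits(return_X_y=True)

# The candidate models we would discuss in "Model Comparison and Selection".
candidates = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "random forest": RandomForestClassifier(n_estimators=200),
    "artificial neural network": MLPClassifier(max_iter=2000),
    "xgboost": XGBClassifier(),
}

# 5-fold cross-validated accuracy gives us concrete numbers for the write-up.
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```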

For Task 2, we can use that article on PyTorch vs TensorFlow: Johns, Ray (2024). PyTorch vs TensorFlow for Your Python Deep Learning Project. Available at: https://realpython.com/pytorch-vs-tensorflow/ [Accessed 5 Aug 2024].

antoniaagunbiade commented 3 weeks ago

Hey @chrismintz1, I've had a look at the brief for the report on Canvas. It wants us to compare the models developed for Tasks 1 & 2, their performance, and their results. So I've started discussing XGBoost in more detail, covering its performance and results. Once Task 2 is complete, I'll do the same for that model and then compare the accuracy scores, explaining how and why one model performed better. I've briefly mentioned the reasoning behind choosing XGBoost in Task 1, but this section gives us the chance to go into more depth.

chrismintz1 commented 3 weeks ago

OK @antoniaagunbiade, that makes sense. At this point we're committed to TensorFlow for our Project 2 approach, so you can do the comparison between a CNN and a gradient boosted decision tree. I learned an interesting thing about the terminology: the framework we are using for Project 2 is TensorFlow, and Keras is just an API layer on top of it. In fact, Keras can sit on top of several frameworks to make them more accessible, including PyTorch.
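
As a side note, here's a minimal sketch of what that layering looks like in practice, assuming the multi-backend Keras 3 release is installed; the tiny model is just a throwaway example:

```python
# Keras is an API layer that can run on different backends.
# The backend has to be chosen before keras is imported.
import os
os.environ["KERAS_BACKEND"] = "tensorflow"  # or "torch" / "jax"

import keras

# The same model definition works regardless of the backend underneath.
model = keras.Sequential([
    keras.layers.Input(shape=(28, 28, 1)),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),
])

print(keras.config.backend())  # confirms which backend is active
```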

chrismintz1 commented 3 weeks ago

One of our failed approaches to dealing with the fact that our training data came in triples was to add a region-proposal step (in the spirit of an R-CNN's Region Proposal Network) to our initial data pre-processing. The idea was to identify the Regions of Interest so we could feed them into the classification model. However, because we used an existing TensorFlow SSD MobileNet V2 model ("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2"), it not only identified the regions but also tried to assign a class to each of them. We felt this defeated the purpose of the activity and dropped the approach. @antoniaagunbiade
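
For the report, this is roughly how we called it; a from-memory sketch where the image shape and variable names are placeholders:

```python
import tensorflow as tf
import tensorflow_hub as hub

# Load the pre-trained detector from TF Hub.
detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")

# The model expects a batched uint8 image tensor of shape [1, H, W, 3];
# a zero tensor stands in for one of our triple images here.
image = tf.zeros([1, 320, 320, 3], dtype=tf.uint8)

results = detector(image)
boxes = results["detection_boxes"]      # the region proposals we actually wanted
classes = results["detection_classes"]  # COCO class labels we did NOT want assigned
scores = results["detection_scores"]    # confidence per detection
```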

chrismintz1 commented 3 weeks ago

Results

As part of our model we tuned the layers in the sequential CNN model, which included adding dropout layers and expanding the sequential Dense layers. Our final layer is a Dense layer with 10 nodes representing our classes 0-9.
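
For reference, this is roughly the shape of that model; the filter counts, Dense widths, and dropout rates below are illustrative placeholders rather than our final tuned values:

```python
import tensorflow as tf

# Sketch of the sequential CNN described above; layer sizes are illustrative.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),                     # dropout layer added while tuning
    tf.keras.layers.Dense(64, activation="relu"),     # expanded Dense layers
    tf.keras.layers.Dense(10, activation="softmax"),  # final layer: 10 nodes for classes 0-9
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```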

For hyper-parameter tuning and cross-validation with a TensorFlow model, I'll be using this automatic tuning tutorial: https://www.tensorflow.org/decision_forests/tutorials/automatic_tuning_colab
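
Along the lines of that tutorial, the tuning step would look something like this sketch; it assumes tensorflow_decision_forests is installed, and the CSV path and label column are placeholders:

```python
import pandas as pd
import tensorflow_decision_forests as tfdf

# Load training data into a pandas DataFrame (path and label column are placeholders).
train_df = pd.read_csv("train.csv")
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_df, label="label")

# Automatic tuning: random search over TF-DF's predefined hyper-parameter space.
tuner = tfdf.tuner.RandomSearch(num_trials=20, use_predefined_hps=True)
model = tfdf.keras.GradientBoostedTreesModel(tuner=tuner)
model.fit(train_ds)

model.summary()  # the summary includes the best hyper-parameters found
```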

For visualizing learning curves with a TensorFlow model I'm going to follow this: https://blog.finxter.com/5-best-ways-to-visualize-tensorflow-training-results-using-python/ covering TensorBoard and Matplotlib (probably won't do many more than that). @antoniaagunbiade
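
Roughly what I have in mind; the tiny model and MNIST data are just placeholders so the snippet runs end-to-end, and the log directory is made up:

```python
import matplotlib.pyplot as plt
import tensorflow as tf

# Placeholder data and model; in the report this would be our actual CNN and dataset.
(x_train, y_train), (x_val, y_val) = tf.keras.datasets.mnist.load_data()
x_train, x_val = x_train / 255.0, x_val / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Log to TensorBoard while training, then plot the learning curves with Matplotlib.
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs/cnn")
history = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    epochs=5,
                    callbacks=[tensorboard_cb])

plt.plot(history.history["accuracy"], label="train accuracy")
plt.plot(history.history["val_accuracy"], label="validation accuracy")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.show()
# TensorBoard view: run `tensorboard --logdir logs/cnn` from a terminal.
```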