mlsanigeria / AI-Hacktober-MLSA

Contributing to cutting-edge open-source projects in Machine Learning hosted by MLSA Nigeria
https://naija-translator.streamlit.app
MIT License
33 stars 67 forks

📸 Image Collection for Project 1 - Building Type Classification #49

Open Sammybams opened 11 months ago

Sammybams commented 11 months ago

Project Overview

In Project 1, we are working on creating a robust image classification system that can accurately identify different types of buildings, including bungalows, storey buildings, and high-rise buildings.

Our goal is to build a machine learning model that excels at classifying these buildings based on images contributed by our community of contributors.

What's Needed

We need your help in collecting images of these building types to train and test our classification model. Specifically, we are looking for images that meet the following criteria:

How You Can Contribute

  1. Capture Images: Take clear photos of the building types (bungalows, storey buildings, and high-rise buildings) in portrait format.

  2. Image Naming: Please name your images uniquely. You can also choose to indicate the type of building and any relevant information about the image location if available.

  3. Contribution: Navigate to the data folder for Project 1 and upload your images under the folder for the matching building type. Each contributor should upload at least 10 images in total.

  4. Quality Check: Ensure that the images are of good quality and clearly depict the building type.

Example Image Names:

Let's Build an Accurate Building Classifier Together!

Your contributions will play a crucial role in training our machine learning model to accurately classify buildings. By sharing images of these building types, you're helping us create a more inclusive and effective solution.

Thank you for your valuable contributions! 🏙️📸

Note: If you have any questions or need assistance with image uploads, feel free to ask in the comments. Let's make this project a success!

ayoni02 commented 11 months ago

Hello, while asking us to take pictures is a good way to gather data, wouldn't it be much easier to mine data from search engines like Google or DuckDuckGo, or even take it from housing sites like Jiji.ng?

Here is an example: https://www.realestatedatabase.net/FindAHouse/Houselist.aspx?RentSale=Sale+Price&HouseCategory=1&Title=Bungalow+for+sale#RED256

Also, can I use other ML tools to train the model, e.g. fastai models? Here is an example

Sammybams commented 11 months ago

Thank you @ayoni02. While this method might be faster, your model can only be as good as your data. Pictures from listing sites such as Jiji are usually heavily watermarked, and many of the pictures on Google have clean (white) backgrounds, which do not reflect the images the model will see when it is tested on photos taken with a phone or any other camera. In most cases this mismatch would cause data drift.

Also, the essence of this repo is to enable contributors to make pull requests for the simplest of things to get them started with open source. Scraping data can be worked on as an extra feature, but currently the main procedure for data collection is to collect organically taken images of buildings, so that the model can be applied to a real-world scenario.

ayoni02 commented 11 months ago

What if I find an image that is clean enough but wasn't taken by me? Can I submit it?

Also, you didn't answer my second question about using fastai instead of Keras.

Sammybams commented 11 months ago

Thank you @ayoni02. It is totally fine if they are clean enough. We also want to maintain an aspect ratio of 3:4. We can't train a good model with images of different aspect ratios; even if we could, it would involve advanced techniques that are not as beginner-friendly as we planned.

So if you find an image you think is clean enough, check its pixel dimensions to see whether the aspect ratio is 3:4. If it is, then you can make that contribution. Thank you.
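For anyone who wants to script the 3:4 check rather than do the arithmetic by hand, here is a minimal sketch. The function name `is_portrait_3_by_4` and the 1% tolerance are assumptions made for illustration, not something the repo defines.

```python
def is_portrait_3_by_4(width: int, height: int, tolerance: float = 0.01) -> bool:
    """Return True if width:height is approximately 3:4 (portrait).

    `tolerance` is a hypothetical allowance for dimensions that are a
    pixel or two off an exact 3:4 ratio.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return abs(width / height - 3 / 4) <= tolerance


# Pixel dimensions as shown in a file's properties or an image viewer.
print(is_portrait_3_by_4(3000, 4000))  # True: portrait 3:4
print(is_portrait_3_by_4(4000, 3000))  # False: landscape 4:3
```

If Pillow is installed, `Image.open(path).size` returns a `(width, height)` tuple that can be passed straight into this function.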

As for using fastai, it is welcome. However, the goal of this project (image classification) is to use pure ML techniques with no under-the-hood APIs.

Dantochi commented 11 months ago

Can I work on this issue?

Sammybams commented 11 months ago

Yes, definitely @Dantochi. Thank you very much.

michaelbabajide commented 11 months ago

While uploading images, I noticed that some of the images are not named correctly. Is it okay if I rename the ones that are not?

Sammybams commented 11 months ago

Oh okay. Actually, there is no strict naming convention for the images. They should just be unique so that your pull request won't conflict with images already in the repo that have the same name.

So there is really no need for that. Thank you very much @ayoni02.
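One possible way to get unique names without coordinating with other contributors is to derive the name from the image contents. This is only a sketch: the helper `unique_image_name`, the optional building-type prefix, and the `.jpg` fallback extension are all assumptions, not a convention this repo requires.

```python
import hashlib
from pathlib import Path


def unique_image_name(image_bytes: bytes, original_name: str, building_type: str = "") -> str:
    """Build a file name from a hash of the image contents.

    Two different images get different names, so a pull request is
    unlikely to clash with a file already in the repo.
    """
    digest = hashlib.sha256(image_bytes).hexdigest()[:12]
    # Keep the original extension; fall back to .jpg (an assumption).
    extension = Path(original_name).suffix.lower() or ".jpg"
    prefix = f"{building_type}_" if building_type else ""
    return f"{prefix}{digest}{extension}"


# Example with stand-in bytes; in practice use Path(photo).read_bytes().
print(unique_image_name(b"example image bytes", "IMG_0001.JPG", "bungalow"))
```

Because the name depends only on the file contents, uploading the same photo twice produces the same name, which also makes accidental duplicates easy to spot.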

michaelbabajide commented 11 months ago

alright

anushka9555 commented 11 months ago

Hi @Sammybams, can I work on this issue?

ayoni02 commented 11 months ago

Yes, I think anyone can

Sammybams commented 11 months ago

Thank you @ayoni02. Yes @anushka9555, anyone can work on it, so you can.

Captain-Tee01 commented 11 months ago

Hi @Sammybams, I would like to work on this issue.

Sammybams commented 11 months ago

Hello @Captain-Tee01, this issue is completely open for contributions so you can go ahead and work on it.