philusnarh / pneumonia_project


Task 1: Exploratory Data Analysis & Preprocessing #1

Open philusnarh opened 5 years ago

philusnarh commented 5 years ago

Project Description:

Figure 1: Lung anatomy and gas exchange (Reproduced from Kaggle)

The top images in Figure 1 show how air passes through the bronchi and enters the tiny air sacs (alveoli) of the lungs, where gas exchange occurs. Here, oxygen diffuses from the alveoli into the capillaries, which carry it out of the lungs and to the rest of the body; carbon dioxide diffuses into the alveoli and is then exhaled. The bottom images show how the alveolar walls thicken and the alveoli fill with fluid and blood cells when a patient has pneumonia. A patient with pneumonia may have the following symptoms: fever (not always), an elevated white blood cell count, infiltration on the chest x-ray, etc.

The severity of this disease ranges from mild to fatal, particularly in infants and in people older than 65 years. Globally, pneumonia kills ≈ 1 million people every year, the majority of them children (< 5 years).

A chest x-ray is the best test for pneumonia diagnosis. However, reading x-ray images can be tricky and requires domain expertise and experience. It would, therefore, be convenient to train a computer to read these images and provide us with the results. This research proposes a supervised learning approach (an Artificial Intelligence (AI) technique) to train on and analyze chest x-ray images and then try to detect pneumonia. Here, we will investigate machine learning methods such as LR, MLP, SVM, KNN, PCA/t-SNE, RF, GNB, etc. to see which best identifies an anomaly in a chest x-ray.

philusnarh commented 5 years ago

TODOs:

  1. Practice the code in the link: https://nbviewer.jupyter.org/urls/dl.dropbox.com/s/wz6mai4z9sfbo64/pneumonia_detection_aims_research.ipynb

  2. Re-label the target variables from [0, 0, 1] to [0, 1, 2] in both the train and test data.

  3. Extract all data in the train set that have target labels.

  4. Flatten this data and store it row-wise in a .csv file together with the target labels: one row per image, holding the flattened pixel values followed by that image's target (e.g. im0 with target 1, im1 with target 2, im2 with target 0, im3 with target 0, etc.).

  5. Generate a pie chart showing the percentage of each gender in the train data.

  6. Create a stacked bar chart showing the type of x-ray for each gender.

  7. Plot a histogram showing the age distribution, by gender, of patients with pneumonia.
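
As a rough illustration of TODOs 2 to 4, here is a minimal sketch using only the Python standard library. It assumes the three-way labels can be derived from a per-image class name; the names `class_a`, `class_b` and `class_c` are placeholders (the issue only gives the mapping [0, 0, 1] to [0, 1, 2]), and the tiny 2x2 "images" are made up. With real data the pixels would come from the x-ray arrays in the linked notebook.

```python
import csv
import io

# Placeholder class names: the issue only says the binary targets [0, 0, 1]
# should become [0, 1, 2], so these keys are hypothetical.
CLASS_TO_LABEL = {"class_a": 0, "class_b": 1, "class_c": 2}


def keep_labelled(images, class_names):
    """TODO 3: keep only the images that actually have a class/target."""
    return [(img, name) for img, name in zip(images, class_names) if name is not None]


def flatten(image):
    """TODO 4: flatten a 2-D image (list of rows) into one row-major list."""
    return [pixel for row in image for pixel in row]


def write_rows(labelled, handle):
    """Write one CSV row per image: flattened pixels, then the integer label."""
    writer = csv.writer(handle)
    for image, name in labelled:
        writer.writerow(flatten(image) + [CLASS_TO_LABEL[name]])  # TODO 2


# Made-up 2x2 "images"; the last one has no label and is dropped.
images = [[[0, 1], [2, 3]], [[4, 5], [6, 7]], [[8, 9], [10, 11]], [[1, 1], [1, 1]]]
class_names = ["class_b", "class_c", "class_a", None]

labelled = keep_labelled(images, class_names)
buffer = io.StringIO()  # swap for open("train.csv", "w", newline="") to write a file
write_rows(labelled, buffer)
csv_text = buffer.getvalue()
```

Putting the label in the last column keeps every row the same length, so the file can be loaded later as a single feature matrix with the target split off the end.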
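
For TODOs 5 to 7, the sketch below separates the counting from the plotting. The field names (`gender`, `view`, `age`, `target`), the sample records, and the choice of `target == 1` as the pneumonia label are all assumptions for illustration; the real column names and label should be taken from the dataset. The plotting function assumes matplotlib is installed and imports it lazily, so the counting helpers work without it.

```python
from collections import Counter

# Made-up records; replace with rows read from the real patient metadata.
records = [
    {"gender": "M", "view": "PA", "age": 34, "target": 1},
    {"gender": "F", "view": "AP", "age": 61, "target": 0},
    {"gender": "M", "view": "AP", "age": 47, "target": 1},
    {"gender": "F", "view": "PA", "age": 58, "target": 1},
]


def gender_percentages(rows):
    """TODO 5: percentage of each gender."""
    counts = Counter(r["gender"] for r in rows)
    total = sum(counts.values())
    return {gender: 100.0 * n / total for gender, n in counts.items()}


def view_counts_by_gender(rows):
    """TODO 6: number of x-rays of each type, grouped by gender."""
    table = {}
    for r in rows:
        per_gender = table.setdefault(r["gender"], Counter())
        per_gender[r["view"]] += 1
    return {gender: dict(counts) for gender, counts in table.items()}


def pneumonia_ages_by_gender(rows, positive_label=1):
    """TODO 7: ages of patients with pneumonia (assumed label), by gender."""
    ages = {}
    for r in rows:
        if r["target"] == positive_label:
            ages.setdefault(r["gender"], []).append(r["age"])
    return ages


def plot_all(rows):
    """Draw the pie chart, stacked bar chart and histogram side by side."""
    import matplotlib.pyplot as plt  # lazy import: only needed for plotting

    fig, (ax_pie, ax_bar, ax_hist) = plt.subplots(1, 3, figsize=(15, 4))

    shares = gender_percentages(rows)
    ax_pie.pie(list(shares.values()), labels=list(shares.keys()), autopct="%1.1f%%")

    table = view_counts_by_gender(rows)
    genders = sorted(table)
    views = sorted({view for counts in table.values() for view in counts})
    bottoms = [0] * len(genders)
    for view in views:  # stack one x-ray type on top of the previous ones
        heights = [table[g].get(view, 0) for g in genders]
        ax_bar.bar(genders, heights, bottom=bottoms, label=view)
        bottoms = [b + h for b, h in zip(bottoms, heights)]
    ax_bar.legend()

    for gender, ages in pneumonia_ages_by_gender(rows).items():
        ax_hist.hist(ages, bins=10, alpha=0.5, label=gender)
    ax_hist.set_xlabel("age")
    ax_hist.legend()
    return fig
```

Calling `plot_all(records)` in a notebook cell would render all three charts; the helper functions can also be checked on their own before any plotting is done.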