insightfulaistrategiesofficial / Medical-Diagnosis-Assistance

Create AI tools to assist doctors in diagnosing diseases by analyzing medical images and patient data. Features include image analysis, patient data integration, predictive insights, and seamless integration with hospital systems, enhancing diagnostic accuracy and efficiency.
Apache License 2.0

Reviewing and updating current web scraping scripts and API integrations. Attempting to identify new data sources. #2

Open rishidur opened 2 weeks ago

rishidur commented 2 weeks ago

Reviewed Medical Image Datasets:

  1. NIH Clinical Center:

    Chest X-Ray Dataset (ChestX-ray14, the expanded release of ChestX-ray8): Contains 112,120 frontal-view X-ray images of 30,805 unique patients with 14 disease labels. Link - (https://www.nih.gov/)

  2. The Cancer Imaging Archive (TCIA):

    A large archive of medical images of cancer, accessible for public download. Includes CT, MRI, and other types of images. Link - (https://www.cancerimagingarchive.net/)

  3. LIDC-IDRI (Lung Image Database Consortium and Image Database Resource Initiative):

    Contains thoracic CT scans with lung lesions marked up and annotated by radiologists. Link - (https://www.cancerimagingarchive.net/collection/lidc-idri/)

  4. MIMIC-CXR (MIMIC Chest X-ray Database):

    A large dataset of de-identified chest radiographs from the Beth Israel Deaconess Medical Center. Link

  5. OASIS (Open Access Series of Imaging Studies):

    Provides brain MRI data from several imaging studies, including studies of Alzheimer's disease. Link - (https://www.oasis-brains.org/)

UTSAVS26 commented 2 weeks ago

@rishidur Thanks for compiling this list of medical image datasets! These sources look promising for enhancing our data pool. I suggest we consider the following steps to integrate these datasets:

  1. Evaluate Data Accessibility: Check the access policies for each dataset to ensure we can use the data within our project constraints.
  2. Assess Data Compatibility: Confirm that the data formats and types are compatible with our current systems and processes.
  3. Prioritize Integration: Based on our immediate needs, prioritize which datasets to integrate first. For example, the ChestX-ray8 and MIMIC-CXR datasets could be valuable for expanding our chest X-ray data collection.
  4. Update Scripts: Modify our existing scraping scripts and API integrations to include these new data sources, ensuring that we handle data ingestion and processing appropriately.
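
As a rough illustration of steps 1 to 3, here is a minimal Python sketch of a dataset catalog with an access flag, a format compatibility check, and a simple priority ordering. Everything in it is an assumption made for illustration rather than part of our current scripts: the `DatasetSource` class, `SUPPORTED_FORMATS`, `PRIORITY_MODALITIES`, and the format and access values filled in for each dataset would all need to be confirmed against each dataset's own documentation before we rely on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetSource:
    """One candidate data source (hypothetical structure for this sketch)."""
    name: str
    modality: str
    file_format: str  # assumed value; confirm against the dataset's documentation
    access: str       # "open" or "restricted"; assumed, confirm the access policy


# Formats our ingestion code can already read (hypothetical).
SUPPORTED_FORMATS = {"png", "dicom"}

# Modalities we want first (hypothetical; mirrors the chest X-ray example in step 3).
PRIORITY_MODALITIES = ("chest x-ray",)

# Candidate sources from the list above; format and access values are assumptions.
CANDIDATES = [
    DatasetSource("ChestX-ray8", "chest x-ray", "png", "open"),
    DatasetSource("TCIA", "mixed", "dicom", "open"),
    DatasetSource("LIDC-IDRI", "ct", "dicom", "open"),
    DatasetSource("MIMIC-CXR", "chest x-ray", "dicom", "restricted"),
    DatasetSource("OASIS", "mri", "nifti", "restricted"),
]


def is_compatible(source: DatasetSource) -> bool:
    """Step 2: keep only sources whose format we can already ingest."""
    return source.file_format in SUPPORTED_FORMATS


def integration_order(sources: list[DatasetSource]) -> list[DatasetSource]:
    """Step 3: priority modalities first, then open access before restricted."""
    compatible = [s for s in sources if is_compatible(s)]
    return sorted(
        compatible,
        key=lambda s: (
            s.modality not in PRIORITY_MODALITIES,  # False sorts before True
            s.access != "open",                     # open access sorts first
            s.name,                                 # stable, readable tie-break
        ),
    )


if __name__ == "__main__":
    for source in integration_order(CANDIDATES):
        print(f"{source.name}: {source.modality}, {source.file_format}, {source.access}")
```

Keeping the catalog as plain data means step 4 can loop over one list instead of hard-coding each source in the scraping scripts. Sources in an unsupported format are dropped from the ordering here, and whether to convert them instead is a separate decision.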