du-lab / Trace

Investigate object/boundary/edge detection ML algorithms #5

Open asmirn1 opened 3 years ago

asmirn1 commented 3 years ago

Look into OpenCV for computer vision algorithms

du-lab commented 3 years ago

Progress:

jerrychen04 commented 3 years ago

List of Algorithms:

jerrychen04 commented 3 years ago

Evaluating performance of models using mAP and AP:
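As a rough illustration of the metric (not the evaluation code used in this project): AP for one class is usually computed by sorting detections by confidence, marking each one as a true or false positive via IoU against the ground-truth boxes, and integrating the precision-recall curve. mAP is then the mean of AP over classes. The function names, the `(x1, y1, x2, y2)` box layout, and the simplified greedy matching below are assumptions made for this sketch.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, truths, iou_threshold=0.5):
    """AP for a single class.

    detections: list of (confidence, box); truths: list of boxes.
    """
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = set()
    tp_flags = []
    for _, box in detections:
        # Simplified matching: best still-unmatched ground-truth box.
        best_iou, best_idx = 0.0, None
        for idx, truth in enumerate(truths):
            if idx in matched:
                continue
            overlap = iou(box, truth)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    # Precision and recall after each detection, in confidence order.
    precisions, recalls = [], []
    tp = 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / rank)
        recalls.append(tp / len(truths) if truths else 0.0)

    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the resulting step curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

For example, with two ground-truth peaks and three detections of which the second is a false positive, `average_precision` returns 5/6. Benchmark tools differ in how they match detections and interpolate the curve, so numbers from this sketch may not match those reported by a YOLO training script exactly.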

asmirn1 commented 3 years ago

TODO:

To apply YOLO to mzML data, we first need to convert that data into overlapping images and the corresponding labels.
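A minimal sketch of that conversion, assuming the mzML data has already been binned into a 2D intensity matrix (rows = m/z bins, columns = scans) and that known peaks are available as index rectangles. The function names, tile size, stride, and single class id `0` are hypothetical. The label lines follow the usual YOLO text format `class x_center y_center width height`, with all four values normalized to the tile size.

```python
import numpy as np


def tile_origins(length, tile, stride):
    """Start offsets of overlapping windows covering [0, length)."""
    if length <= tile:
        return [0]
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] != length - tile:
        origins.append(length - tile)  # last window sits flush with the edge
    return origins


def make_tiles(matrix, peaks, tile=64, stride=32):
    """Yield (row0, col0, window, label_lines) for each overlapping window.

    matrix: 2D intensity array (assumed rows = m/z bins, columns = scans).
    peaks: list of (row_min, col_min, row_max, col_max), half-open indices.
    """
    n_rows, n_cols = matrix.shape
    for r0 in tile_origins(n_rows, tile, stride):
        for c0 in tile_origins(n_cols, tile, stride):
            window = matrix[r0:r0 + tile, c0:c0 + tile]
            h, w = window.shape
            labels = []
            for pr0, pc0, pr1, pc1 in peaks:
                # Keep only peaks that lie entirely inside this window.
                if pr0 >= r0 and pc0 >= c0 and pr1 <= r0 + h and pc1 <= c0 + w:
                    x_c = ((pc0 + pc1) / 2 - c0) / w
                    y_c = ((pr0 + pr1) / 2 - r0) / h
                    bw = (pc1 - pc0) / w
                    bh = (pr1 - pr0) / h
                    labels.append(f"0 {x_c:.6f} {y_c:.6f} {bw:.6f} {bh:.6f}")
            yield r0, c0, window, labels
```

Because the stride is smaller than the tile size, a peak cut by one window's border appears whole in a neighbouring window. The cost is that the same peak can be detected several times, so predictions need de-duplication after they are mapped back to matrix coordinates.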

[Image: Whiteboard 4-01]

asmirn1 commented 3 years ago

TODO:

jerrychen04 commented 3 years ago

Profiled: https://drive.google.com/file/d/1bH_LAk3_2amxwpljrvoAU6GxyQZgAc3k/view?usp=sharing

jerrychen04 commented 3 years ago

Centroided: 2016-03-15_EP03_D11_cell-E2-2 on the study link

du-lab commented 3 years ago

Status:

TODO:

asmirn1 commented 3 years ago

We tried YOLO on ~50 images and it seems to be working.

Next steps:

asmirn1 commented 3 years ago

TODO:

asmirn1 commented 3 years ago

TODO:

asmirn1 commented 3 years ago

Current issues:

Next time, take another look at calculated m/z and retention time ranges.
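For reference when checking those ranges, here is a small sketch of the mapping from a predicted box back to m/z and retention time. It assumes each image covers a known m/z interval and retention time interval, that both axes are linear, that x is retention time, and that m/z increases with y. None of these conventions are confirmed by the image generation code, and `box_to_ranges` is a hypothetical helper.

```python
def box_to_ranges(box, tile_mz, tile_rt):
    """Map a normalized YOLO box to m/z and retention time ranges.

    box: (x_center, y_center, width, height), each in [0, 1].
    tile_mz, tile_rt: (min, max) covered by the image (assumed linear axes,
    x = retention time, y = m/z with m/z increasing along y).
    """
    x_c, y_c, w, h = box
    mz_min, mz_max = tile_mz
    rt_min, rt_max = tile_rt
    mz_span = mz_max - mz_min
    rt_span = rt_max - rt_min
    mz_range = (mz_min + (y_c - h / 2) * mz_span,
                mz_min + (y_c + h / 2) * mz_span)
    rt_range = (rt_min + (x_c - w / 2) * rt_span,
                rt_min + (x_c + w / 2) * rt_span)
    return mz_range, rt_range
```

If the images are drawn with m/z increasing upward (row 0 at the top holding the highest m/z), `y_c` has to be replaced by `1 - y_c` before the conversion. A flipped axis of this kind is one common source of wrong ranges.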

jerrychen04 commented 3 years ago

https://www.quora.com/How-can-YOLO-compute-the-confidence-score-at-test-time-They-say-they-compute-it-as-P-object-IOU-But-during-test-time-you-dont-have-the-ground-truth-boxes-How-is-it-possible

du-lab commented 3 years ago

https://colab.research.google.com/drive/1Zfc0K-rSsAA366ymqeoTUNEbx3bATYP8?usp=sharing

This is the YOLO notebook.

asmirn1 commented 3 years ago

We have fixed all the bugs in the image generation and trained YOLO again. The precision is 90.9%, the recall is 95.1%, and the mean average precision is 94.8%. We got about 450 peaks after processing 1/5 of the DCSM data file.

TODO:

asmirn1 commented 3 years ago

Current results: we processed the "entire" DCMS raw data file.

One issue with the new algorithm is that it currently takes a very long time: about 6 hours to create images from a raw data file, and about 1.5 hours to perform the prediction.

To speed it up, we need to avoid creating thousands of images and saving them on disk. There are at least three options:

  1. Use parallel processing when generating images and writing them on disk.
  2. Instead of saving each individual image on disk, create a single Python object containing all the images (as numpy arrays) and save that object on disk instead. Then, we'll need to modify detect.py and possibly other classes to be able to read that object. The goal here is to create numpy arrays and feed them to YOLO without saving images on disk.
  3. Write a YOLO neural network from scratch that works directly with our numpy arrays.
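
A minimal sketch of option 2, assuming all tiles have the same shape and fit in memory. The function names, the `.npz` layout, and the `(N, 1, H, W)` batch shape are assumptions for illustration. How detect.py would consume these batches depends on the YOLO implementation and is not shown.

```python
import numpy as np


def save_tiles(path, tiles, origins):
    """Stack equally sized 2D tiles into one array and save it with their origins.

    path should end in ".npz"; origins holds one (row0, col0) pair per tile.
    """
    np.savez_compressed(
        path,
        tiles=np.stack(tiles).astype(np.float32),
        origins=np.asarray(origins, dtype=np.int64),
    )


def iter_batches(path, batch_size=16):
    """Yield (origins, batch) pairs; batch has shape (N, 1, H, W)."""
    with np.load(path) as data:
        tiles, origins = data["tiles"], data["origins"]
    for start in range(0, len(tiles), batch_size):
        stop = start + batch_size
        # Add a channel axis so each tile looks like a one-channel image.
        yield origins[start:stop], tiles[start:stop, np.newaxis, :, :]
```

This replaces thousands of small image files with one compressed archive. Keeping the origins next to the tiles lets each prediction be mapped back to its position in the raw data. Option 1 can be combined with this by generating the tiles in parallel before stacking them.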

asmirn1 commented 3 years ago

@jerrychen04