Open asmirn1 opened 3 years ago
Progress:
Evaluating performance of models using mAP and AP:
TODO:
To apply YOLO to mzML data, we first need to convert that data into overlapping images and the corresponding labels.
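As a rough illustration of the conversion step, here is a minimal sketch of cutting a 2-D intensity matrix into overlapping tiles and writing a label line for a peak box inside a tile. It assumes the mzML data has already been binned into a matrix (retention time along rows, m/z along columns). The function names, tile size, and overlap are hypothetical and not taken from the project code. The label line follows the usual YOLO text format of class index, box center, and box size, all normalized to the image size.

```python
import numpy as np


def overlapping_tiles(matrix, tile_size=64, overlap=16):
    """Yield (row_start, col_start, tile) for overlapping square tiles.

    `matrix` is assumed to be a 2-D intensity array
    (rows = retention time bins, columns = m/z bins).
    Tiles that run past the matrix edge are zero-padded.
    """
    step = tile_size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than tile_size")
    rows, cols = matrix.shape
    for r in range(0, max(rows - overlap, 1), step):
        for c in range(0, max(cols - overlap, 1), step):
            tile = np.zeros((tile_size, tile_size), dtype=matrix.dtype)
            chunk = matrix[r:r + tile_size, c:c + tile_size]
            tile[:chunk.shape[0], :chunk.shape[1]] = chunk
            yield r, c, tile


def to_yolo_label(box, tile_origin, tile_size, class_id=0):
    """Format a peak box as a YOLO label line relative to one tile.

    `box` is (row_min, col_min, row_max, col_max) in matrix coordinates;
    `tile_origin` is the (row_start, col_start) of the tile.
    """
    r0, c0 = tile_origin
    row_min, col_min, row_max, col_max = box
    x_center = ((col_min + col_max) / 2 - c0) / tile_size
    y_center = ((row_min + row_max) / 2 - r0) / tile_size
    width = (col_max - col_min) / tile_size
    height = (row_max - row_min) / tile_size
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

Because neighbouring tiles share `overlap` rows or columns, a peak that sits on a tile border is fully visible in at least one tile, provided the peak is narrower than the overlap.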
TODO:
Centroided: 2016-03-15_EP03_D11_cell-E2-2 on the study link
Status:
TODO:
We tried YOLO on ~50 images and it seems to be working.
Next steps:
TODO:
TODO:
Current issues:
Next time, take another look at calculated m/z and retention time ranges.
https://colab.research.google.com/drive/1Zfc0K-rSsAA366ymqeoTUNEbx3bATYP8?usp=sharing
This is the YOLO notebook.
We have fixed all the bugs in the image generation and trained YOLO again. The precision is 90.9%, the recall is 95.1%, and the mean average precision is 94.8%. We've got about 450 peaks after processing 1/5 of the DCSM data file.
TODO:
adap_3d_main.py: the input for this script should be a raw data file, a work directory, and a confidence threshold. Users should run it as follows (use the argparse package for this):
python adap_3d_main.py --file FILENAME --output CSV_FILENAME --work-directory FOLDER --confidence-threshold 25
The script should create intermediate data files in the work directory and output the peak table into the CSV file.
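A minimal sketch of how the command-line interface could be declared with argparse is shown here. Only the four flags come from the command in this thread. The help texts, the choice to make every flag required, and the `build_parser` / `main` names are assumptions, and the actual image generation and prediction steps are left as a placeholder comment.

```python
import argparse


def build_parser():
    """Build the argument parser for the (proposed) adap_3d_main.py script."""
    parser = argparse.ArgumentParser(
        description="Detect peaks in a raw data file and write a peak table."
    )
    parser.add_argument("--file", required=True,
                        help="path to the raw data file")
    parser.add_argument("--output", required=True,
                        help="path of the CSV file for the peak table")
    parser.add_argument("--work-directory", required=True,
                        help="folder for intermediate data files")
    parser.add_argument("--confidence-threshold", type=float, required=True,
                        help="confidence threshold for detected peaks")
    return parser


def main(argv=None):
    """Parse arguments; the processing steps themselves are not shown."""
    args = build_parser().parse_args(argv)
    # Placeholder: create intermediate files in args.work_directory,
    # run the detection, and write the peak table to args.output.
    return args
```

argparse turns the dashes in flag names into underscores, so the values are read as `args.work_directory` and `args.confidence_threshold`. In the real script, `main()` would be called under the usual `if __name__ == "__main__":` guard.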
Current results: we processed the "entire" DCMS raw data file.
One issue with the new algorithm is that it currently takes a very long time: about 6 hours to create images from a raw data file, and about 1.5 hours to perform the prediction.
To speed it up, we need to avoid creating thousands of images and saving them on disk. In order to do that, there are at least two options:
detect.py and possibly other classes to be able to read that object. The goal here is to create numpy arrays and feed them to YOLO without saving images on disk.
@jerrychen04
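One way to picture the in-memory approach is the sketch below, which keeps every image as a numpy array and hands batches straight to the model. It is only an outline under stated assumptions: `predict` is a hypothetical callable standing in for whatever wraps the YOLO model (it is not the real detect.py interface), intensities are assumed to be non-negative, and scaling each tile by its own maximum is an assumed normalization, not necessarily the one used for training.

```python
import numpy as np


def to_uint8_image(tile):
    """Scale a 2-D intensity tile to 0..255 and repeat it over 3 channels."""
    tile = np.asarray(tile, dtype=np.float64)
    peak = tile.max()
    if peak > 0:
        tile = tile / peak  # assumed per-tile normalization
    gray = (tile * 255).astype(np.uint8)
    return np.stack([gray, gray, gray], axis=-1)  # shape (H, W, 3)


def detect_in_memory(tiles, predict, batch_size=32):
    """Feed batches of in-memory images to `predict` and collect the results.

    `tiles` is any iterable of equally sized 2-D arrays.
    `predict` is a hypothetical callable that takes an array of shape
    (N, H, W, 3) and returns one result per image.
    """
    results = []
    batch = []
    for tile in tiles:
        batch.append(to_uint8_image(tile))
        if len(batch) == batch_size:
            results.extend(predict(np.stack(batch)))
            batch = []
    if batch:  # leftover images that did not fill a whole batch
        results.extend(predict(np.stack(batch)))
    return results
```

Since `tiles` can be a generator, only one batch of images needs to exist in memory at a time and nothing is written to disk.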
Look into OpenCV for computer vision algorithms