Add support for more hyperparameters

danellecline commented 1 year ago

This refers to the process and ecsprocess commands

Per the request in the google doc https://docs.google.com/spreadsheets/d/1YLM649lFmUq5k9pZVVy9aNK-Rf2YWD95iiTi1MWcznM/edit#gid=0 line item 11

When kicking off a job, be sure that all hyperparameters below are available and that they work. For example, ensure that we can turn on agnostic-nms, adjust conf, and iou-threshold. Here’s a list of flags we might want to be able to use/modify per run:

parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s.pt', help='model path or triton URL')
parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
parser.add_argument('--conf-thres', type=float, default=0.25, help='confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
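To make the behavior of these flags concrete, here is a minimal, self-contained sketch that wraps the listed `add_argument` calls in a parser and parses one example argument list. `ROOT` is an assumption standing in for the repository root used by the original script, and `build_parser` is a hypothetical name chosen for illustration.

```python
import argparse
from pathlib import Path

# Assumption: ROOT stands in for the repository root used by the original script.
ROOT = Path(".")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags listed in this issue (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s.pt', help='model path or triton URL')
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
    parser.add_argument('--conf-thres', type=float, default=0.25, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
    parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
    parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
    parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
    return parser


# Parse an explicit list so the example does not depend on sys.argv.
opts = build_parser().parse_args(
    ["--agnostic-nms", "--classes", "0", "2", "3", "--imgsz", "1280",
     "--iou-thres=0.1", "--conf-thres=0.5"]
)
print(opts.agnostic_nms, opts.classes, opts.imgsz, opts.iou_thres, opts.conf_thres)
```

Note that argparse stores dashed flag names with underscores (`--conf-thres` becomes `opts.conf_thres`) and accepts both `--flag value` and `--flag=value`.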

### Tasks
- [x] process command: Test weights (different models in different S3 buckets) @duane-edgington
- [x] process command: Test iou-thres, conf-thres, imgsz @duane-edgington
- [x] process command: Test --agnostic-nms @duane-edgington
- [x] process command: Test classes parameter @duane-edgington
- [x] ecsprocess command: Test weights (different sized models in different clusters). Tested yolov5 640 and 1280. Caveat: --save-vid exceeds the EBS (Elastic Block Store) allocation and is not supported @duane-edgington
- [x] ecsprocess command: Test iou-thres, conf-thres, imgsz @duane-edgington
- [x] ecsprocess command: Test --agnostic-nms @duane-edgington
- [x] ecsprocess command: Test classes parameter @duane-edgington
- [x] Add pytest for process command to execute upon check-in (dry-run mode) @danellecline
- [x] Add pytest for ecsprocess command to execute upon check-in (dry-run mode) @danellecline
- [x] Add pytest to run process in AWS and verify output (exclude in github actions) @danellecline
- [x] Add pytest to run ecsprocess in AWS and verify output (exclude in github actions) @danellecline

danellecline commented 1 year ago

@duane-edgington I'm adding support for passing the arguments in with a single option called --args. This simplifies testing and reduces the room for errors. For example:

--args "--agnostic-nms --classes 0 2 3 --imgsz 1280 --iou-thres=0.1 --conf-thres=0.5"

So a test run will look like

deepsea-ai process \
--job "Ventana Dive 4263 with 0.5 conf iou .1 with new mbari315k model" \
--input /Users/dcline/Dropbox/code/deepsea-ai/tests/data/ \
--input-s3 s3://902005-dev-test/ \
--output-s3 s3://902005-dev-test-out/ \
--instance-type ml.g4dn.xlarge \
--config /Users/dcline/Dropbox/code/deepsea-ai/deepsea_ai/config/config.ini \
--args "--agnostic-nms --classes 0 2 3 --imgsz 1280 --iou-thres=0.1 --conf-thres=0.5"
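Because everything inside --args travels as one quoted string, a mistyped flag is easy to miss. The following is a minimal sketch of splitting such a string into tokens and flagging long options written with a single dash. `split_args` and `suspicious_tokens` are hypothetical helper names, not part of deepsea-ai, and how the tool itself handles the string is not shown here.

```python
import shlex
from typing import List


def split_args(args: str) -> List[str]:
    """Split the value of --args into tokens the way a POSIX shell would."""
    return shlex.split(args)


def suspicious_tokens(tokens: List[str]) -> List[str]:
    """Return tokens that look like long options written with a single dash."""
    bad = []
    for tok in tokens:
        name = tok.split("=", 1)[0]  # drop any "=value" suffix
        if name.startswith("-") and not name.startswith("--") and len(name) > 2:
            try:
                float(name)  # negative numbers such as -0.5 are legitimate values
            except ValueError:
                bad.append(tok)
    return bad


tokens = split_args("--agnostic-nms --classes 0 2 3 --imgsz 1280 --iou-thres=0.1 --conf-thres=0.5")
print(tokens)
print(suspicious_tokens(tokens))
```

A check like this could run before a job is submitted, so a typo fails fast on the launching machine rather than inside the processing container.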

danellecline commented 1 year ago

@duane-edgington

We are testing on the branch called addhyperparameters in the 902005-vaa-dev account. I'm just about done with the process command changes, but there is more work to do to test the ecsprocess command.

To set up, you need to run with that branch in deepsea-ai and update the docker image reference in your config.ini to this line:

strongsort_ecr = mbari/strongsort-yolov5:addhyperparameters
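If you prefer to script the change rather than edit the file by hand, Python's `configparser` can rewrite the key. This is only a sketch: the section name `docker` and the starting tag `latest` are assumptions made for illustration, so substitute whichever section of your config.ini actually holds `strongsort_ecr`.

```python
import configparser

# Assumption: the section name "docker" and the tag "latest" are hypothetical;
# check your own config.ini for the real section that holds strongsort_ecr.
SAMPLE = """
[docker]
strongsort_ecr = mbari/strongsort-yolov5:latest
"""


def set_image(config_text: str, section: str, image: str) -> configparser.ConfigParser:
    """Return a parsed config with strongsort_ecr pointing at the given image."""
    cfg = configparser.ConfigParser()
    cfg.read_string(config_text)
    cfg.set(section, "strongsort_ecr", image)
    return cfg


cfg = set_image(SAMPLE, "docker", "mbari/strongsort-yolov5:addhyperparameters")
print(cfg.get("docker", "strongsort_ecr"))
```

To persist the change to a real file, read it with `cfg.read(path)` instead and write it back with `cfg.write(open_file)`.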

duane-edgington commented 1 year ago

Here is my first successful test run

deepsea-ai process \
--job "Ventana Dive 4263 with 0.5 conf iou .1 with new mbari315k model" \
--input /Users/duane/deepsea-ai/deepsea_ai/tests/data/ \
--input-s3 s3://902005-dev-test/ \
--output-s3 s3://902005-dev-test-out/ \
--instance-type ml.g4dn.xlarge \
--args "--agnostic-nms --classes 0 2 3 --imgsz 1280 --iou-thres 0.1 --conf-thres 0.5"

I have a test file, V4361_20211006T163256Z_h265_1sec.mp4, in /Users/duane/deepsea-ai/deepsea_ai/tests/data/ on the system from which I launch the job.

I could also use a test file in M3 with a command-line option similar to -i /Volumes/M3/master/i2MAP/2023/06/20230606/i2MAP_20230606T193622Z_1000m_F031_12.mov

This assumes, of course, that I have mounted titan/m3 on my system before running a job that needs to access the /Volumes mount.

danellecline commented 1 year ago

Testing weights

Several models are in the /Volumes/M3_ML directory to test.

The required format is a tar archive of the model together with its configuration in a file called custom_config.yaml, located either in the same directory as the model or in a subdirectory called "1" (for model version 1). One public model is available for download at s3://902005-public/models/yolov5x_mbay_benthic_model.tar.gz

This packaging is done automatically when a model is trained with the deepsea-ai train command, but to test manually you will need to package the models yourself.
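For manual packaging, a minimal sketch using Python's `tarfile` module is shown below. It covers only the first layout (model and custom_config.yaml side by side at the top of the archive). The weights file name `best.pt`, the archive name, and the helper `package_model` are hypothetical, and the placeholder file contents stand in for real weights and configuration.

```python
import tarfile
import tempfile
from pathlib import Path


def package_model(model_path: Path, config_path: Path, out_path: Path) -> Path:
    """Write the model and custom_config.yaml side by side into a .tar.gz archive."""
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(model_path, arcname=model_path.name)
        tar.add(config_path, arcname="custom_config.yaml")
    return out_path


# Demonstration with placeholder files; real weights and config replace these.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    model = tmp_dir / "best.pt"  # hypothetical weights file name
    config = tmp_dir / "custom_config.yaml"
    model.write_bytes(b"placeholder weights")
    config.write_text("# placeholder configuration\n")
    archive = package_model(model, config, tmp_dir / "model.tar.gz")
    with tarfile.open(archive) as tar:
        print(sorted(tar.getnames()))
```

Passing `arcname` keeps the local directory structure out of the archive, so the two files land next to each other regardless of where they live on disk.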

duane-edgington commented 1 year ago

I have run several tests with the process command. I have exercised:

- Different models in an S3 bucket (as a .tar.gz file)
- With and without --agnostic-nms
- With different image sizes (640 or 1280)
- With specific classes
- With different --iou-thres
- With different --conf-thres
- With different maximum numbers of detections (--max-det)
- With and without --save-vid

Here is another example script

deepsea-ai process \
--job "Ventana Dive 4263 with 0.1 conf iou .4 with default" \
--model-s3 s3://902005-public/models/yolov5x_mbay_benthic_model.tar.gz \
--input /Users/duane/deepsea-ai/deepsea_ai/tests/data/ \
--input-s3 s3://902005-dev-test/ \
--output-s3 s3://902005-dev-test-out/ \
--instance-type ml.g4dn.xlarge \
--args " --max-det 2 --save-vid --imgsz 640 --iou-thres 0.4 --conf-thres 0.1"

danellecline commented 1 year ago

@duane-edgington please do me a favor and add your tests to this file for reproducibility:

https://github.com/mbari-org/deepsea-ai/blob/addhyperparameters/tests/test_process.py

These tests should be excluded during check-in for now, but can be run manually with pytest.
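One way to keep such tests out of check-in runs while still allowing manual runs is a pytest skip marker keyed on the `GITHUB_ACTIONS` environment variable, which GitHub Actions sets to `true` on its runners. The sketch below is an assumption about how this could be done, not the contents of test_process.py. `build_process_command` is a hypothetical helper, and the test only checks the assembled command rather than launching a job.

```python
import os
import shlex

import pytest

# GitHub Actions sets GITHUB_ACTIONS=true on its runners.
IN_GITHUB_ACTIONS = os.environ.get("GITHUB_ACTIONS") == "true"


def build_process_command(job, input_dir, input_s3, output_s3, instance_type, args):
    """Assemble the CLI invocation as an argument list (hypothetical helper)."""
    return [
        "deepsea-ai", "process",
        "--job", job,
        "--input", input_dir,
        "--input-s3", input_s3,
        "--output-s3", output_s3,
        "--instance-type", instance_type,
        "--args", args,
    ]


@pytest.mark.skipif(IN_GITHUB_ACTIONS, reason="needs AWS access; run manually")
def test_process_agnostic_nms():
    cmd = build_process_command(
        "agnostic nms test",
        "tests/data/",
        "s3://902005-dev-test/",
        "s3://902005-dev-test-out/",
        "ml.g4dn.xlarge",
        "--agnostic-nms --imgsz 1280 --iou-thres 0.1 --conf-thres 0.5",
    )
    # A real test would run the command here, for example with
    # subprocess.run(cmd, check=True), and then verify the output in S3.
    assert "--agnostic-nms" in shlex.split(cmd[-1])
```

Run manually with `pytest tests/test_process.py`; on a GitHub Actions runner the marked test is reported as skipped.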

danellecline commented 1 year ago

@duane-edgington There are two clusters available for testing, named yv5 (640-sized model) and Megadetector (1280-sized model), e.g. deepsea-ai ecsprocess --cluster yv5. Please test the monitor command for each cluster as well, e.g. deepsea-ai monitor. I've tested most of this through pytest, but hand testing is welcome.

I am working on the local-only processing FastAPI service now, as it is the highest priority in the video lab spreadsheet. That service uses the same code that is used in the cloud workflow, packaged in either an ARM or AMD Docker image. We should now get consistent results no matter where we process our data, which is important.

I will leave this branch open while I roll that out in case any changes are needed.

danellecline commented 1 year ago

The behavior of deepsea-ai monitor --cluster yv5 has changed with this update. I replaced the simple JSON database with a SQLite database and now track the state of the video processing at a more granular level than before.
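To illustrate what per-video state tracking in SQLite can look like, here is a minimal sketch using the standard `sqlite3` module. The table name, columns, status values, and helper functions are all hypothetical, chosen for illustration only, and do not reflect the actual deepsea-ai schema.

```python
import sqlite3
from typing import Dict


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) jobs table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS jobs (
               video   TEXT PRIMARY KEY,
               cluster TEXT NOT NULL,
               status  TEXT NOT NULL,
               updated TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def set_status(conn: sqlite3.Connection, video: str, cluster: str, status: str) -> None:
    """Record the latest state for one video, replacing any earlier row."""
    conn.execute(
        "INSERT OR REPLACE INTO jobs (video, cluster, status) VALUES (?, ?, ?)",
        (video, cluster, status),
    )
    conn.commit()


def count_by_status(conn: sqlite3.Connection, cluster: str) -> Dict[str, int]:
    """Summarize how many videos in a cluster are in each state."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM jobs WHERE cluster = ? GROUP BY status",
        (cluster,),
    )
    return dict(rows.fetchall())


conn = open_db()
set_status(conn, "V4361_20211006T163256Z_h265_1sec.mp4", "yv5", "QUEUED")
set_status(conn, "V4361_20211006T163256Z_h265_1sec.mp4", "yv5", "RUNNING")
print(count_by_status(conn, "yv5"))
```

Compared with a single JSON file, a table keyed by video lets each state change be written as one small transaction and queried per cluster.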

mbari-org / deepsea-ai

Add support for more hyperparameters #22
