Train / evaluate MobileNetV3 / ShuffleNet V2 on LiDAR data

jwkanggist commented 4 years ago

Goal: train and evaluate MobileNetV3 / ShuffleNet V2 using the data that Yonghee provides.

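Required materials: @hayleyshim

Training data
Evaluation data

Code to investigate (if you already have code you use, please leave it here) @yesolyun

MobileNetV3 training code and evaluation code (Yesol) @H-YURA
ShuffleNet V2 training code and evaluation code (Yura)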

hayleyshim commented 4 years ago

The data is large (12 GB), so I am sharing it through a Google Drive link.

Data: https://drive.google.com/drive/u/0/folders/1O-GNNhk2_gCAKXa_kJoQzc6fuV502Q2K

Reference: https://github.com/BichenWuUCB/squeezeDet

It looks like the random_split_train_val.py file in that repository can be used to split the data.
That repository supports CNN models pretrained for ImageNet classification (4 models: SqueezeDet, SqueezeDet+, VGG16+ConvDet, ResNet50+ConvDet).
It does not support MobileNetV3, so we will have to find an implementation separately and apply it.
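
A random train/validation split of this kind can be sketched as below. This is not the contents of random_split_train_val.py; it is a minimal stand-alone illustration, and the function name `split_train_val`, the default ratio, the fixed seed, and the zero-padded IDs are all assumptions made for the example.

```python
import random


def split_train_val(sample_ids, val_ratio=0.5, seed=0):
    """Shuffle sample IDs reproducibly and split them into (train, val)."""
    if not 0.0 <= val_ratio <= 1.0:
        raise ValueError("val_ratio must be in [0, 1]")
    ids = sorted(sample_ids)       # sort first so the result does not depend on input order
    rng = random.Random(seed)      # local RNG, leaves the global random state untouched
    rng.shuffle(ids)
    n_val = int(round(len(ids) * val_ratio))
    val = sorted(ids[:n_val])
    train = sorted(ids[n_val:])
    return train, val


if __name__ == "__main__":
    # Hypothetical zero-padded image IDs
    ids = [f"{i:06d}" for i in range(10)]
    train_ids, val_ids = split_train_val(ids, val_ratio=0.2)
    print(len(train_ids), "train /", len(val_ids), "val")
```

Fixing the seed keeps the split identical across runs, so scores from different models are measured on the same validation images.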

jwkanggist commented 4 years ago

Thank you for the data, Yonghee.

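1) First, look at the data and work out the input / output shapes.
2) Study the architecture of each model and write it up as a table (see the example image below).
3) Work out which parts need to change to fit the LiDAR data.
4) Find open-source code for each model (MobileNetV3 / ShuffleNet V2), clone it, and run it as-is in your existing environment first.
5) Modify the code according to what you worked out in step 3.
6) Run training on the LiDAR data.
7) Write the evaluation code and check the trained model's score on the evaluation set.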
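
For step 2, the output-shape column of such a table can be computed rather than filled in by hand. The sketch below is only an illustration: the layer list is a hypothetical stem with placeholder values, not the real MobileNetV3 or ShuffleNet V2 configuration, and the 224x224 input is an assumption. It uses the standard formula out = floor((in + 2*padding - kernel) / stride) + 1.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a conv/pool layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def architecture_table(input_hw, layers):
    """Return rows of (name, kernel, stride, out_channels, (H, W)), one per layer."""
    h, w = input_hw
    rows = []
    for name, kernel, stride, padding, channels in layers:
        h = conv_out(h, kernel, stride, padding)
        w = conv_out(w, kernel, stride, padding)
        rows.append((name, kernel, stride, channels, (h, w)))
    return rows


# Hypothetical layers with placeholder values; replace with the numbers from each paper.
# Fields: (name, kernel, stride, padding, output channels)
EXAMPLE_LAYERS = [
    ("conv1", 3, 2, 1, 24),
    ("maxpool", 3, 2, 1, 24),
    ("stage2", 3, 2, 1, 116),
]

if __name__ == "__main__":
    print(f"{'layer':<10}{'k':>3}{'s':>3}{'C_out':>7}  output (H x W)")
    for name, k, s, c, (h, w) in architecture_table((224, 224), EXAMPLE_LAYERS):
        print(f"{name:<10}{k:>3}{s:>3}{c:>7}  {h} x {w}")
```

Passing a non-square `input_hw` shows directly how each stage behaves on rectangular data, which feeds into step 3.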

Example table

If you get stuck or have questions while working through the steps above, leave me a comment on this issue. I will check every day after work.

Good luck. :-)

@yesolyun @H-YURA

hayleyshim commented 4 years ago

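After reading the mentor's comment, I think the dataset described in the link below makes it easier to understand the input/output shape of the data than the existing dataset does, so I am sharing it. Please take a look.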

Dataset input/output shape: http://yizhouwang.net/blog/2018/07/29/train-yolov2-kitti/

*Excerpt from the link:

Download data and labels Download KITTI object 2D left color images of object data set (12 GB) and submit your email address to get the download link. Download training labels of object data set (5 MB). Unzip them to your customized directory and .

Why is KITTI difficult to train on YOLO? Many people tried to train YOLOv2 with KITTI dataset but often get really poor performance. This is a typical result of YOLOv2 detection without doing any modification. This is a YOLOv2 trained on 3 classes of KITTI dataset.

Why does YOLOv2 perform bad on KITTI unlike other datasets? After review the basic properties of KITTI, we can find that the shape of the images is really wide: 1224×370. However, the default input shape of YOLOv2 is 416×416. After this kind of resizing, the bbox of the object would because really thin, and probably result in the bad performance. Moreover, the sizes of the objects in KITTI could be various. Some of the objects could be too small to be detected.
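
The distortion described in the excerpt can be checked with simple arithmetic. The sketch below uses the two image sizes quoted above; the helper names and the 100x100 px example box are assumptions made for illustration. Resizing 1224x370 to 416x416 scales x by about 0.34 and y by about 1.12, so a square 100x100 px object ends up roughly 34 px wide and 112 px tall.

```python
def resize_box(box, src_wh, dst_wh):
    """Scale an (x_min, y_min, x_max, y_max) box from a src-sized image to a dst-sized one."""
    sx = dst_wh[0] / src_wh[0]
    sy = dst_wh[1] / src_wh[1]
    x_min, y_min, x_max, y_max = box
    return (x_min * sx, y_min * sy, x_max * sx, y_max * sy)


def aspect_ratio(box):
    """Width divided by height of a box."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) / (y_max - y_min)


if __name__ == "__main__":
    src = (1224, 370)   # KITTI image size (width, height) as quoted in the excerpt
    dst = (416, 416)    # default YOLOv2 input size as quoted in the excerpt
    box = (600.0, 150.0, 700.0, 250.0)   # hypothetical 100x100 px object
    resized = resize_box(box, src, dst)
    print("scale x:", dst[0] / src[0], "scale y:", dst[1] / src[1])
    print("aspect ratio before:", aspect_ratio(box), "after:", aspect_ratio(resized))
```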

yura1h commented 4 years ago

ShuffleNet model summary

1) Input data used in the paper: ImageNet, 224x224
2) The reduction in complexity is compensated for by increasing the number of channels
3) At the same complexity (e.g. 140), the more channels there are (g=8), the lower the error

- Please let me know if I have misunderstood anything :)

Things to consider when using KITTI data

1) As the link Yonghee posted points out, the data we use is 1224*370 in size, and this has to be taken into account
2) The target board we will run the model on may not support ShuffleNet -> needs to be checked

Task

I plan to run the "ShuffleDet" code, which takes a ShuffleNet-based model and trains it on KITTI data (stage=8).

Reference: https://github.com/XJTUWYD/ShuffleDet

jwkanggist commented 4 years ago

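The data you are using is not square. There are two options:
1) Interpolate at the front of the input pipeline to match the input shape (tell me which TensorFlow version you use and we can look for the API you need).
2) Modify the model so that it accepts a rectangular input shape.

Option 1 is the easier one, but it may degrade performance. Option 2 means modifying the model piece by piece, and the filter kernel sizes will probably need adjusting as well.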

@H-YURA

jwkanggist commented 4 years ago

For interpolation, you can use the resize methods in tf.image.

Of course, please check for version compatibility issues.

https://www.tensorflow.org/api_docs/python/tf/compat/v1/image/resize_bicubic

jwkanggist commented 4 years ago

Looking at the KITTI data loader now, it seems the tf.image resize methods cannot be used.

The data pipeline is written in numpy, so it looks like cv2 will be needed.
