transfer learning - Githubissues

hyunas1996 commented 3 years ago

Hello, thanks for your yolact code.

I am trying to do transfer learning with my custom dataset, and I have two questions.

1) I want to freeze every layer except the last one and retrain only the last layer with my custom data. Is that what the training explanation in your markdown means?

2) If so, what is the role of start_iter?

I would be glad if you could answer my questions. Thanks :-)

zhangyingbit commented 3 years ago

Please see issue https://github.com/dbolya/yolact/issues/334. I hope it helps you :)

hyunas1996 commented 3 years ago

Oh, thank you so much! Following what you recommend in #334, I want to train only the last layer with my 300 custom images. How long do I need to train?

zhangyingbit commented 3 years ago

I think the training time depends on your GPU, hahaha~ My training data is about 2000 images and I use the resnet101 backbone; 200 epochs is enough for my situation~
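
To turn an epoch count like the one above into an iteration budget, a rough conversion can be sketched as follows. This is only an estimate: `epochs_to_iterations` is a hypothetical helper, it assumes one iteration is one batch, and it counts a final partial batch as a full iteration. Both batch sizes below are assumed values for illustration.

```python
import math


def epochs_to_iterations(num_images: int, batch_size: int, epochs: int) -> int:
    """Estimate total training iterations, assuming one iteration = one batch."""
    # A partial last batch still counts as one iteration in this sketch.
    iters_per_epoch = math.ceil(num_images / batch_size)
    return iters_per_epoch * epochs


# About 2000 images for 200 epochs, with an assumed batch size of 8.
large_run = epochs_to_iterations(2000, 8, 200)
# 300 images for the same 200 epochs, with an assumed batch size of 5.
small_run = epochs_to_iterations(300, 5, 200)
print(large_run, small_run)
```

A smaller dataset reaches the same number of epochs in far fewer iterations, so an iteration limit tuned for a large dataset is probably much longer than a small one needs.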

hyunas1996 commented 3 years ago

Thanks for your kind answers. This is the mAP result from training with my own data. I think my annotation format is correct, but I cannot figure out why the training is not making progress.

I started the training with this command.

python train.py --config=yolact_plus_SUNRGBD_config --resume=weights/yolact_plus_resnet50_54_800000.pth --start_iter=0 --batch_size=5

hyunas1996 commented 3 years ago

Actually I added

for p in net.parameters():
        print("freeze parameters")
        p.requires_grad = False

these three lines to train.py before net.train(), because I thought they would stop backpropagation. The result above is from this modified train.py.

So I removed these three lines from train.py and tried again after changing the line in yolact.py to p = pred_layer(pred_x.detach()).

Then the mAP looks like this.

There are two warnings when I start the training. The first one is about a size mismatch. I think this is because I am using Yolact++, so I am ignoring it. The second one is

/home/hyunaseo/yolact2/utils/augmentations.py:309: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  mode = random.choice(self.sample_options)

this one. Do I need to worry about this warning?

Is this right? I'm sorry for asking so many questions. ㅠㅠ
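
The warning text says NumPy was asked to build an array from a sequence whose elements have different shapes. Newer NumPy releases turn this deprecation into an error. A minimal sketch of one way to sidestep it is to draw a random index and index the sequence directly, so that no array is ever built from the ragged options. The option values below are made up for illustration and are not copied from augmentations.py.

```python
import numpy as np

# Illustrative ragged options (an assumption, not the repository's values):
# a mix of None and 2-tuples cannot form a regular ndarray.
sample_options = (None, (0.1, None), (0.3, None), (None, None))

rng = np.random.default_rng(0)

# Pick an index instead of passing the ragged tuple to a NumPy choice function.
idx = int(rng.integers(len(sample_options)))
mode = sample_options[idx]
print(idx, mode)
```

The selection stays uniform over the options, and the code behaves the same way regardless of how the installed NumPy version treats ragged sequences.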

zhangyingbit commented 3 years ago

@hyunas1996 Since I can't see your code, I can only guess at two possible reasons for your problem:

1. Try commenting out these three lines. I think they stop gradients from reaching any of the parameters, but I am not sure, so please try it and check the loss~

for p in net.parameters():
    print("freeze parameters")
    p.requires_grad = False

2. Please make sure the ground truth for your data is correct~

I also ran into the first warning you mentioned during training; it doesn't matter~ If you have any questions, please feel free to ask~
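
The difference between freezing every parameter and freezing all but the last layer can be sketched in plain PyTorch. This is a toy illustration under stated assumptions, not the YOLACT code: `ToyNet`, its `backbone` and `head` attributes, and the helper `freeze_all_but` are hypothetical names, and the real model's layer names are not given in this thread.

```python
import torch
from torch import nn


class ToyNet(nn.Module):
    """Toy stand-in for a detector (hypothetical, NOT the YOLACT model)."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.head = nn.Linear(16, 3)

    def forward(self, x):
        return self.head(self.backbone(x))


def freeze_all_but(model, trainable_prefix):
    """Freeze every parameter whose name does not start with the prefix."""
    for name, p in model.named_parameters():
        p.requires_grad = name.startswith(trainable_prefix)
    # Return only the parameters that should still be updated.
    return [p for p in model.parameters() if p.requires_grad]


torch.manual_seed(0)
net = ToyNet()
trainable = freeze_all_but(net, "head.")
optimizer = torch.optim.SGD(trainable, lr=0.1)

backbone_before = net.backbone[0].weight.detach().clone()
head_before = net.head.weight.detach().clone()

# One training step on random data.
x, target = torch.randn(4, 8), torch.randn(4, 3)
loss = nn.functional.mse_loss(net(x), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()

backbone_unchanged = torch.equal(backbone_before, net.backbone[0].weight.detach())
head_changed = not torch.equal(head_before, net.head.weight.detach())
print(backbone_unchanged, head_changed)
```

If `requires_grad` is set to `False` on every parameter, as in the three lines quoted above, nothing is left for the optimizer to update. Detaching the input of the last layer, as tried earlier in this thread, similarly stops gradients from reaching the layers that feed it.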

hyunas1996 commented 3 years ago

I commented out the three lines of code, and this is the result.

I wonder why the mask mAP is zero ㅠㅠ.

kiyoungKim6649 commented 3 years ago

My mAP is also coming out low right now and I'm struggling with it. Did you manage to solve this? From what I've read elsewhere, the mAP should apparently be above 90..

hyunas1996 commented 3 years ago

No, I haven't solved it yet ㅜㅜ (I'll let you know when I do.) I'm wondering whether the problem is that I only have 300 images of custom data ㅜㅜ What are your mAP and dataset size at the moment?

kiyoungKim6649 commented 3 years ago

I made 500 images with labelme and merged them into a single COCO dataset. With that, the mAP in the "all" row got up to about 20. Which command did you use for training?? I used the command below. It's not easy.. haha
python ./train.py --config=yolact_resnet50_smartfarm_config

hyunas1996 commented 3 years ago

I used the command below.
nohup python train.py --config=yolact_plus_SUNRGBD_config --resume=weights/yolact_plus_resnet50_54_800000.pth --start_iter=0 &

Hmm, my box mAP does climb to around 30 over time, but the mask mAP stays at 0 without changing ㅠㅜ

Is the mAP you mentioned for box or for mask? And how many iterations did you run? I've heard people often leave it running for about a day, haha.

kiyoungKim6649 commented 3 years ago

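In my case box reached 30 and mask reached about 20. But the values stopped changing over time, so I've been going through all the mAP-related posts here, and nobody seems to have clearly the same problem as me. I left my computer on over the weekend and trained for about three days, so I remember the iteration count being quite large. If it really is because the dataset is small, as you said, then I only need to add more data and it wouldn't be a big problem, but I can't tell what the actual issue is. ㅜㅜ.. haha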

hyunas1996 commented 3 years ago

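Wow, three whole days? Hahaha. Was it 0 at first, like mine? It does seem that with little data there is a limit to how far the mAP can be raised. I'm retraining because I only want to apply it to indoor instances, and I've now increased my data to 800 images! I'll train a bit longer. There may not be a dramatic difference compared with 300 images, but if it looks somewhat better I'll leave another comment!

If you find a solution, please let me know :-)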

kiyoungKim6649 commented 3 years ago

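Ah, I see! So having more data makes it easier to raise the mAP. If I learn anything, I'll post here again! Oh, and my values were also close to 0 at first, but after about 10 minutes the values from the 0.5 threshold upward started coming out at 10 or more. Reading the thread again carefully, though, it looks like you are using yolact++, while I'm using plain yolact, so things may be a bit different.. haha;; Anyway, if I find a solution I'll post here again.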

kiyoungKim6649 commented 3 years ago

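Looking at it again, this is how my mAP values came out. The values at 0.9 and 0.95 are almost 0, which is very different from the value at 0.5. Did you get values similar to mine? I can more or less understand 0.5 being higher, but 0.9 and 0.95 are so close to 0 that this worries me as well.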
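
In COCO-style evaluation, a detection counts as correct at threshold t only if its IoU with the ground truth is at least t, so even a small localization error is enough to fail the 0.9 and 0.95 thresholds. The sketch below illustrates this with boxes (mask IoU follows the same intersection-over-union idea). `box_iou` is a hypothetical helper and the coordinates are made up.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes get zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


gt = (0, 0, 100, 100)
pred = (5, 5, 105, 105)  # same size, shifted by 5 pixels in x and y

iou = box_iou(gt, pred)
# Which thresholds does this single prediction pass?
passed = {t: iou >= t for t in (0.5, 0.75, 0.9, 0.95)}
print(round(iou, 3), passed)
```

Here a shift of only 5% of the box size passes the 0.5 and 0.75 thresholds but fails at 0.9 and 0.95, which is one reason AP at the strictest thresholds tends to be much lower than AP at 0.5.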

hyunas1996 commented 3 years ago

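Oh, I see ㅜㅜ I haven't trained for long yet, so I need to keep watching, but the M term in my loss is far too large. When I looked into it, I read that this points to a problem with the mask annotations, so I need to look at that part more closely. Kiyoung, your loss seems to be within the normal range, haha, so I suspect yours is a matter of dataset size! (Though I'm not certain.) haha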
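
A quick sanity check of the mask annotations can be sketched as follows, assuming the data is in COCO-style JSON with `images`, `categories`, and `annotations` entries. `check_coco_annotations` is a hypothetical helper, the example data is made up, and the checks are only a few obvious ones; RLE masks are skipped.

```python
def check_coco_annotations(coco: dict) -> list:
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img["id"]: img for img in coco.get("images", [])}
    category_ids = {cat["id"] for cat in coco.get("categories", [])}
    for ann in coco.get("annotations", []):
        tag = f"annotation {ann.get('id')}"
        img = images.get(ann.get("image_id"))
        if img is None:
            problems.append(f"{tag}: unknown image_id")
            continue
        if ann.get("category_id") not in category_ids:
            problems.append(f"{tag}: unknown category_id")
        seg = ann.get("segmentation")
        if not seg:
            problems.append(f"{tag}: missing or empty segmentation")
            continue
        if isinstance(seg, dict):
            continue  # RLE mask; not checked in this sketch
        for poly in seg:
            # A polygon is a flat list [x1, y1, x2, y2, ...] with >= 3 points.
            if len(poly) < 6 or len(poly) % 2:
                problems.append(f"{tag}: polygon needs at least 3 (x, y) pairs")
                continue
            xs, ys = poly[0::2], poly[1::2]
            if (min(xs) < 0 or min(ys) < 0
                    or max(xs) > img["width"] or max(ys) > img["height"]):
                problems.append(f"{tag}: polygon outside image bounds")
    return problems


example = {
    "images": [{"id": 1, "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "chair"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1,
         "segmentation": [[10, 10, 100, 10, 100, 100]]},
        {"id": 11, "image_id": 1, "category_id": 1, "segmentation": []},
        {"id": 12, "image_id": 1, "category_id": 2,
         "segmentation": [[10, 10, 700, 10, 100, 100]]},
    ],
}

problems = check_coco_annotations(example)
for line in problems:
    print(line)
```

For a real dataset, the dict would come from loading the annotation file with `json.load`. An empty result does not prove the masks are right, but annotations with no usable segmentation are a plausible thing to rule out when the mask mAP stays at zero.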

kiyoungKim6649 commented 3 years ago

Oh.. thank you! If you have questions along the way, I'll help wherever I can!! My email is rldud6649@naver.com, so let me know there if you run into problems, haha.

udkii commented 2 years ago

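Hello! May I ask a question about transfer learning?

I would like to train yolact without freezing all of the layers. Could you tell me how you solved the case from your first question, freezing everything except the last layer? I'd like to know whether there is a way to freely freeze only some of the layers, or to train entirely from scratch ㅠㅠㅠ

I'm sorry for commenting on an old question; I'm in a hurry. I would be grateful for any help!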

kiyoungKim6649 commented 2 years ago

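Hello.

Here is the email of Hyuna, the user above: hyunas1996@gmail.com

It has been quite a while since I contacted her to ask about yolact, so I don't know whether she is still working in the field.

I have been away from deep learning for some time, so I'm afraid I can't help.

I hope this was of some use~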

udkii commented 2 years ago

Oh, got it!! Thank you!!

dbolya / yolact

transfer learning #493