""" define loss scaler for automatic mixed precision """
# Creates a GradScaler once at the beginning of training.
scaler = torch.cuda.amp.GradScaler() ---# here
for batch_idx, (inputs, labels) in enumerate(data_loader):
optimizer.zero_grad()
with torch.cuda.amp.autocast():
# Casts operations to mixed precision
outputs = model(inputs)
loss = criterion(outputs, labels)
# Scales the loss, and calls backward() ---# here
# to create scaled gradients
scaler.scale(loss).backward()
# Unscales gradients and calls ---# here
# or skips optimizer.step()
scaler.step(self.optimizer)
# Updates the scale for next iteration ---# here
scaler.update()
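Since the comparison below pits an AMP run against a non-AMP baseline, it may help that both autocast and GradScaler accept an enabled flag. The sketch below is only an illustration (use_amp is an assumed config variable, not part of the original runs): the same loop runs in full precision when use_amp is False, so both runs share one code path.

use_amp = True  # assumed config flag; set False for the non-AMP baseline

scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

for batch_idx, (inputs, labels) in enumerate(data_loader):
    optimizer.zero_grad()

    with torch.cuda.amp.autocast(enabled=use_amp):
        outputs = model(inputs)
        loss = criterion(outputs, labels)

    # With enabled=False these calls become pass-throughs, so the non-AMP
    # baseline uses the exact same training loop.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()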
Very simple (check the lines marked "here").
backtime92-new-base-syn vs backtime92-new-base-syn-amp
Comparing the effect of AMP.
With AMP, GPU memory usage on the 2080 Ti is, as expected, lower, and training time dropped by about an hour.
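A rough way to make this kind of memory/time comparison reproducible is to log peak GPU memory and wall-clock time per run. The sketch below is only an illustration; run_one_epoch is a hypothetical placeholder for the actual training loop, not part of the original experiments.

import time
import torch

def measure_run(run_one_epoch, device="cuda"):
    # `run_one_epoch` is a hypothetical callable standing in for the real
    # training loop (with or without AMP); it is not part of the original code.
    torch.cuda.reset_peak_memory_stats(device)  # clear previous peak stats
    start = time.time()

    run_one_epoch()

    elapsed = time.time() - start
    peak_mib = torch.cuda.max_memory_allocated(device) / 1024 ** 2
    print(f"epoch time: {elapsed:.1f} s, peak GPU memory: {peak_mib:.0f} MiB")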
Goal: apply mixed precision
Example code with AMP applied
Reference: GPU memory