Hyperparameter tuning (NNI, Ray Tune)

Hyperparameter tuning Strategy

A summary of methods to try when tuning hyperparameters
For the same model, adjusting only the training setup can improve both training speed and model performance

ref: Bag of Tricks for Image Classification with Convolutional Neural Networks

1 Learning rate

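Linear scaling: when the batch size is increased, increase the learning rate linearly by the same factor
Warmup: start from learning rate = 0 and increase it linearly up to the initial learning rate
Once training has stabilized (after warmup), apply cosine annealing as the learning rate scheduler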
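The three learning rate rules above can be combined into a single schedule. The following is a minimal sketch in plain Python, not the reference paper's code: the function name `learning_rate` and its parameters are hypothetical, and the defaults (0.1 for a batch size of 256) are assumptions chosen for illustration.

```python
import math


def learning_rate(step, total_steps, warmup_steps, batch_size,
                  base_lr=0.1, base_batch_size=256):
    """Learning rate at a 0-indexed step: linear scaling + warmup + cosine decay."""
    # Linear scaling: the peak learning rate grows in proportion to the batch size.
    peak_lr = base_lr * batch_size / base_batch_size
    if step < warmup_steps:
        # Warmup: increase linearly from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Cosine annealing: decay from the peak learning rate to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# Example: batch size 512 doubles the assumed base learning rate of 0.1.
schedule = [learning_rate(s, total_steps=100, warmup_steps=10, batch_size=512)
            for s in range(101)]
```

In this sketch the schedule is evaluated per step; evaluating it per epoch works the same way if `step` and `total_steps` are counted in epochs.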

2 Normalization

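When a residual connection is present, initialize the batch norm 𝛾 to 0 (in the reference paper, for the last batch norm layer of each residual block), so that each block starts out as an identity mapping
Apply L2 regularization (weight decay) to the weights only (not to the biases)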
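Both normalization tricks can be sketched in PyTorch as follows. This is an illustration under assumptions, not code from the reference paper: `TinyResidualBlock` and `param_groups` are hypothetical names, the weight decay value is arbitrary, and treating every 1-D parameter as "bias or batch norm 𝛾/β" is a simplification that holds for this small block.

```python
import torch
from torch import nn


class TinyResidualBlock(nn.Module):
    """Hypothetical two-convolution residual block, for illustration only."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        # Zero gamma on the last batch norm: the residual branch outputs 0 at the start.
        nn.init.zeros_(self.bn2.weight)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)


def param_groups(model, weight_decay=1e-4):
    """Split parameters so that weight decay is applied to weights only."""
    decay, no_decay = [], []
    for param in model.parameters():
        # 1-D parameters here are biases and batch norm gamma/beta.
        (no_decay if param.ndim <= 1 else decay).append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]


block = TinyResidualBlock(8)
optimizer = torch.optim.SGD(param_groups(block), lr=0.1, momentum=0.9)
```

With 𝛾 = 0 the block initially passes a non-negative input through unchanged, which is the "identity mapping" behavior the trick is meant to give.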

3 Model Tweaks

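Input stem : use smaller kernels and more layers (in the reference paper, three 3×3 convolutions replace the single 7×7 convolution)
Downsampling block (residual path) : do the stride-2 downsampling in the 3×3 convolution instead of the 1×1 convolution, i.e. with a larger kernel
Downsampling block (shortcut connection) : add AvgPool before the 1×1 convolution and let the pooling do the downsampling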
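The input stem and the shortcut tweak can be sketched in PyTorch as below. This is a simplified illustration rather than a full ResNet: the helper names `conv_bn_relu` and `downsample_shortcut` are hypothetical, and the channel widths (32, 32, 64) follow my reading of the reference paper's description of the deeper stem.

```python
import torch
from torch import nn


def conv_bn_relu(in_ch, out_ch, stride=1):
    """3x3 convolution followed by batch norm and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


# Input stem: three 3x3 convolutions in place of one 7x7 convolution with stride 2.
deep_stem = nn.Sequential(
    conv_bn_relu(3, 32, stride=2),
    conv_bn_relu(32, 32),
    conv_bn_relu(32, 64),
)


def downsample_shortcut(in_ch, out_ch):
    """Shortcut path: average pooling halves the resolution, then a stride-1 1x1 conv."""
    return nn.Sequential(
        nn.AvgPool2d(kernel_size=2, stride=2),
        nn.Conv2d(in_ch, out_ch, 1, stride=1, bias=False),
        nn.BatchNorm2d(out_ch),
    )


features = deep_stem(torch.randn(1, 3, 64, 64))
shortcut = downsample_shortcut(64, 128)(features)
```

The point of the pooling layer is that a stride-2 1×1 convolution looks at only one of every four input positions, while average pooling uses all of them before the channel projection.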

4 Label Smoothing

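Instead of a one-hot vector as the label, replace the 0 entries with a small value derived from ε and lower the 1 accordingly (in the reference paper: 1 - ε for the true class, ε / (K - 1) for each of the other K - 1 classes)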
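As a small worked example, the smoothed target can be built in plain Python. The function name `smooth_labels` and the default ε = 0.1 are assumptions for illustration; the formula gives 1 - ε to the true class and splits ε evenly over the other classes.

```python
def smooth_labels(target, num_classes, epsilon=0.1):
    """Return a smoothed label distribution for the class index `target`."""
    # Every wrong class shares epsilon equally.
    off_value = epsilon / (num_classes - 1)
    return [1.0 - epsilon if k == target else off_value
            for k in range(num_classes)]


# One-hot [0, 0, 1, 0, 0] becomes [0.025, 0.025, 0.9, 0.025, 0.025].
smoothed = smooth_labels(target=2, num_classes=5, epsilon=0.1)
```

The entries still sum to 1, so the result can be used directly as the target distribution of a cross-entropy loss.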

Ray Content

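Ray started at UC Berkeley's RISELab and is now developed by Anyscale. It uses Apache Arrow to process data efficiently. It provides process-based distributed and parallel processing, and runs in a variety of environments: a local machine, Kubernetes in the cloud (AWS, GCP, Azure), on-premises Kubernetes, and so on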

Driver : the program's main root (entry point), i.e. the process that calls ray.init()
Job : the collection of Tasks, Actors, and Objects that originate from the same driver

Task

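A function that runs in a different process from its caller
Defined with the @ray.remote decorator; also called a remote function
Calling it through its remote() method schedules the Task asynchronously and immediately returns an ObjectRef
ray.get(ObjectRef) blocks until the Task has finished and returns its result
A function wrapped with @ray.remote is stateless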

cf) stateless vs stateful

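stateless : stores no information about, or reference to, past transactions, so every transaction starts from scratch
stateful : runs in the context of previous transactions, so the current transaction is affected by what happened in earlier ones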

Actor

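A class whose instance runs in a different process from its caller
Defined with the @ray.remote decorator
Calling the class through its remote() method creates the Actor and returns an actor handle
Calling a method on the handle with remote() returns an ObjectRef; ray.get(ObjectRef) returns the result
An instance of a class wrapped with @ray.remote is stateful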

Object

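A value returned by a Task or created with ray.put()
The data is stored in shared memory, so processes on the same node can access it without each making its own copy
If a large piece of data is used repeatedly, storing it once with ray.put() and passing the ObjectRef can reduce memory usage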

Ray Manual

# Installation
pip install ray

# 1. Initialize Ray
import ray
ray.init()

# 2. Add the decorator to the function or class to run in parallel
import torch

@ray.remote
def create_matrix(size):
    return torch.randn(size, size)

@ray.remote
def multiply_matrices(x, y):
    # torch.dot only accepts 1-D tensors, so use matmul for 2-D matrices
    return torch.matmul(x, y)

# 3. Call the remote function (or class) through its remote() method
x_id = create_matrix.remote(10)
y_id = create_matrix.remote(10)
z_id = multiply_matrices.remote(x_id, y_id)

# 4. Wait for the task to finish and fetch its result
z = ray.get(z_id)

# 5. Shut down Ray
ray.shutdown()


boostcampaitech2 / image-classification-level1-08
