[Bug] Q-values are not updated in Monte Carlo Control - Githubissues

oeccsy / SecretaryProblem

Verifying the optimal policy using Unity ML-Agent


[Bug] Q-values are not updated in Monte Carlo Control #4

Closed oeccsy closed 3 months ago

oeccsy commented 3 months ago

Steps to reproduce

Add print() calls to the Python script secretary_problem_case_monte_carlo_control.py, as in the code below:

def update_agent(history):
    cum_reward = 0
    # Walk the episode backwards, from the last transition to the first
    for transition in history[::-1]:
        order, ranking, a, r = transition

        # Monte Carlo update
        Q[order-1, ranking-1, a] = Q[order-1, ranking-1, a] + alpha * (cum_reward - Q[order-1, ranking-1, a])
        cum_reward = cum_reward + r
        print(f'update : taking action {a} at [{order-1},{ranking-1}] gave reward {r}')
        print(f'Q[{order-1},{ranking-1},{a}] updated to {Q[order-1, ranking-1, a]}')

Then run the script and check the output.

Actual behavior

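For the terminal state, the Q-value is not updated even when a reward is received. The update runs before r is added to cum_reward, so the last transition of each episode is always updated toward a target of 0.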
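The effect can be reproduced in isolation. The sketch below is a minimal stand-in, not the repository's script: the `defaultdict` used as the Q table, `alpha = 0.1`, the zero initialization, the function name `update_agent_buggy`, and the two-step `history` are all assumptions made for illustration. It applies the update in the same order as the code above.

```python
from collections import defaultdict

# Assumed setup: a dict keyed by (order, ranking, action) stands in for the
# script's Q table, with every entry starting at 0.0.
alpha = 0.1
Q = defaultdict(float)

def update_agent_buggy(history):
    """Same ordering as the reported code: update Q first, then add r."""
    cum_reward = 0
    for order, ranking, a, r in history[::-1]:
        key = (order - 1, ranking - 1, a)
        Q[key] = Q[key] + alpha * (cum_reward - Q[key])
        cum_reward = cum_reward + r

# Hypothetical episode of (order, ranking, action, reward);
# only the final (terminal) transition yields a reward.
history = [(1, 1, 0, 0), (2, 1, 1, 1)]
update_agent_buggy(history)

terminal_q = Q[(1, 0, 1)]  # target was 0, so the entry never moves
earlier_q = Q[(0, 0, 0)]   # the terminal reward is credited one step early
print(terminal_q, earlier_q)
```

Under these assumptions the terminal entry stays at 0 while the preceding step absorbs the reward, which matches the behavior described in this report.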

Expected behavior

The Q-value is updated according to the reward.

References

oeccsy commented 3 months ago

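Resolved by modifying the code as follows:

# Monte Carlo update
# Add this step's reward first, so the return used as the target includes it
cum_reward = cum_reward + r
Q[order-1, ranking-1, a] = Q[order-1, ranking-1, a] + alpha * (cum_reward - Q[order-1, ranking-1, a])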
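A self-contained sketch of the loop with this ordering is given below for a quick check. It is not the repository's script: the `defaultdict` Q table, `alpha = 0.1`, the name `update_agent_fixed`, and the sample `history` are hypothetical choices, and the return is undiscounted, as in the snippet above.

```python
from collections import defaultdict

# Assumed setup: dict-backed Q table keyed by (order, ranking, action),
# all entries starting at 0.0.
alpha = 0.1
Q = defaultdict(float)

def update_agent_fixed(history):
    """Accumulate the reward first, then move Q toward the return."""
    cum_reward = 0
    for order, ranking, a, r in history[::-1]:
        cum_reward = cum_reward + r  # return from this step onward
        key = (order - 1, ranking - 1, a)
        Q[key] = Q[key] + alpha * (cum_reward - Q[key])

# Hypothetical episode of (order, ranking, action, reward);
# only the terminal transition yields a reward.
history = [(1, 1, 0, 0), (2, 1, 1, 1)]
update_agent_fixed(history)

terminal_q = Q[(1, 0, 1)]  # now moves toward the terminal reward
earlier_q = Q[(0, 0, 0)]   # still receives the full return of the episode
print(terminal_q, earlier_q)
```

With the reward added before the update, the terminal transition and the earlier transition both move toward the episode return of 1.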