boost-devs / peer-session

🚀 Boostcamp AI Tech 1st cohort, U-Stage Team 4: peer session materials and question collection (archived)

[원딜] Gaussian Process #104

Closed changwoomon closed 3 years ago

changwoomon commented 3 years ago

🙌 Questioner


❓ Question

Hello, TA.

I am asking because I do not understand how the Gaussian Process works in the Hyperparameter search part of the lecture.

The lecture slide is as follows.


From the lecture slide and the figures below, we understood (inferred) the following.

▼ Figure 1                                                                                     ▼ Figure 2

▼ Figure 3

In [Figure 1]:

- x-axis: the range of hyperparameter values

- Dotted line: the objective function we ultimately want to fit (= the solid line in [Figure 3], True function)

- Solid line: the posterior mean function at each point (x) (= the dotted line in [Figure 3], Prediction function)

- Purple region: the variance at each point (x)

- The solid line, the purple region, and the acquisition function are posterior functions conditioned on the observations (x_t), so they change with every new observation

That is how we read the figure. We understood the Gaussian Process procedure to be:

1. Repeat the process below to fit the solid line to the dotted line.

2. Choose the point (x_t) where the acquisition function at time t is largest as the new observation.

3. Train the model with the new observation (x_t) as the hyperparameter, then compute the posterior mean, the variance, and the acquisition function.

4. Repeat steps 2-3.
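The loop in steps 2-4 can be sketched roughly as follows. This is a minimal illustration under assumptions that are not in the lecture material: a one-dimensional hyperparameter on [0, 1], a zero-mean GP with an RBF kernel, an upper-confidence-bound acquisition function, and a hypothetical `objective` that stands in for "train the model with this hyperparameter and return its validation score". All function names here are made up for the example.

```python
import math

def rbf(a, b, length=0.15):
    """Squared-exponential (RBF) kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)       # K^{-1} y
    v = solve(K, k_star)       # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)  # clamp tiny negative round-off

def objective(x):
    """Hypothetical stand-in for training a model with hyperparameter x
    and returning its validation score (the unknown 'true function')."""
    return -(x - 0.3) ** 2

def bayes_opt(n_iter=8, kappa=2.0):
    grid = [i / 100 for i in range(101)]   # candidate hyperparameter values
    xs = [0.1, 0.9]                        # two initial observations
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        def acquisition(x):
            # Assumed acquisition: upper confidence bound (mean + kappa * std).
            mean, var = gp_posterior(xs, ys, x)
            return mean + kappa * math.sqrt(var)
        x_next = max(grid, key=acquisition)  # step 2: argmax of acquisition
        xs.append(x_next)                    # step 3: evaluate at x_next;
        ys.append(objective(x_next))         # the posterior updates next pass
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], xs

if __name__ == "__main__":
    best_x, best_y, visited = bayes_opt()
    print("best hyperparameter:", best_x, "score:", best_y)
```

Note that the true function is only ever evaluated at the chosen points; the GP posterior fills in everything between them.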

Q1. Is our understanding correct?

Q2. Hyperparameters have no true value, so what is the True function?

Q3. The model is trained with the hyperparameter at the point where the acquisition function is largest, not at the point where the variance is largest. In that case, why compute the variance at all?


📄 References

changwoomon commented 3 years ago

Answer from TA 류원탁

Hello. I am 류원탁, a TA on the model lightweighting team.

I would like to answer some of your questions.

"Repeat the process below to fit the solid line to the dotted line." -> I think you can regard it as the work of fitting the dotted line to the solid line.

You can think of the True function as the correct answer.

I recommend this blog post on Gaussian processes.

http://krasserm.github.io/2018/03/19/gaussian-processes/

changwoomon commented 3 years ago

Slack - Q&A channel

홍원의_마스터, 3 days ago: Why would the point with large variance differ from the point the acquisition function indicates?

윤대혁_T1133, 3 days ago: The acquisition function seems to estimate the most probable likelihood from the data, while the variance is estimated to account for the uncertain parts of the true function. Aren't they entirely different concepts?

홍원의_마스터, 3 days ago: Would they still be entirely different concepts if we defined acquisition function := get the point with the highest variance...?

윤대혁_T1133, 3 days ago: I looked through the reference paper, and it is quite different from what I first thought... The acquisition function can pick a point with large variance (exploration), but it can also pick a point with large mean (exploitation). There is a trade-off between the two, and taking only one side causes problems, which seems to be why various acquisition functions have been proposed.
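To make the exploration/exploitation trade-off concrete, here is a small sketch using one common acquisition function, the upper confidence bound (mean + kappa * std). The posterior values and the `kappa` settings are invented for illustration; they do not come from the lecture or the reference paper.

```python
def ucb(mean, std, kappa):
    """Upper confidence bound: posterior mean plus kappa times posterior std."""
    return [m + kappa * s for m, s in zip(mean, std)]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

# Hypothetical posterior over five candidate hyperparameter values.
mean = [0.2, 0.8, 0.5, 0.1, 0.3]
std = [0.05, 0.02, 0.40, 0.60, 0.10]

print(argmax(ucb(mean, std, kappa=0.0)))  # 1: highest mean (pure exploitation)
print(argmax(ucb(mean, std, kappa=1.0)))  # 2: decent mean and large uncertainty
print(argmax(ucb(mean, std, kappa=5.0)))  # 3: highest variance (mostly exploration)
```

With `kappa=0` the variance is ignored entirely, and with a very large `kappa` the rule approaches "pick the point with the highest variance". The variance is therefore one input to the acquisition function and not a competing criterion.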