[Week 7] Is the dataset really free of bias?

kimcando / BoostcampAITech3-PaperReading-Embedding

Boostcamp AI Tech 3rd / Basic Paper reading w.r.t Embedding


The paper describes the collected data as follows: "The dataset has books in 16 different genres, e.g., Romance (2,865 books), Fantasy (1,479), Science fiction (786), Teen (430), etc." It then states: "Furthermore, with a large enough collection the training set is not biased towards any particular domain or application." I am not sure this claim holds. The final sentence-generation example in the paper also reads close to romance, so I wonder whether the dataset is really free of bias.
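To put rough numbers on the concern, here is a minimal sketch that compares the four genre counts quoted above. The paper quote lists only 4 of the 16 genres, so the shares below are relative to that quoted subset, not to the whole corpus; the names `quoted_counts` and `shares` are hypothetical helpers introduced just for this illustration.

```python
# Genre counts exactly as quoted from the paper.
# Assumption: the other 12 genres are not listed in the quote,
# so every share here is relative to these four genres only.
quoted_counts = {
    "Romance": 2865,
    "Fantasy": 1479,
    "Science fiction": 786,
    "Teen": 430,
}


def shares(counts):
    """Return each genre's fraction of the total count in `counts`."""
    total = sum(counts.values())
    return {genre: n / total for genre, n in counts.items()}


quoted_total = sum(quoted_counts.values())
quoted_shares = shares(quoted_counts)

for genre, share in quoted_shares.items():
    print(f"{genre}: {share:.1%} of the quoted subset")

# How many Romance books there are per Teen book in the quoted numbers.
romance_to_teen = quoted_counts["Romance"] / quoted_counts["Teen"]
print(f"Romance / Teen ratio: {romance_to_teen:.1f}")
```

Within the quoted subset alone, Romance accounts for a bit over half of the books, so even these four numbers suggest that the genre distribution is far from uniform.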

[Week 7] Is the dataset really free of bias? #24