calpis10000 / commonlit

https://www.kaggle.com/c/commonlitreadabilityprize

Model: reading through a nicely readable nb #37

Closed calpis10000 closed 3 years ago

calpis10000 commented 3 years ago

A nicely simple notebook was published, so I'll read through it and try to pick something up.

https://www.kaggle.com/andretugan/lightweight-roberta-solution-in-pytorch

calpis10000 commented 3 years ago

torch.backends.cudnn.deterministic = True: a flag to keep cuDNN's auto-selected optimizations from changing the results between runs. https://qiita.com/chat-flip/items/c2e983b7f30ef10b91f6
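That flag usually appears inside a broader seed-fixing helper. A minimal sketch of the common pattern (not necessarily the notebook's exact code; the function name `seed_everything` is illustrative):

```python
import os
import random

import numpy as np
import torch


def seed_everything(seed: int = 42) -> None:
    """Fix every RNG the training loop touches so runs are reproducible."""
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # cuDNN: trade auto-tuned speed for deterministic kernel choices
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False  # benchmarking is itself nondeterministic
```

With the same seed, random draws repeat exactly, which is what makes CV scores comparable across runs.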

calpis10000 commented 3 years ago

https://qiita.com/niship2/items/f84751aed893da869cec

calpis10000 commented 3 years ago

layer normalization, group normalization https://blog.albert2005.co.jp/2018/09/05/group_normalization/
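The difference between the two is mostly *which axes* the statistics are computed over. A small sketch using PyTorch's built-in modules (shapes are illustrative):

```python
import torch
from torch import nn

x = torch.randn(8, 32, 16, 16)  # (batch, channels, H, W)

# LayerNorm: normalizes each sample over the trailing dims given here (C, H, W)
ln = nn.LayerNorm([32, 16, 16])

# GroupNorm: splits the 32 channels into 4 groups of 8 and normalizes per group;
# its statistics do not depend on batch size, unlike BatchNorm
gn = nn.GroupNorm(num_groups=4, num_channels=32)

print(ln(x).shape, gn(x).shape)  # both preserve the input shape
```

Because neither uses batch statistics, both behave the same at batch size 1, which is why they show up in NLP models and small-batch training.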

calpis10000 commented 3 years ago

Structure of the attention layer: "A clear explanation of the attention mechanism" https://hilinker.hatenablog.com/entry/2018/12/08/002003
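The attention head in this kind of notebook is essentially a learned weighted average over token embeddings. A minimal sketch (layer sizes like 768 and 512 are illustrative, not necessarily the notebook's values):

```python
import torch
from torch import nn


class AttentionPool(nn.Module):
    """Pool token embeddings into one vector using learned attention weights."""

    def __init__(self, hidden_dim: int = 768):
        super().__init__()
        # Score each token, then softmax the scores into weights summing to 1
        self.attention = nn.Sequential(
            nn.Linear(hidden_dim, 512),
            nn.Tanh(),
            nn.Linear(512, 1),
            nn.Softmax(dim=1),
        )

    def forward(self, last_hidden_state: torch.Tensor) -> torch.Tensor:
        # last_hidden_state: (batch, seq_len, hidden_dim)
        weights = self.attention(last_hidden_state)           # (batch, seq_len, 1)
        return torch.sum(weights * last_hidden_state, dim=1)  # (batch, hidden_dim)


pooled = AttentionPool()(torch.randn(2, 10, 768))
print(pooled.shape)  # torch.Size([2, 768])
```

Compared with just taking the [CLS] vector or a plain mean, the model gets to learn which tokens matter for the regression target.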

calpis10000 commented 3 years ago

For anyone wondering what model.eval() actually does, or who can't tell torch.no_grad() and torch.set_grad_enabled() apart:

https://qiita.com/tatsuya11bbs/items/86141fe3ca35bdae7338
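The short version, as a runnable sketch: eval() changes layer *behavior*, while the grad contexts change autograd *bookkeeping*, and they are independent of each other.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

# model.eval(): switches layer behavior (Dropout passes through, BatchNorm uses
# running stats). It does NOT stop gradient tracking by itself.
model.eval()

x = torch.randn(2, 4)

# torch.no_grad(): disables autograd recording, saving memory and time at inference
with torch.no_grad():
    out = model(x)
print(out.requires_grad)  # prints False

# torch.set_grad_enabled(flag) is the parameterized version: one code path can
# serve both training (flag=True) and validation (flag=False)
with torch.set_grad_enabled(False):
    out = model(x)
```

So a typical validation loop uses both: model.eval() plus one of the grad-disabling contexts.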

calpis10000 commented 3 years ago

https://zenn.dev/hirayuki/articles/bbc0eec8cd816c183408

calpis10000 commented 3 years ago

get_cosine_schedule_with_warmup

https://huggingface.co/transformers/main_classes/optimizer_schedules.html

I can tell it's a kind of scheduler.

But what does the schedule actually look like?

Create a schedule with a learning rate that decreases following the values of the cosine function between the initial lr set in the optimizer to 0, after a warmup period during which it increases linearly between 0 and the initial lr set in the optimizer.

Hmm, that's still not clear to me.
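In plain terms: the LR ramps up linearly from 0 to the initial LR over the warmup steps, then decays back to 0 along a half cosine. The multiplier can be reproduced without transformers installed; a pure-math sketch matching the documented default (num_cycles=0.5):

```python
import math


def cosine_with_warmup(step: int, num_warmup_steps: int, num_training_steps: int) -> float:
    """LR multiplier (0..1) applied to the optimizer's initial LR at each step."""
    if step < num_warmup_steps:
        # Warmup: linear ramp from 0 up to 1
        return step / max(1, num_warmup_steps)
    # Afterwards: half-cosine decay from 1 down to 0
    progress = (step - num_warmup_steps) / max(1, num_training_steps - num_warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# With 10 warmup steps out of 100 total:
for step in (0, 5, 10, 55, 100):
    print(step, round(cosine_with_warmup(step, 10, 100), 3))
```

At step 0 the multiplier is 0, at the end of warmup it hits 1.0, halfway through the decay it is 0.5, and at the final step it reaches 0.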

calpis10000 commented 3 years ago

Results