-
Hello, I am trying to train DINO with a ViT-Base backbone from scratch and I have a few questions.
First of all, I think that in the original paper the student temperature is 0.1 during the 30-epoch warmup, but I am…
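For reference, a minimal sketch of the temperature schedule the question is asking about, assuming the defaults published with the DINO reference code: the teacher temperature is linearly warmed up from 0.04 to 0.07 over the first 30 epochs, while the student temperature stays fixed (0.1 in the paper). The function name and epoch counts here are illustrative.

```python
import numpy as np

def teacher_temp_schedule(warmup_temp=0.04, final_temp=0.07,
                          warmup_epochs=30, total_epochs=100):
    """Per-epoch teacher temperature: linear warmup, then constant.

    Defaults follow the public DINO reference implementation; the
    student temperature is a separate fixed constant (0.1).
    """
    warmup = np.linspace(warmup_temp, final_temp, warmup_epochs)
    rest = np.full(total_epochs - warmup_epochs, final_temp)
    return np.concatenate([warmup, rest])

sched = teacher_temp_schedule()
```

Note that it is the *teacher* temperature that is warmed up in the reference code; the student temperature is not scheduled.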
-
python qlora.py \
--model_name_or_path /models/guanaco-33b-merged \
--output_dir ./output \
--dataset alpaca \
--do_train True \
--do_eval True \
--do_mmlu_eval True \
…
-
[Predictive-Maintenance-using-LSTM-master.zip](https://github.com/Saima-786/INDUSTRIAL-PROJECT-upGrad-/files/15199759/Predictive-Maintenance-using-LSTM-master.zip)
-
### Deep Learning Simplified Repository (Proposing new issue)
:red_circle: **Project Title** : Classification of Elon Musk Tweets using NLP
:red_circle: **Aim** : Create a classification model using…
-
Hey, thanks for your great work!
There are a few clarifications I need, as I am having some difficulty replicating the results; it would be very kind if you could help:
1. In the implementation …
-
According to this blog post: http://www.fast.ai/2018/07/02/adam-weight-decay/ and the article it references (https://arxiv.org/abs/1711.05101), Adam has problems when used with L2 regularization. If I understand…
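The distinction the linked article draws can be sketched in a few lines: with plain Adam plus L2, the decay term is folded into the gradient and therefore rescaled by the adaptive denominator, whereas AdamW applies the decay directly to the weights. This is a minimal single-step sketch, not the fast.ai or PyTorch implementation; function names and hyperparameters are illustrative.

```python
import numpy as np

def adam_l2_step(w, grad, m, v, t, lr=1e-3, wd=1e-2,
                 b1=0.9, b2=0.999, eps=1e-8):
    # Classic Adam + L2: the decay term is added to the gradient,
    # so it also gets divided by sqrt(v_hat) below.
    grad = grad + wd * w
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def adamw_step(w, grad, m, v, t, lr=1e-3, wd=1e-2,
               b1=0.9, b2=0.999, eps=1e-8):
    # Decoupled weight decay (AdamW, Loshchilov & Hutter 2017):
    # decay acts on the weights directly, outside the rescaling.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * w), m, v
```

Running both on the same weight and gradient produces slightly different updates, which is exactly the effect the paper analyzes: in the L2 form, weights with large gradient history are effectively decayed less.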
-
The param_groups' lr values cannot be trusted if the optimizer state is not restored (and this can be acceptable, because optimizer buffers can double the checkpoint size).
In this line they are trusted if last…
-
Hi @lessw2020, thanks for the very nice work!
I noticed that in Ranger21 the optimizer is tightly coupled with the lr scheduler; could you guide me on how to decouple them?
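For context, the decoupled pattern the question is after looks like this in plain PyTorch: the optimizer never touches its own lr, and an external scheduler drives it. This is a generic sketch, not Ranger21's API (Ranger21 builds its warmup/warmdown into the optimizer itself, which is what makes it hard to separate); the optimizer, model, and warmup length here are illustrative.

```python
import torch

model = torch.nn.Linear(8, 8)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

# External linear warmup over the first 10 steps, then constant lr.
warmup_steps = 10
sched = torch.optim.lr_scheduler.LambdaLR(
    opt, lambda step: min(1.0, (step + 1) / warmup_steps))

for step in range(3):
    opt.zero_grad()
    model(torch.randn(2, 8)).sum().backward()
    opt.step()    # optimizer only applies updates
    sched.step()  # scheduler alone owns the lr
```

With this split, swapping either piece (a different optimizer, or a cosine/one-cycle schedule) requires no changes to the other.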
-
Interesting Resources:
- [RL Curriculum Learning](https://lilianweng.github.io/lil-log/2020/01/29/curriculum-for-reinforcement-learning.html)
- [meta-RL](https://lilianweng.github.io/lil-log/2019/…
-
[Link](https://dacon.io/competitions/official/235554/codeshare/651?page=1&dtype=recent&ptype=pub)