-
python predict_downstream_condition.py --ckpt_path model_name_roberta-base_taskname_qqp_lr_3e-05_seed_42_numsteps_2000_sample_Categorical_schedule_mutual_hybridlambda_0.0003_wordfreqlambda_0.0_fromscr…
-
Following the README, I downloaded the chinese-roberta-wwm-ext-large pretrained model files and placed them in the ./chinese-roberta-wwm-ext-large directory, then ran python3 main.py --no_pair --seed 1 --use_apex_amp --apex_amp_opt_level O1 --batch_size 32 --max_…
-
### Feature request
I have been using [CANINE](https://arxiv.org/pdf/2103.06874) for my experiments and I see that there does not exist a fast version of the tokenizer for the model. CANINE accepts u…
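For context on why a fast tokenizer is less pressing here: CANINE is tokenization-free in the usual sense, since each character maps directly to its Unicode code point, with special tokens drawn from the private-use area. A minimal sketch of that scheme (the 0xE000/0xE001 CLS/SEP code points match, to the best of my knowledge, the slow `CanineTokenizer` in transformers; the `encode` helper below is illustrative, not part of the library):

```python
# CANINE-style "tokenization": no vocabulary lookup, just Unicode
# code points, plus private-use-area sentinels for CLS and SEP.
CLS, SEP = 0xE000, 0xE001  # assumed sentinel code points

def encode(text: str) -> list[int]:
    """Map each character to its code point, wrapped in CLS/SEP."""
    return [CLS] + [ord(c) for c in text] + [SEP]

print(encode("héllo"))  # works for any Unicode text, no OOV tokens
```

Because the mapping is just `ord()`, there is no trained merge table to port to a Rust backend, which is presumably why no fast tokenizer ships with the model.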
-
### Motivation
As vllm supports more and more models and features, they require different attention, scheduler, executor, and input/output processors. These modules are becoming increasingly com…
-
### Describe the issue
First of all, thank you for your great contributions.
I have a similar question to the [issue 146](https://github.com/microsoft/LLMLingua/issues/146), I cannot reproduce the…
-
- [x] Implement the Roberta + CNN + Attention model
- [x] Find the correct start_idx for random_concat and align it
- [ ] Build a dataset by cutting wiki into appropriately sized chunks
# Results
I tried the model from round 1 almost unchanged, but the score isn't improving.
Whether performance would improve if the dataset itself were filtered somewhat is something we'd have to tes…
-
### Describe the bug
I installed text generation webui and downloaded the model (TheBloke_Yarn-Mistral-7B-128k-AWQ), but I can't run it. I chose Transformer as the model loader. I tried installing autoawq b…
-
When loading the model I get a warning that some model weights are not initialized; is this expected?
from alignscore import AlignScore
scorer = AlignScore(model='roberta-large', batch_size=1, ck…
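For context, this warning usually just means the checkpoint did not contain weights for every module in the instantiated model, so the uncovered parts keep their fresh random initialization. A minimal, self-contained sketch with toy module names (not AlignScore's actual architecture) showing how `load_state_dict(strict=False)` surfaces exactly those missing weights:

```python
import torch
from torch import nn

class Model(nn.Module):
    """Toy model: a pretrained-style encoder plus a task head."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 4)  # stands in for the pretrained backbone
        self.head = nn.Linear(4, 3)     # task head, absent from the checkpoint

model = Model()
# Checkpoint covers only the encoder, as a base pretrained checkpoint would.
ckpt = {"encoder.weight": torch.zeros(4, 4), "encoder.bias": torch.zeros(4)}
missing, unexpected = model.load_state_dict(ckpt, strict=False)
# `missing` lists head.weight / head.bias: these are the weights a loader
# reports as "not initialized" and leaves randomly initialized.
print(missing)
```

If AlignScore's released checkpoint includes its scoring head, the warning may instead indicate the wrong base model name or an incomplete download, which is worth ruling out.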
-
`Traceback (most recent call last):
  File "schema_item_classifier.py", line 463, in <module>
    _train(opt)
  File "schema_item_classifier.py", line 271, in _train
    model_outputs = model(
  File "/h…
-
Hi @NielsRogge,
I plan to finetune a LayoutXLM-large-like model. Why a "like" model? Because, so far, Microsoft has released only a base version of LayoutXLM, not a large one.
As I want to train a vers…
piegu updated 7 months ago