-
python predict_downstream_condition.py --ckpt_path model_name_roberta-base_taskname_qqp_lr_3e-05_seed_42_numsteps_2000_sample_Categorical_schedule_mutual_hybridlambda_0.0003_wordfreqlambda_0.0_fromscr…
-
Following the README, I downloaded the chinese-roberta-wwm-ext-large pretrained model files and placed them in the ./chinese-roberta-wwm-ext-large directory, then ran python3 main.py --no_pair --seed 1 --use_apex_amp --apex_amp_opt_level O1 --batch_size 32 --max_…
-
### Feature request
I have been using [CANINE](https://arxiv.org/pdf/2103.06874) for my experiments and I see that there does not exist a fast version of the tokenizer for the model. CANINE accepts u…
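For context on why a fast tokenizer is less pressing here: CANINE is tokenization-free in the usual sense, since each character maps directly to its Unicode code point, with special tokens drawn from the private-use area. A minimal sketch of that scheme (the 0xE000/0xE001 CLS/SEP code points match, to the best of my knowledge, the slow `CanineTokenizer` in transformers; the `encode` helper below is illustrative, not part of the library):

```python
# CANINE-style "tokenization": no vocabulary lookup, just Unicode
# code points, plus private-use-area sentinels for CLS and SEP.
CLS, SEP = 0xE000, 0xE001  # assumed sentinel code points

def encode(text: str) -> list[int]:
    """Map each character to its code point, wrapped in CLS/SEP."""
    return [CLS] + [ord(c) for c in text] + [SEP]

print(encode("héllo"))  # works for any Unicode text, no OOV tokens
```

Because the mapping is just `ord()`, there is no trained merge table to port to a Rust backend, which is presumably why no fast tokenizer ships with the model.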
-
### Motivation
As vllm supports more and more models and features, they require different attention, scheduler, executor, and input/output processors. These modules are becoming increasingly com…
-
### Describe the issue
First of all, thank you for your great contributions.
I have a similar question to the [issue 146](https://github.com/microsoft/LLMLingua/issues/146), I cannot reproduce the…
-
- [x] Implement the Roberta + CNN + Attention model
- [x] Find the correct start_idx for random_concat and align it
- [ ] Build a dataset by cutting wiki into appropriately sized chunks
# Results
I tried the model from round 1 almost unchanged, but the score isn't improving.
Whether performance would improve if the dataset itself were filtered somewhat is something we'd have to tes…
-
### Describe the bug
I installed text generation webui and downloaded the model (TheBloke_Yarn-Mistral-7B-128k-AWQ), but I can't run it. I chose Transformer as the model loader. I tried installing autoawq b…
-
When loading the model I get a warning that some model weights are not initialized; is this expected?
from alignscore import AlignScore
scorer = AlignScore(model='roberta-large', batch_size=1, ck…
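For context, this warning usually just means the checkpoint did not contain weights for every module in the instantiated model, so the uncovered parts keep their fresh random initialization. A minimal, self-contained sketch with toy module names (not AlignScore's actual architecture) showing how `load_state_dict(strict=False)` surfaces exactly those missing weights:

```python
import torch
from torch import nn

class Model(nn.Module):
    """Toy model: a pretrained-style encoder plus a task head."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 4)  # stands in for the pretrained backbone
        self.head = nn.Linear(4, 3)     # task head, absent from the checkpoint

model = Model()
# Checkpoint covers only the encoder, as a base pretrained checkpoint would.
ckpt = {"encoder.weight": torch.zeros(4, 4), "encoder.bias": torch.zeros(4)}
missing, unexpected = model.load_state_dict(ckpt, strict=False)
# `missing` lists head.weight / head.bias: these are the weights a loader
# reports as "not initialized" and leaves randomly initialized.
print(missing)
```

If AlignScore's released checkpoint includes its scoring head, the warning may instead indicate the wrong base model name or an incomplete download, which is worth ruling out.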
-
`Traceback (most recent call last):
  File "schema_item_classifier.py", line 463, in <module>
    _train(opt)
  File "schema_item_classifier.py", line 271, in _train
    model_outputs = model(
  File "/h…
-
Hi @NielsRogge,
I plan to finetune a LayoutXLM-large-like model. Why a "like" model? Because, so far, Microsoft has released only a base version of LayoutXLM, not a large one.
As I want to train a vers…
piegu updated 7 months ago