-
**Describe**
Model I am using (UniLM, MiniLM, LayoutLM ...): e5
Can I use DeepSpeed to evaluate on the BEIR benchmark? Or does it support other strategies?
Can I use fp16? Would it affect the results?
-
Hi,
We are trying DeepSpeed with UniLM using the FUNSD example provided here: https://github.com/microsoft/unilm/tree/master/layoutlmv2
It takes 18-ish minutes to finish training without us…
-
`CUDA_VISIBLE_DEVICES=0 python3 run_seq2seq.py --data_dir ./dataset/ --src_file train.tsv --model_type unilm --model_name_or_path ./torch_unilm_model/ --output_dir ./output_dir/ --max_seq_length 512…
-
Hi,
I am following [this command](https://github.com/microsoft/unilm/tree/master/trocr#fine-tuning-on-iam) for fine-tuning on IAM but got the error:
```shell
Traceback (most recent call last):
…
-
Hi there!
![image](https://github.com/huggingface/nanotron/assets/49240599/38bc4c4d-f0ec-40f1-bd57-2679c7fe03f4)
Microsoft have just released the full handbook for reproducing the 1-bit LLM pape…
-
Do I have to use audio with a 16 kHz sampling rate to use BEATs?
I ask because, when I feed the extracted features into resnet18 for the next classification task, I found t…
-
I had a passing idea: is it possible to use quantization for Mistral-based embedding models?
However, one problem is that currently the checkpoint lacks the ``lm-head`` part, so I'm wo…
-
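Hello, my dataset has about 1.3 million examples, with a src length of about 100 and a tgt length of about 40. How many epochs does unilm need to converge well? After 4 epochs, I find that the predicted titles are sometimes not fluent or contain repeated characters.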
-
Thanks for the well-written package! The official RetNet implementation has had several updates, listed at https://github.com/microsoft/unilm/blob/master/retnet/README.md#changelog .