-
Hi, I'm experiencing an AssertionError during training of MiniLLM using ZeRO with optimizer and parameter offload on a single H100 GPU. It seems as though DeepSpeed's parameter offload script is gett…
-
https://github.com/microsoft/LMOps/blob/80d7d4a0ba8d61ca7be6cae72d06cf71dda3e9e0/minillm/scripts/gpt2/eval/eval_main_self_inst.sh#L18C32-L18C32
> CKPT="${BASE_PATH}/results/gpt2/${CKPT_NAME}/"
It …
-
I'm running the inference script with `bash inference_hf.sh`, but I'm getting a path-related error.
```
[2023-10-17 18:06:41,654][root][INFO] - Total encoded queries tensor torch.Size([277, …
```
-
I'm trying to run the generate_dense_embeddings script with the following command:
```
python DPR/generate_dense_embeddings.py model_file=/root/LMOps/uprise/archive/data.pkl ctx_src=dpr_uprise sh…
```
-
Hello all, when I run `python raw2read.py` I get a NameError involving the name 'overall_cls'. Part of the log is below.
Please help me fix this issue.
PS C:\Users\rajas\Desktop\AI_Research\LMO…
-
When I run `bash scripts/opt/tools/process_data_dolly.sh /PATH/TO/MiniLLM # Process Dolly Train / Validation Data`, I get error messages like 'huggingface_hub.utils._validators.H…
-
During the evaluation phase of llama-7b/gpt2-xlarge with `MP_size=1`, I tried to use 8 GPUs to speed up evaluation. The script is `scripts/gpt2/eval/run_eval.sh`.
I simplified this script to only …
-
Looking forward to the code release of llm_retriever :)
-
I want to run scripts/llama/eval/eval_main_dolly.sh to evaluate sft/llama-13B. I have access to 1 A100 GPU or 4 A10 GPUs. How should I modify the scripts/llama/eval/eval_main_dolly.sh file to get it w…
-
OpenAI requests always time out or raise an exception. Is there a plan to support request timeouts and retries?
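
Until something like this is supported, one possible workaround is to wrap the request in a generic retry helper. The sketch below is not code from this repository: `call_with_retry`, `request_fn`, and all parameter names are hypothetical, and it assumes the request can be expressed as a zero-argument callable that raises on failure. It retries with exponential backoff plus a small random jitter.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    request_fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request_fn, retrying with exponential backoff when it raises.

    All names here are illustrative; request_fn stands in for whatever
    function actually sends the API request.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Double the wait on each attempt, cap it, then add up to 10% jitter
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

This only handles the retry side. A per-request timeout would still have to be set on the client call inside `request_fn`, and narrowing `retry_on` to the specific timeout or rate-limit exceptions raised by the client avoids retrying on errors that will never succeed.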