-
After running the default flow on Mistral in vLLM, there is a large (>100MB) report JSON in the directory where I ran the commands. This seems quite heavyweight, especially for a JSON file.
Instead, I …
mgoin updated 2 months ago
-
Hi, I'm trying to fine-tune the Llama 3.1 8B model. After fine-tuning I upload it to HF, and when I try to run it with vLLM I get this error: "KeyError: 'base_model.model.model.layers.0.mlp.dow…
-
Hello. Can you please tell me which evolutionary search hyperparameters (population_size, mutation_numbers, crossover_size, etc.) you used to achieve the 8x context-length increase for Mistral v0.1 or LLaM…
-
Hello and thank you for the great product.
I run into this problem when I try to use it with local Llama models.
At first it starts generating some code, and somewhere in the middle I receiv…
-
Hi
Using Ubuntu 22.
Both nvcc --version and nvidia-smi show valid output.
I've noticed that the GPU is not utilized when running larger models (e.g., Mixtral 8x7B, Llama 70B), …
-
There have been many discussions in the community regarding support for multiple models.
- ChatGPTNextWeb#3484
- ChatGPTNextWeb#3923
- ChatGPTNextWeb#960
- ChatGPTNextWeb#3431
- ChatGPTNextWeb#…
-
Llama 3.1
https://ai.meta.com/blog/meta-llama-3-1/
https://ai.meta.com/research/publications/the-llama-3-herd-of-models/
https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/
A…
-
Hello, this is minimal Rust code comparable with llama.c; however, in terms of speed, how much slower is it compared with other pure-Rust libraries?
AFAIK, there are libraries such as mistral.rs that do almost the same thing.
-
**Is your feature request related to a problem? Please describe.**
We extend OpenAIChatGenerator for MistralChatGenerator. This works for chat completion but not for function calling. Mistral's funct…
-
## AAAI-24
Benchmarking Large Language Models in Retrieval-Augmented Generation
https://arxiv.org/abs/2309.01431
Hot or Cold? Adaptive Temperature Sampling for Code Generation with Large Langua…