-
The tokenizer evaluation section only reports the tokenizer's own intrinsic metrics, such as compression rate.
However, a tokenizer with a higher compression rate does not necessarily yield a better model. Could you also report results at the level of the final model?
For example, the BLEU scores in the SentencePiece experiments:
https://github.com/google/sentencepiece/blob/master/doc/experiments.md#english…
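To make the metric in question concrete: "compression rate" is often measured as characters per token. A minimal sketch, using a whitespace split as a stand-in for a real tokenizer such as SentencePiece (the function name and sample text are illustrative, not from the original post):

```python
# "Compression rate" here means characters per token: higher = fewer tokens
# for the same text. The whitespace split stands in for a real tokenizer.
def compression_rate(text: str, tokenize) -> float:
    tokens = tokenize(text)
    return len(text) / max(len(tokens), 1)

rate = compression_rate("ab cd", str.split)  # 5 chars / 2 tokens = 2.5
# A higher rate means the tokenizer packs more characters per token, but
# only downstream metrics (e.g. BLEU) show whether the model itself improves.
```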
-
### Describe the bug
Ubuntu 20.04
Python 3.11
xinference (latest)
### To Reproduce
To help us reproduce this bug, please provide the information below:
2023-12-08 11:41:10,825 - modelscope - INFO - PyTo…
-
## DSPy and ColBERT with Omar Khattab! - Weaviate Podcast - 85
[0:00](https://www.youtube.com/watch?v=CDung1LnLbY&t=0s) Weaviate at NeurIPS 2023!
[0:38](https://www.youtube.com/watch?v=CDung1LnLbY…
-
As part of the Llama 3.1 release, Meta is releasing an RFC for ‘Llama Stack’, a comprehensive set of interfaces/APIs for ML developers building on top of Llama foundation models. We are looking for f…
-
Let's go into self-reflective, auto-semiotic, stream-of-consciousness, free-style, note-taking, neologism-constructing mode. Consider the construction of the polynomial, each prime base carefully chosen…
-
I'm currently trying out the ollama app on my iMac (i7/Vega64) and I can't seem to get it to use my GPU.
I have tried running it with num_gpu 1 but that generated the warnings below.
`
2023/11/…
-
If I combine multiple strategies such as GPTQ + LLM-Pruner + LoRA, could the compression ratio of the LLM be greatly improved while still keeping acceptable performance?
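Back-of-envelope arithmetic for why stacking might help: quantization and pruning shrink the model along different axes, so their size reductions roughly multiply, while LoRA adds a small adapter rather than shrinking the base. The factors below are assumed illustrative numbers, not measured results, and real interaction effects (e.g. accuracy loss compounding) are not modeled:

```python
# Assumed numbers for illustration only:
#   4-bit GPTQ vs fp16 weights  -> ~4x smaller
#   LLM-Pruner removing ~20% of parameters -> keep_fraction = 0.8
def combined_ratio(quant_factor: float, keep_fraction: float) -> float:
    """Multiplicative size reduction from quantization + pruning."""
    return quant_factor / keep_fraction

ratio = combined_ratio(quant_factor=4.0, keep_fraction=0.8)  # -> 5.0x
```

Whether the quality stays acceptable after stacking is an empirical question; the techniques can interact (pruning a quantized model is not the same as quantizing a pruned one).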
-
My reproduction of the results on location 9 of the NQ dataset in the LongLLMLingua paper using the prompt compressor resulted in a large discrepancy from the original results. My hyperparameters are …
-
Description
When running the code, we successfully obtain a compressed model. However, when prompted with an input, the model generates random and repetitive outputs, often repeating the same letters…
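One way to make "repetitive output" concrete when reporting this kind of regression is to score the fraction of duplicated character n-grams in a generation. This helper is a hypothetical diagnostic, not part of the project being discussed:

```python
# Illustrative diagnostic: fraction of duplicated character n-grams.
# 0.0 = no repeated n-grams; values near 1.0 = highly repetitive text,
# e.g. the same letters repeated over and over.
def repetition_score(text: str, n: int = 3) -> float:
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    return 1.0 - len(set(grams)) / len(grams)

repetition_score("aaaaaa")   # -> 0.75 (one unique trigram out of four)
repetition_score("abcdef")   # -> 0.0  (all trigrams unique)
```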
-
Hi there!
First of all, I appreciate the team for putting in the work for this research paper. I would like to preface this by saying that my comments here should just be considered a point of disc…