-
Right now, all media loading is done in parallel, which isn't ideal and can result in unnecessary dropped frames (observed by @aubilenon).
In an ideal world:
- high priority: media frames that wil…
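The priority scheme described above could be sketched with a simple priority queue. This is an illustrative sketch only; `MediaLoader`, `HIGH`, and `LOW` are hypothetical names, not the project's actual API:

```python
import heapq

# Lower value = higher priority; these labels are assumptions for the sketch.
HIGH, LOW = 0, 1

class MediaLoader:
    """Drain high-priority frame loads first instead of starting
    every media load in parallel (which can drop frames)."""

    def __init__(self):
        self._queue = []
        self._counter = 0  # tie-breaker so equal priorities stay FIFO

    def submit(self, priority, task):
        heapq.heappush(self._queue, (priority, self._counter, task))
        self._counter += 1

    def run(self):
        # Pop tasks in priority order; high-priority loads always run
        # before any low-priority ones that were submitted earlier.
        results = []
        while self._queue:
            _, _, task = heapq.heappop(self._queue)
            results.append(task())
        return results
```

A real implementation would run tasks on a bounded worker pool rather than serially, but the ordering guarantee is the point of the sketch.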
-
### News
- Conferences
	- AAAI 2023: Washington DC (Feb 7-14)
- [Google Cloud partners with Anthropic to counter the MS + OpenAI pairing?](https://www.googlecloudpresscorner.com/2023-02-03-Anthropic-Forges-Partnership…
-
### System Info
Python 3.10.11
transformers 4.40.0
torch 2.0.1
Linux version 4.15.0-55-generic x86_64
### Who can help?
@ArthurZucker @gante
### Information
- [ ] The official example scripts
…
-
To the best of my knowledge, speculative decoding does not change the decoding result when using greedy decoding. However, I noticed that the rouge2 metrics of 'base' and 'essg' may be different in th…
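The greedy-equivalence argument can be made concrete with a minimal sketch of the verification step. This is a hypothetical helper for illustration, not the transformers implementation: with greedy decoding, a draft token is accepted only if it equals the target model's argmax at that position, and the first mismatch is replaced by the target's own choice, so the output token sequence is identical to target-only greedy decoding.

```python
import numpy as np

def greedy_speculative_step(draft_tokens, target_logits):
    """Verify draft tokens against the target model under greedy decoding.

    Accept each draft token while it matches the target's argmax; on the
    first mismatch, emit the target's own argmax token and stop.
    """
    accepted = []
    for i, tok in enumerate(draft_tokens):
        target_choice = int(np.argmax(target_logits[i]))
        if tok == target_choice:
            accepted.append(tok)
        else:
            accepted.append(target_choice)  # correction token from the target
            break
    return accepted
```

Because every emitted token is the target's argmax, any rouge2 difference between runs should come from something other than the greedy decoding rule itself (e.g. numerics or batching).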
-
### System Info
transformers==4.39.1
python==3.8.17
torch==2.0.1+cpu
### Who can help?
@sanchit-gandhi
### Information
- [ ] The official example scripts
- [ ] My own modified scr…
-
### Your current environment
```
Collecting environment information...
INFO 05-23 16:19:36 pynccl.py:58] Loading nccl from library librccl.so.1
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/t…
-
### 🚀 The feature, motivation and pitch
Currently, vllm with Speculative Decoding requires that the draft model and target model have the same vocab size. However, the target model may have a large…
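One common workaround for mismatched vocab sizes, sketched below with a hypothetical helper (this is not vLLM's actual code), is to pad the draft model's logits up to the target vocab size with `-inf`, so the extra target-only tokens receive zero probability from the draft:

```python
import numpy as np

def pad_draft_logits(draft_logits, target_vocab_size):
    """Pad draft logits to the target vocab size with -inf so that
    token ids outside the draft vocab get zero probability after softmax."""
    draft_vocab = draft_logits.shape[-1]
    if draft_vocab >= target_vocab_size:
        # Draft vocab is already large enough; truncate to match.
        return draft_logits[..., :target_vocab_size]
    pad_width = target_vocab_size - draft_vocab
    pad = np.full(draft_logits.shape[:-1] + (pad_width,), -np.inf)
    return np.concatenate([draft_logits, pad], axis=-1)
```

This only makes the shapes compatible; it assumes the two tokenizers agree on the shared id range, which is the harder requirement in practice.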
-
After training, the output folder only contains files like `meta_model_0.pt`. If I try to use the vllm server to serve this model like this: `python -m vllm.entrypoints.openai.api_server --model finetuned_…
-
- I changed the [batch size](https://github.com/flexflow/FlexFlow/blob/inference/inference/models/opt.cc#L71) to 2.
- Then I used the command below to run opt-6.7b:
`../build/inference/spec…
-
The following program encodes that same ASCII string using a naive approach and using the actual `UTF8.encode()`. The naive approach is about 3 times faster. Could UTF8 be optimized to provide better pe…
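The report refers to a `UTF8.encode()`-style API; the Python sketch below (illustrative only, not the benchmark from the report) shows why an ASCII-only fast path can beat a general encoder: for code points below 128, each character is exactly one byte, so the multi-byte branching of full UTF-8 is unnecessary work.

```python
def naive_ascii_encode(s: str) -> bytes:
    """ASCII fast path: every code point below 128 maps to one byte,
    so no multi-byte sequence logic is needed."""
    return bytes(ord(c) for c in s)

def general_utf8_encode(s: str) -> bytes:
    """General UTF-8 encoding via the standard library, which must
    branch on code-point ranges to emit 1-4 byte sequences."""
    return s.encode("utf-8")

# For pure-ASCII input, both produce identical bytes; the general
# encoder just pays for checks the naive path skips.
```

An optimized encoder can get the best of both by scanning for the first non-ASCII character and memcpy-ing the ASCII prefix before falling back to the general path.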