-
### System Info
A100
### Who can help?
@juney-nvidia
@ncomly-nvidia
@kaiyux
@byshiue
### Information
- [ ] The official example scripts
- [X] My own modified scripts
### Tasks
- [ ] An offic…
-
I am building a tool that extracts data from a potentially large JSON file. If the data is NDJSON, it is easy to read it line by line and extract data from each separate object. But if data is in a …
mitar updated
3 hours ago
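For the NDJSON case mentioned in this item, reading line by line might look roughly like the sketch below. It uses only the Python standard library; the helper name `iter_ndjson` and the sample records are hypothetical and only for illustration, and the tool's actual language and data are not stated in the excerpt.

```python
import io
import json
from typing import Any, Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of an NDJSON stream."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


# Hypothetical sample data; a real caller would pass an open file object instead.
sample = io.StringIO('{"id": 1, "name": "a"}\n\n{"id": 2, "name": "b"}\n')
names = [record["name"] for record in iter_ndjson(sample)]
```

Because the generator parses one line at a time, memory use is bounded by the largest single record rather than by the size of the whole file.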
-
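In the seamless_streaming_unity.yaml config file, I changed the char_tokenizer: and checkpoint: parameters to point to the weights I had already downloaded. Why does inference still download the weights when it runs?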
-
### Describe the issue
I followed the [Local-LLMs/](https://microsoft.github.io/autogen/blog/2023/07/14/Local-LLMs/) guide to deploy my local model,
but the output from the LLM is weird.
### Steps to reproduce
## lu…
-
Right now, stemming is done after the strings are split and converted to IDs:
https://github.com/xhluca/bm25s/blob/73c7dea9ea7f88a23a7fa9a94e9a7bca48669f1c/bm25s/tokenization.py#L152-L177
Howeve…
-
Training a BPE tokenizer from scratch, I am using Split pretokenization. In the below example, I split on each digit so that numbers are represented by the sequences of digits they are made of.
```…
-
**Description**
Since we now require `nltk` and the `punkt` tokenizer during the validation loop for chunking during streaming, we should either download and dist…
-
### System Info
Node.js 22.4.0
@xenova/transformers 2.17.2
### Environment/Platform
- [ ] Website/web-app
- [ ] Browser extension
- [X] Server-side (e.g., Node.js, Deno, Bun)
- [ ] Desktop app (e.…
-
### Python -VV
```shell
Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
```
### Pip Freeze
```shell
accelerate==0.33.0
addict==2.4.0
annotated-types==0.7.0
apex @ file:///data2/apex
…
-
Case 1: use tensorrtllm
python3 /tensorrtllm_backend/tensorrt_llm/examples/run.py --engine_dir "/data512/tensorrtllm_backend/triton_model_repo/tensorrt_llm/1/" \
--max_output_len 2048 \
…