-
### Python -VV
```shell
Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
```
### Pip Freeze
```shell
accelerate==0.33.0
addict==2.4.0
annotated-types==0.7.0
apex @ file:///data2/apex
…
-
Hi there,
Using the following code (following the example):
```
import os
import sys
import torch
sys.path.append(os.path.dirname(__file__))
from simul_whisper.transcriber.config import Al…
-
I just copied the code from the README and installed the LLama NuGet package with the CPU-only backend, but it always throws
System.AccessViolationException: "Attempted to read or write protected …
-
Officially, multi-language support is still not implemented in distil-whisper.
But I noticed that the esteemed @sanchit-gandhi uploaded a German model for distil-whisper to HuggingFace, called 'di…
-
How do I run the Stanford Sentiment Treebank (SST-2) task with BERT?
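One common route (a sketch, not from this thread) is the `run_glue.py` example script that ships with Hugging Face Transformers, which supports SST-2 via `--task_name sst2`. The model choice, hyperparameters, and output path below are illustrative:

```shell
# Fine-tune and evaluate BERT on SST-2 with the Transformers GLUE example script.
# Assumes transformers (with its examples) and datasets are installed.
python run_glue.py \
  --model_name_or_path bert-base-cased \
  --task_name sst2 \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir ./sst2-bert    # hypothetical output directory
```

The script downloads the SST-2 split from the GLUE benchmark automatically and reports validation accuracy after training.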
-
When using the non-streaming interface, I can obtain the number of tokens returned from `chatCompletions.getUsage().getTotalTokens()`. However, how can I determine the number of tokens returned when u…
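One workaround (a sketch, not specific to this SDK) is to accumulate the streamed content deltas yourself and tokenize the completed reply client-side. The chunk format and the whitespace "tokenizer" below are simulated stand-ins; an accurate count would use the model's real tokenizer:

```python
# Sketch: counting tokens of a streamed chat completion client-side.
# The stream below is simulated; real chunks would come from the SDK.
def simulated_stream():
    # Each chunk carries a partial piece of the assistant's reply.
    for delta in ["The quick ", "brown fox ", "jumps over ", "the lazy dog."]:
        yield {"choices": [{"delta": {"content": delta}}]}

def count_streamed_tokens(stream, tokenize):
    """Accumulate streamed content deltas, then tokenize the full reply."""
    parts = []
    for chunk in stream:
        content = chunk["choices"][0]["delta"].get("content")
        if content:
            parts.append(content)
    full_text = "".join(parts)
    return len(tokenize(full_text)), full_text

# Stand-in tokenizer: whitespace split. Swap in the model's tokenizer
# (e.g. tiktoken for OpenAI models) for an accurate count.
tokens, text = count_streamed_tokens(simulated_stream(), str.split)
print(tokens)  # 9 whitespace-delimited "tokens" in this simulated reply
```

Depending on the API version, the service may also be able to attach usage to the final streamed chunk (the OpenAI chat completions API does this when `stream_options` requests `include_usage`); whether the SDK you are using exposes that is worth checking before counting manually.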
-
Ever since upgrading to v3.0.0, I've been seeing pr-tracker-fetcher take a *really* long time to insert landings. For example, here's some info about a current run:
```
[root@clark:~]# ps -efw | g…
jfly updated 5 months ago
-
If I want to work with multimodal LLMs that take in a set of embeddings from vision/audio encoders, what is the proper way of feeding them into an LLM running on exllamav2?
Can I just add a custo…
-
Using the official inference code; the model is the official 33B.
```
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-base", trust_remote_cod…
-
I updated Ollama from 0.1.16 to 0.1.18 and encountered this issue.
I am using Python to run LLM models with Ollama and Langchain on a Linux server (4 × A100 GPUs).
There are 5,000 prompts to ask and get…
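For that many prompts, one pattern (a sketch under assumptions: the payload shape matches Ollama's REST `/api/generate` endpoint, but the model name and batch size here are illustrative) is to build non-streaming request payloads and send them in bounded batches rather than queueing all 5,000 at once:

```python
import json

def build_ollama_payloads(prompts, model="llama2", batch_size=8):
    """Group prompts into batches of non-streaming /api/generate payloads.

    Sending bounded batches keeps the server from queueing thousands of
    concurrent generations at once."""
    batches = []
    for i in range(0, len(prompts), batch_size):
        batch = [
            {"model": model, "prompt": p, "stream": False}
            for p in prompts[i:i + batch_size]
        ]
        batches.append(batch)
    return batches

# 5,000 prompts -> 625 batches of 8 payloads each.
prompts = [f"Question {n}" for n in range(5000)]
batches = build_ollama_payloads(prompts)
print(len(batches), len(batches[0]))  # 625 8
# Each payload is plain JSON for POST http://localhost:11434/api/generate
print(json.dumps(batches[0][0]))
```

Setting `"stream": False` makes each response arrive as a single JSON object, which simplifies collecting answers for a large prompt set.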