-
I noticed that a number of things are implemented incorrectly.
```python
classifier = pipeline("sentiment-analysis", device="cpu",
model="distilbert/distilbert-base-uncased-fin…
-
### Your current environment
Packages used for both finetuning and inference (vllm==0.3.2):
torch==2.1.2
accelerate==0.27.2
transformers==4.40.1
sentence_transformers==2.7.0
Description:
…
-
Hello, I wonder why special tokens are added in [blip2_vicuna_instruct.py](https://github.com/salesforce/LAVIS/blob/main/lavis/models/blip2_models/blip2_vicuna_instruct.py#L86-L89)?
```
self.llm_tokeni…
-
In paper titles with multiple sentences, the first word of non-initial sentences should probably be capitalized.
This happens e.g. here:
[How Good is Your Tokenizer? On the Monolingual Performance of Mult…
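A minimal sketch of such a fix, assuming the intended behavior is simply to uppercase the first letter of each word that follows sentence-ending punctuation (the function name is hypothetical):

```python
import re

def capitalize_subtitles(title: str) -> str:
    """Uppercase the first letter after sentence-ending punctuation in a title."""
    return re.sub(
        r"([?!.]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        title,
    )

print(capitalize_subtitles(
    "How Good is Your Tokenizer? on the Monolingual Performance of Multilingual Models"
))
# How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Models
```

Words not preceded by `?`, `!`, or `.` are left untouched, so correctly cased titles pass through unchanged.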
-
Certain fields in the tokenizer were not checked when exporting with the onnxruntime-extensions pnp module, causing a mismatch for `cls_token` and `sep_token`.
# code showing the difference
import onnxrunt…
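Independent of the export itself, the kind of mismatch described above can be surfaced by diffing the special-token fields of the two configurations. A sketch with hypothetical placeholder values (not the real exported ones):

```python
# Special-token fields from two tokenizer configurations; the values here
# are hypothetical placeholders standing in for the original and exported sides.
hf_config = {"cls_token": "[CLS]", "sep_token": "[SEP]", "pad_token": "[PAD]"}
onnx_config = {"cls_token": "<s>", "sep_token": "</s>", "pad_token": "[PAD]"}

# Collect every field whose value differs between the two sides.
mismatches = {
    field: (hf_config[field], onnx_config[field])
    for field in ("cls_token", "sep_token", "pad_token")
    if hf_config[field] != onnx_config[field]
}
print(mismatches)  # {'cls_token': ('[CLS]', '<s>'), 'sep_token': ('[SEP]', '</s>')}
```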
-
I am using `intfloat/e5-mistral-7b-instruct` model to get last hidden state for my input and compute cosine similarity.
I am using a toy example provided at: https://huggingface.co/intfloat/e5-mist…
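Whatever embeddings the model returns, the cosine-similarity step itself is model-independent and reduces to a normalized dot product; a self-contained sketch over plain lists:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for last-hidden-state embeddings.
print(cosine_similarity([1.0, 2.0], [1.0, 2.0]))  # 1.0 for identical directions
```

With real model output, the same function would be applied to the pooled last-hidden-state vectors of the two inputs.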
-
```python
def generate_tokenize_dataset_func(dataset_sample):
prompt = f"""
You are a helpful assistant.
The dataset is huggingface datasets.Dataset.
The first element of the…
-
(See #10)
The current tokenizer doesn't seem to be cutting it. For example, there are cases where we likely want a different tokenization (that is, the following should be full tokens):
* golgin-84
* VBA1)-deleted
* p…
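A hedged sketch of the desired behavior, using a simple whitespace-delimited pattern that leaves terms like those above intact as single tokens (a real replacement tokenizer would need more rules than this):

```python
import re

# Treat any run of non-whitespace characters as one token, so hyphenated
# and parenthesized terms are not split apart.
TOKEN_RE = re.compile(r"\S+")

def tokenize(text):
    return TOKEN_RE.findall(text)

print(tokenize("the golgin-84 and VBA1)-deleted variants"))
# ['the', 'golgin-84', 'and', 'VBA1)-deleted', 'variants']
```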
-
Hello! I'm trying to implement bert-base, but it's not clear to me how you generate the masks with the TapeTokenizer. This is my code:
```
model = ProteinBertModel.from_pretrained('bert-base')
tokeni…
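# Continuing in the same block: a hedged sketch of mask generation, assuming
# the model takes a 0/1 mask aligned with the token ids and that the padding
# id is 0 (both are assumptions here, so check the tokenizer's vocab).
PAD_ID = 0  # assumed padding id

def make_input_mask(token_ids):
    """1 for real tokens, 0 for padding positions, per sequence in the batch."""
    return [[int(tok != PAD_ID) for tok in seq] for seq in token_ids]

batch = [[2, 5, 8, 3, 0, 0],
         [2, 7, 3, 0, 0, 0]]
print(make_input_mask(batch))  # [[1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0]]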
-
So this is a strange one. I am stumped.
In a way, this is sort of like #416, but I confirmed that if Batch==1, then the problem does not occur. (See below.)
My inference loop looks like this
```
…