-
Description: When running inference on the distilbert-base-uncased model using the NPU on Snapdragon® X Elite (X1E78100 - Qualcomm®) through ONNX Runtime's QNNExecutionProvider, the model fails to inf…
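For anyone trying to reproduce this, a session that prefers the QNN HTP (NPU) backend is typically created as in the sketch below. The `backend_path` value and the model filename are assumptions for illustration, not taken from this report; `CPUExecutionProvider` is listed last so unsupported ops can fall back and the failure can be isolated:

```python
# Hypothetical repro sketch: an ONNX Runtime session that prefers the QNN HTP
# (NPU) backend on Windows-on-Snapdragon and falls back to CPU. Paths are
# placeholders, not taken from the original report.
providers = [
    ("QNNExecutionProvider", {"backend_path": "QnnHtp.dll"}),  # NPU backend (assumed path)
    "CPUExecutionProvider",  # fallback for ops the NPU backend rejects
]

def build_session(model_path="distilbert-base-uncased.onnx"):
    # Import inside the function so the provider list above can be inspected
    # even on machines without onnxruntime installed.
    import onnxruntime as ort
    return ort.InferenceSession(model_path, providers=providers)
```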
-
- [x] Measure and record current performance.
- [x] Rebase the model onto main and ensure PCC = 0.99
- [x] Port functionality to n300 card (single device)
- [x] Provide Op Report
- [x] Check Model into…
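For reference, the PCC in the checklist above is the Pearson correlation coefficient between the golden (host) output and the device output. A minimal check, assuming both outputs are float tensors that get flattened before comparison, looks like:

```python
import numpy as np

def pcc(golden, actual):
    """Pearson correlation coefficient between two tensors, flattened."""
    g = np.asarray(golden, dtype=np.float64).ravel()
    a = np.asarray(actual, dtype=np.float64).ravel()
    return float(np.corrcoef(g, a)[0, 1])

# A device output that differs only by small noise still clears the 0.99 bar.
golden = np.linspace(-1.0, 1.0, 1000)
actual = golden + np.random.default_rng(0).normal(0, 0.01, golden.shape)
assert pcc(golden, actual) >= 0.99
```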
-
Hello, I created a test script, which I was testing on an AArch64 platform, for distilbert inference using the Wanda sparsifier:
```python
import torch
from transformers import BertForSequenceClassificatio…
```
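For context, independent of any particular sparsifier API, Wanda scores each weight by |W_ij| · ‖X_j‖₂ (weight magnitude times the norm of the corresponding input activation) and prunes the lowest-scoring weights within each output row. A minimal NumPy sketch of that metric, assuming a precomputed per-feature activation norm:

```python
import numpy as np

def wanda_mask(W, act_norms, sparsity=0.5):
    """Keep-mask for Wanda pruning: score S_ij = |W_ij| * ||X_j||_2,
    pruning the lowest-scoring fraction of weights in each output row."""
    S = np.abs(W) * act_norms[None, :]           # broadcast per input feature
    k = int(W.shape[1] * sparsity)               # weights to prune per row
    prune_idx = np.argsort(S, axis=1)[:, :k]     # lowest-importance columns
    mask = np.ones_like(W, dtype=bool)
    np.put_along_axis(mask, prune_idx, False, axis=1)
    return mask

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
norms = rng.uniform(0.5, 2.0, size=8)
mask = wanda_mask(W, norms, sparsity=0.5)       # exactly half of each row kept
```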
-
Dear quanto folks,
I implemented quantization as suggested in your coding example [quantize_sst2_model.py](https://github.com/huggingface/optimum-quanto/blob/main/examples/nlp/text-classification/s…
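For readers unfamiliar with the scheme, the core of int8 weight quantization can be sketched with plain symmetric per-tensor quantization. This is a simplified illustration under that assumption, not quanto's actual implementation:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: a simplified sketch,
    not quanto's actual implementation."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(16, 16)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = np.abs(w - w_hat).max()   # bounded by half a quantization step
```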
-
- [ ] Adapt the distilbert data parallel pipeline to function based on the available machine.
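One common way to make the pipeline adapt is to size the worker pool from whatever accelerators are visible and fall back to CPU cores; `available_workers` below is a hypothetical helper, not part of the existing pipeline:

```python
import os

def available_workers():
    """Hypothetical helper: number of data-parallel workers to launch,
    preferring visible CUDA devices and falling back to CPU cores."""
    try:
        import torch
        n_gpus = torch.cuda.device_count()
        if n_gpus > 0:
            return n_gpus
    except ImportError:
        pass  # torch not installed: treat as a CPU-only machine
    return max(1, os.cpu_count() or 1)
```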
-
First of all, thank you for sharing your code. I have successfully reproduced your work on ResNet and DenseNet, but I ran into some trouble reproducing BERT.
1. I created the conda environment based on requirmen…
-
```python
import torch
import torch_directml
from transformers import AutoModelForSequenceClassification, AutoTokenizer

dml = torch_directml.device()
# -----------…
```
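Since `torch_directml` only exists on Windows with a DirectX 12 capable GPU, the device selection at the top can be guarded so the same script runs elsewhere; a minimal sketch:

```python
# Fall back to CPU when the DirectML package is unavailable, so the same
# script also runs outside Windows/DirectX environments. A plain string
# device spec is accepted by tensor/module .to() calls.
try:
    import torch_directml
    device = torch_directml.device()
except ImportError:
    device = "cpu"
```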
-
In the views.py file we are loading this model, but I am unable to locate it. Do we have to download it externally? I tried downloading it through 'https://huggingface.co/distilbert/distilber…
-
I noticed that several things are incorrectly implemented.
```python
classifier = pipeline("sentiment-analysis", device="cpu",
                      model="distilbert/distilbert-base-uncased-fin…
```
-
Hey folks,
I ran your example file for static quantization using distilbert-base-uncased-finetuned-sst-2-english on the "sst2" dataset. The quantized model was 90% faster than the single precision…
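When quoting speedups like this, it helps to report a median over repeated runs after a warm-up phase rather than a single timing. A generic latency harness, with the model call replaced by a placeholder workload, might look like:

```python
import time

def median_latency_ms(fn, warmup=3, runs=20):
    """Median wall-clock latency of fn() in milliseconds, after warm-up."""
    for _ in range(warmup):
        fn()  # warm caches, JIT, allocator, etc.
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        times.append((time.perf_counter() - t0) * 1e3)
    times.sort()
    return times[len(times) // 2]

# Placeholder workload standing in for a model forward pass.
baseline = median_latency_ms(lambda: sum(i * i for i in range(20_000)))
```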