-
# dzNodes: LayerStyle -> Error loading model large-PromptGen-v1.5: The checkpoint you are trying to load has model type `florence2` but Transformers does not recognize this architecture. This could be…
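Since the `florence2` architecture is not built into Transformers, the commonly reported workaround is to load the checkpoint with `trust_remote_code=True` so the modeling code bundled with the checkpoint is used. A minimal sketch (the model id below is illustrative, not taken from this issue, and the actual load call is commented out since it downloads weights):

```python
# Sketch of the usual workaround for "model type `florence2` but Transformers
# does not recognize this architecture": pass trust_remote_code=True.
load_kwargs = {
    "trust_remote_code": True,  # required: florence2 ships its own modeling code
    "torch_dtype": "auto",
}

# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "microsoft/Florence-2-large",  # illustrative model id
#     **load_kwargs,
# )
```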
-
edit: both 3.1 and 3.2 fail
- 3.1 support is fixed by `pip install transformers==4.43.2` - thank you, @jinxiangshi
- 3.2 isn't yet supported by TRT-LLM - @laikhtewari promised to update the docs to …
-
**Describe the bug**
When I set `export FMOE_FASTER_SHADOW_ENABLE=1` and `export FMOE_FASTER_SCHEDULE_ENABLE=1` to turn on Smart Schedule, and then `bash exa…`

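For reference, this is the setup being enabled (the launch script's path is elided in the report, so it is left out here):

```shell
# Enable FastMoE's Smart Schedule before launching the example script.
export FMOE_FASTER_SHADOW_ENABLE=1
export FMOE_FASTER_SCHEDULE_ENABLE=1
# ...then launch the training script (path elided in the report).
echo "shadow=$FMOE_FASTER_SHADOW_ENABLE schedule=$FMOE_FASTER_SCHEDULE_ENABLE"
```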
-
The basic idea and methodology of CutAddPaste are highly similar to the ICLR 2023 paper AnomalyBERT [1]. However, there is no reference to AnomalyBERT, and I can't find any discussion regarding the differ…
-
Hi there,
I was struggling with how to run quantization with AutoAWQ, as mentioned on the home page. I was trying to quantize the 7B Qwen2-VL model, but even using 2 A100 80 GB GPUs I still get a CUDA OOM…
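For context, a minimal AutoAWQ quantization sketch. The config values below are common defaults rather than anything from this issue, the model id is illustrative, and the load/quantize calls are commented out since they need the weights and GPUs:

```python
# Hedged sketch of 4-bit AWQ quantization with AutoAWQ.
# These quant_config keys/values are commonly used defaults, not from this issue.
quant_config = {
    "zero_point": True,
    "q_group_size": 128,
    "w_bit": 4,
    "version": "GEMM",
}

# from awq import AutoAWQForCausalLM
# from transformers import AutoTokenizer
# model_id = "Qwen/Qwen2-VL-7B-Instruct"  # illustrative model id
# model = AutoAWQForCausalLM.from_pretrained(model_id)
# tokenizer = AutoTokenizer.from_pretrained(model_id)
# model.quantize(tokenizer, quant_config=quant_config)
```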
-
### Issue
tl;dr - I think sent tokens may be double-counted on the summary/cost-estimate line, and I think aider isn't accounting for the cached-input rate when calculati…
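To illustrate the cached-input point with made-up numbers (the rates and token counts below are hypothetical, not aider's or any provider's actual figures): if cached input tokens are billed at a discounted rate, pricing every sent token at the full input rate overstates the cost.

```python
# Hypothetical pricing illustration; none of these numbers come from the issue.
input_rate = 3.00 / 1_000_000    # $ per input token (full rate, assumed)
cached_rate = 0.30 / 1_000_000   # $ per cached input token (discounted, assumed)

sent_tokens = 100_000
cached_tokens = 80_000           # portion of sent tokens served from the cache
uncached_tokens = sent_tokens - cached_tokens

# Naive estimate: every token billed at the full input rate.
naive_cost = sent_tokens * input_rate
# Correct estimate: cached tokens billed at the discounted rate.
correct_cost = uncached_tokens * input_rate + cached_tokens * cached_rate

print(round(naive_cost, 4))    # 0.3
print(round(correct_cost, 4))  # 0.084
```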
-
I prepared the environment following the README, but inference fails with the following error:
Traceback (most recent call last):
File "/mnt/localdisk/tanm/miniconda/envs/table_llava/lib/python3.10/site-packages/transformers/feature_extraction_utils.py", li…
-
### Area of Improvement
Why is the following section in the docs correct:
> Here [superjson](https://github.com/blitz-js/superjson) is used for uploading and [devalue](https://github.com/Rich-Harr…
-
Content is extracted when a developer binds an extractor to a data repository. As new content lands, the extractors are applied to it and the derived information is written to indexes.
Ext…
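The bind-then-apply flow described above could be sketched like this (all class and method names here are hypothetical, not the project's actual API):

```python
# Minimal in-memory sketch of extractors bound to a data repository.
# Names (Repository, bind, add_content) are hypothetical.
class Repository:
    def __init__(self):
        self.extractors = []  # extractors bound by the developer
        self.index = {}       # derived information, keyed by extractor name

    def bind(self, name, extractor):
        """Bind an extractor; it will run on all future content."""
        self.extractors.append((name, extractor))

    def add_content(self, content):
        # As new content lands, every bound extractor is applied and
        # its derived output is written to the corresponding index.
        for name, extractor in self.extractors:
            self.index.setdefault(name, []).append(extractor(content))

repo = Repository()
repo.bind("word_count", lambda text: len(text.split()))
repo.add_content("new content lands here")
print(repo.index["word_count"])  # [4]
```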
-
Description: When running inference on the distilbert-base-uncased model using the NPU on Snapdragon® X Elite (X1E78100 - Qualcomm®) through ONNX Runtime's QNNExecutionProvider, the model fails to inf…
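For reference, a session targeting the NPU via the QNN execution provider is typically created as below. This is a sketch: the model path and the `backend_path` value are assumptions, and the session call is commented out since it needs the QNN SDK and Snapdragon hardware:

```python
# Hedged sketch: selecting ONNX Runtime's QNN EP with a CPU fallback.
# backend_path points at the HTP (NPU) backend library; the value is an assumption.
qnn_options = {"backend_path": "QnnHtp.dll"}
providers = [
    ("QNNExecutionProvider", qnn_options),
    "CPUExecutionProvider",  # fallback if the NPU rejects an op or fails
]

# import onnxruntime as ort
# sess = ort.InferenceSession("distilbert-base-uncased.onnx", providers=providers)
```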