-
Hello,
If I understand correctly, when doing linear probing, you only train the last FC layer.
But in the classification head of the ViT, the last FC layer uses the class token, which has not been…
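For reference, linear probing usually means freezing the entire pretrained backbone (including the class token and its attention layers) and training only a fresh linear head on the frozen features. A minimal PyTorch sketch, assuming a timm ViT; the model name and hyperparameters are illustrative, not from the original post:
```
import torch
import timm

# Pretrained ViT backbone; num_classes=0 makes forward() return pooled
# features (for ViT, the [CLS] token representation by default).
backbone = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=0)

# Freeze every backbone parameter, class token included.
for p in backbone.parameters():
    p.requires_grad = False
backbone.eval()

# The only trainable module: a new linear classifier on the frozen features.
head = torch.nn.Linear(backbone.num_features, 1000)
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)

def probe_step(images, labels):
    with torch.no_grad():            # backbone stays frozen
        feats = backbone(images)     # [B, num_features] from the class token
    logits = head(feats)
    loss = torch.nn.functional.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()                  # gradients flow only into the head
    optimizer.step()
    return loss
```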
-
Hi Tianhong, thank you for your inspiring work! While reading the paper, I had some questions regarding the term “MAR.” Aside from the difference mentioned in the paper, where the next set of tokens in…
-
Hello,
thank you for your work and the provided code! When do you plan to release the code for RetroMAE v2?
-
When I use AIMET AutoQuant to quantize my model, I run into the following issue:
- Prepare Model
```
Traceback (most recent call last):
  File "/workspace/aimet/build/staging/universal/lib/python/aimet_torch/…
```
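For context, the traceback above is raised in AIMET's model-preparation stage, which AutoQuant runs before quantization. A minimal sketch of that stage in isolation, assuming aimet_torch's `prepare_model` API (module paths and behavior can differ between AIMET versions; the model here is a placeholder, not the reporter's network):
```
import torch
from aimet_torch.model_preparer import prepare_model

# Placeholder model; substitute the actual network being quantized.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3),
    torch.nn.ReLU(),
)
model.eval()

# prepare_model rewrites functional ops into nn.Modules so AIMET can
# insert quantization ops; this is where the error above originates.
prepared = prepare_model(model)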
-
Pre-training:
1. Hyperlink-induced Pre-training for Passage Retrieval in Open-domain Question Answering (ACL 2022)
2. RetroMAE v2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Langua…
-
Hi,
I have a question regarding the use of the Temporal Fusion Transformer (TFT) model.
Is it possible to effectively use the TFT model without providing past target values in the known or unknow…
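For what it's worth, in pytorch-forecasting the split between known and unknown covariates is declared on the `TimeSeriesDataSet`, and the target is conventionally listed under `time_varying_unknown_reals`, which is why TFT normally expects past target values. A minimal sketch, assuming the pytorch-forecasting API; column names and lengths are illustrative:
```
import pandas as pd
from pytorch_forecasting import TimeSeriesDataSet, TemporalFusionTransformer

# Tiny long-format frame: one series, 40 time steps.
df = pd.DataFrame({
    "series_id": ["a"] * 40,
    "time_idx": list(range(40)),
    "value": [float(i % 7) for i in range(40)],
    "price": [1.0] * 40,
})

training = TimeSeriesDataSet(
    df,
    time_idx="time_idx",
    target="value",
    group_ids=["series_id"],
    max_encoder_length=24,
    max_prediction_length=6,
    # Covariates whose future values are known at prediction time.
    time_varying_known_reals=["time_idx", "price"],
    # Covariates observed only in the past; the target usually goes here.
    time_varying_unknown_reals=["value"],
)

tft = TemporalFusionTransformer.from_dataset(training, hidden_size=16)
```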
-
### Issue Description
Hello,
I am trying to generate an explanation of abstractive text-summarization output for a long input text. I have been using various transformers models, e.g. Big…
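One commonly shown pattern for this is SHAP's text-to-text explainer, which takes a seq2seq model and its tokenizer directly and attributes each generated output token back to input tokens. A minimal sketch, assuming the shap and transformers APIs; the checkpoint name is just an example, not the poster's model:
```
import shap
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "sshleifer/distilbart-xsum-12-6"  # example summarization checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

# SHAP builds a text masker from the tokenizer and explains which input
# tokens drive each token of the generated summary.
explainer = shap.Explainer(model, tokenizer)
shap_values = explainer(["<long input document to summarize>"])

shap.plots.text(shap_values)  # token-level attribution visualization
```
Note that for very long inputs the explanation cost grows with the number of input tokens, so long documents can be slow to explain.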
-
### General
- [x] Prepare scaling plots by the end of February. Y-axis: the speedup we get when running one epoch through the model on 2, 4, 6, 8, and 10 GPUs (see the sketch after this list)
- [x] Find out how many samples we have in the …
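As a reference for the plot, speedup is conventionally defined as single-GPU epoch time divided by N-GPU epoch time, S(N) = T(1) / T(N). A minimal matplotlib sketch; the timing values are placeholders, not measured results:
```
import matplotlib.pyplot as plt

gpus = [2, 4, 6, 8, 10]
# Placeholder wall-clock times (seconds) for one epoch; replace with measurements.
t_1gpu = 1000.0
t_ngpu = [520.0, 270.0, 190.0, 150.0, 130.0]

# Speedup relative to the single-GPU baseline: S(N) = T(1) / T(N).
speedup = [t_1gpu / t for t in t_ngpu]

plt.plot(gpus, speedup, marker="o", label="measured")
plt.plot(gpus, gpus, linestyle="--", label="ideal linear")
plt.xlabel("GPUs")
plt.ylabel("speedup (one epoch)")
plt.legend()
plt.savefig("scaling.png")
```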
-
## [LangChain Development](https://app.pluralsight.com/library/courses/langchain-development/table-of-contents)
by [Tom Taulli](https://app.pluralsight.com/profile/author/tom-taulli)
founder: H…
-
Even with this parameter added, it still runs out of GPU memory...
```
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-…
```