-
### Issue you'd like to raise.
# The code for my model for sentiment analysis (this works, the problem is in the next part of my code)
from datasets import load_dataset,Dataset
from sentence_tran…
-
Does anyone have a rule of thumb for what size local GPU can be used to fine-tune with proprietary data? Or, is there another way to speed up the training that I'm not aware of?
I'm fine-tuning a m…
-
Running `download_ists.sh` with bash leaves the result empty; it only creates the ISTS folder. Could you please fix this or provide another source?
Thanks.
-
I want to learn an embedding model so that it can be used to compare if two images are of the same object possibly taken from a different angle. I have a dataset of several images of several objects …
-
@KennethEnevoldsen gave me the ContrastiveTensionLoss as an example of how one could do in batch-negatives for sampling, but as you can see in [this example](https://github.com/UKPLab/sentence-transfo…
-
Thanks for this repo. Out of curiosity, is this an independent implementation of the CoCa paper? If so, did you reproduce any of the paper's results to verify the correctness of the implementation?
-
Thanks for releasing the code for your amazing work!
I was trying to play with PCME/PCME++ a little bit. I have some confusion regarding the loss computation in distributed training. Specifically i…
-
### Models for spectrograms:
1. **ConvNeXT**: A pure convolutional model (ConvNet), inspired by the design of Vision Transformers, that claims to outperform them. (https://huggingface.co/docs/transfo…
-
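Hello!
In `loss = sup_loss + unsup_loss + contra_loss`, why is `dist.reduce` applied only to contra_loss, while sup_loss and unsup_loss are not reduced? I have also looked at how some other DDP training code computes the loss: some of it seems to call backward directly without any reduce, and some reduces first and then calls backward. Is there any difference between these two approaches?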
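
A minimal single-process sketch of the distinction being asked about, offered as an illustration rather than an answer from the repository authors. It assumes the documented behaviour of PyTorch's DistributedDataParallel, which averages gradients across ranks during backward. The toy model `y = w * x`, the two data shards, and the helper `local_loss_and_grad` are all hypothetical and do not correspond to `sup_loss`, `unsup_loss`, or `contra_loss` in the code under discussion. Under these assumptions, averaging the per-rank gradients already gives the gradient of the global mean loss, so reducing the loss value only changes the number that gets reported.

```python
# Single-process simulation of two DDP ranks (no torch needed).
# Hypothetical toy model: y = w * x, with a mean-squared-error loss per rank.

def local_loss_and_grad(w, shard):
    """Return (loss, dloss/dw) for one rank's shard of (x, y) pairs."""
    n = len(shard)
    loss = sum((w * x - y) ** 2 for x, y in shard) / n
    grad = sum(2 * (w * x - y) * x for x, y in shard) / n
    return loss, grad

w = 0.5
# Equal-sized shards, one per simulated rank (assumption: balanced batches).
shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]

per_rank = [local_loss_and_grad(w, shard) for shard in shards]

# Assumed DDP behaviour during backward: average the per-rank gradients.
ddp_grad = sum(grad for _, grad in per_rank) / len(per_rank)

# Averaging the loss values across ranks only affects the reported number.
reported_loss = sum(loss for loss, _ in per_rank) / len(per_rank)

# Reference: loss and gradient computed directly on all data at once.
global_loss, global_grad = local_loss_and_grad(
    w, [pair for shard in shards for pair in shard]
)

print(ddp_grad, global_grad)       # the two gradients agree
print(reported_loss, global_loss)  # the two loss values agree
```

With unequal shard sizes the simple average of per-rank means would no longer match the global mean exactly, which is one reason to treat this as a sketch only.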
-
## Keyword: efficient
### End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs
- **Authors:** Javier Campos, Zhen Dong, Javier Duarte, Amir Gholami, Michael W. Mahoney,…