-
Hi, are you going to release the pretraining code?
-
Hi,
I am looking for ImageNet-pretrained weights for the YOLOX backbone. I am specifically interested in the largest model, YOLOX-x. In a couple of other issues I've seen that the nano version can be train…
-
**Describe the bug**
Running *BERT* pretraining, I encountered two issues:
1. The error "TransformerEngine only supports softmax compute in FP32". I needed to add `--attention-softmax-in-fp32` to the model ar…
-
I was trying to reproduce the results of the original paper.
Though it is natural to pretrain the model's encoder on ImageNet, the decoder is trained task-specifically.
As far as I know, si…
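For reference, a minimal sketch of that split, assuming a torchvision ResNet-50 as the ImageNet-pretrained encoder and a purely hypothetical segmentation head as the task-specific decoder (not the paper's actual code):

```python
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

# Encoder: initialized from ImageNet-pretrained weights.
backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
encoder = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool + fc

# Decoder: task-specific head, trained from scratch (illustrative only).
num_classes = 21  # hypothetical, e.g. PASCAL VOC segmentation
decoder = nn.Sequential(
    nn.Conv2d(2048, 256, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(256, num_classes, kernel_size=1),
    nn.Upsample(scale_factor=32, mode="bilinear", align_corners=False),
)

model = nn.Sequential(encoder, decoder)
```

Only the decoder's parameters start from random initialization here; the encoder carries over whatever the ImageNet pretraining learned.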
-
Thanks for your great work! I noticed your paper mentioned "We use a processed subset containing 456K molecules from the ChEMBL database [24] for pretraining." Could you please release your pretrainin…
-
Thanks for this repo, and congratulations on the superb paper.
Would it be possible to share the pretraining code? Specifically, loading the model and fine-tuning the DINO pretraining step further?
…
-
Following up on https://github.com/huggingface/nanotron/issues/78#issue-2147747937:
I converted the weights as you mentioned, but unfortunately I cannot get the same sane outputs for the pre-tr…
-
What excellent work! Could you please share the GPU requirements (number of GPUs and memory) for pretraining and instruction tuning? Thanks.
-
You mentioned in the previous issue that we can load a pretrained model and convert Conv2d to PartialConv. How would you do this, given that the model structure is fixed in pretrained models? My model is:
```
class …
```
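For illustration, a minimal sketch of that conversion, assuming a `PartialConv2d` class that accepts the same constructor arguments as `nn.Conv2d` (as in NVIDIA's partialconv repo); the import path is hypothetical:

```python
import torch.nn as nn
# Hypothetical import: adjust to wherever your PartialConv2d lives.
from partialconv2d import PartialConv2d

def convert_conv2d_to_partial(module: nn.Module) -> nn.Module:
    """Recursively swap every nn.Conv2d in a (fixed, pretrained) model for a
    PartialConv2d of identical shape, reusing the pretrained parameters."""
    for name, child in module.named_children():
        if isinstance(child, nn.Conv2d):
            pconv = PartialConv2d(
                child.in_channels, child.out_channels,
                kernel_size=child.kernel_size, stride=child.stride,
                padding=child.padding, dilation=child.dilation,
                groups=child.groups, bias=child.bias is not None,
            )
            pconv.weight = child.weight      # keep pretrained weights
            if child.bias is not None:
                pconv.bias = child.bias      # keep pretrained bias
            setattr(module, name, pconv)     # replace in place
        else:
            convert_conv2d_to_partial(child)  # recurse into submodules
    return module
```

The idea is that the fixed architecture only needs to stay intact while the checkpoint is loaded; after loading, the in-memory modules can be rewritten freely, e.g. `model = convert_conv2d_to_partial(model)`.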
-
Which codebase did you use for training (pretraining & fine-tuning)?
Axolotl? Llama-factory? Or something else? (huggingface_trl does not seem to support pretraining.)
Could you share the configuration files used for training? Thanks!