-
Hi authors, thank you for this impressive work.
Is it possible to provide a pretraining script and a small sample of the processed data used for pretraining? I would like to try pretraining a model…
-
Following up on https://github.com/huggingface/nanotron/issues/78#issue-2147747937,
I converted the weights as you described, but unfortunately I cannot get the same sane outputs for the pre-tr…
-
After training, I put the .tar checkpoint at vit_load_path, but I get a missing-key error when I try to segment other data (like this: Missing key(s) in state_dict: "image_encoder.pos_embed", "image_encoder.patc…
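Since the full error is truncated, this is only an assumption: missing-key errors like this are often caused by a key-prefix mismatch between the saved checkpoint and the model, e.g. a `module.` prefix added by `nn.DataParallel` at training time. A minimal sketch of stripping such a prefix, with hypothetical stand-in keys and values:

```python
# Hedged sketch: missing keys in load_state_dict are frequently due to a
# wrapper prefix such as "module." (added by nn.DataParallel during training).
# The checkpoint keys and values below are hypothetical stand-ins.
ckpt = {
    "module.image_encoder.pos_embed": "tensor_a",
    "module.image_encoder.patch_embed.proj.weight": "tensor_b",
}

prefix = "module."
# Strip the prefix from every key that carries it, leave others unchanged.
stripped = {
    (k[len(prefix):] if k.startswith(prefix) else k): v
    for k, v in ckpt.items()
}
# The stripped dict can then be passed to model.load_state_dict(...).
```

If the prefixes match but keys are still missing, comparing `ckpt.keys()` against `model.state_dict().keys()` usually pinpoints the mismatch.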
-
In the Masked Pretraining section, there seems to be an issue with the way the CLIP model is loaded. In the `extract.ipynb` notebook, the code `model, _ = clip.load("ViT-B/16", device='cpu')` is used, but…
-
I would like to inquire about how the few-shot approach is specifically incorporated into your pretraining process. For instance, the paper mentions six different few-shot scenarios with 0, 4, 8, 16, …
-
Do you have a pre-trained model available? I would like to save time on training.
Also, how many hours did training take with epochs = 100?
-
Hi, does SwinTransformer V2 support SimMIM pretraining? This is shown in the paper:
https://arxiv.org/pdf/2111.09883.pdf
If not, are there plans to add it, and how difficult would it be to port?
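For context, the core of SimMIM pretraining is masking a random subset of image patches and training the model to reconstruct their raw pixels. A minimal sketch of the masking step only; the patch count and mask ratio here are hypothetical, not taken from any Swin V2 config:

```python
import random

# Hedged sketch of SimMIM-style random patch masking.
# num_patches and mask_ratio are illustrative values, not real config.
random.seed(0)  # deterministic only for illustration
num_patches = 16
mask_ratio = 0.5

# Choose which patches to hide from the encoder.
masked = set(random.sample(range(num_patches), int(num_patches * mask_ratio)))

# mask[i] == 1 marks a patch whose raw pixels the model must reconstruct;
# the reconstruction loss is computed on masked patches only.
mask = [1 if i in masked else 0 for i in range(num_patches)]
```

The rest of the method is a lightweight linear prediction head plus an L1 loss on the masked pixels, so porting is mostly a matter of wiring the mask into the patch embedding.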
-
Thank you for your excellent work. I'm currently training my own CLIP model and have a question. If I use LAION-2B, COYO-700M, and Datacomp datasets simultaneously for training, will it yield better r…
-
Dear GigaPath team,
Thank you for your excellent work!
Could you share how long the pre-training of the tile encoder took? In the paper you mention the time for pretraining the slide-level model b…
-
After fine-tuning the Paraformer long-audio model, the size of the saved .pt file grew from the base model's roughly 800 MB to nearly 2.6 GB.
Running inference on the same wav file then raises an error. The error message is as follows:
Traceback (most recent call last):
File "/wind/aispace/train/source/src/FunASR/examples/industrial_data_pret…
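One plausible cause (an assumption, since the traceback is truncated) is that the fine-tuning checkpoint bundles optimizer and training state alongside the model weights, which can roughly triple the file size and confuse a loader that expects bare weights. A minimal sketch of extracting only the weights; the dict layout and key names are hypothetical, not FunASR's actual checkpoint format:

```python
# Hedged sketch: a training checkpoint often stores optimizer state
# (e.g. Adam's running moments) next to the model weights, inflating
# the file to ~3x the bare-weights size. Layout below is hypothetical.
ckpt = {
    "model": {"encoder.weight": [0.1, 0.2]},
    "optimizer": {"exp_avg": [0.0, 0.0], "exp_avg_sq": [0.0, 0.0]},
    "epoch": 10,
}

# Keep only the model weights for inference; re-saving this sub-dict
# should bring the file back to roughly the base model's size.
weights_only = ckpt["model"]
```

Comparing the top-level keys of the base model's .pt file against the fine-tuned one would confirm or rule this out.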