-
@yuanheTian friendly ping
How much GPU memory (in GB) is needed for BERT?
And for XLNet?
I'm asking about training, not inference.
-
python DataLoadAndTrain.py --LOSS_alpha=1 --lr=1e-5 --l2=1e-5 --early_stop=5 --PreTrain_Model="Gpt2" --batch_size=16
2023-03-01 10:52:17.583697: W tensorflow/stream_executor/platform/default/dso_load…
-
I was trying to create a DataBunch on Google Colab using the Sentiment140 Twitter dataset. But no matter what batch size I use, the GPU always crashes. I tried all batch sizes from 2 …
-
Recent transformer architectures are very popular in NLP: BERT, GPT-2, RoBERTa, XLNet. Did you try to fine-tune them on some NLP task? If so, what were the best Ranger hyper-parameters and learning rat…
-
chinese-xlnet-base
-
I want to use the existing xlnet_sst as the attacked model. Unfortunately, it keeps reporting errors. Can you check and try this example? Thanks!
![image](https://user-images.githubusercontent.c…
-
https://github.com/zihangdai/xlnet/blob/master/data_utils.py#L316
I wonder why this is `b_begin: b_end + 1`, not `b_begin+1: b_end + 1`.
Also, what does this mean?
https://github.com/zihangdai/xlnet…
-
Hello, when running the xlnet_hierarchical_attn model I get `TypeError: linear(): argument 'input' (position 1) must be Tensor, not str`. How can I solve this?
wwhss updated 10 months ago
-
The model generation example here shows how to convert a GPT-2 model specifically to an mlmodel. How can this be applied to other models, such as pretrained BERT and XLNet? Please help.
-
When I run the program in PyCharm, the error below occurs. How can I solve it?
ValueError: Unrecognized model in weights/icon_caption_florence. Should have a `model_type` key in its config.json, or co…