google-research / bigbird

Transformers for Longer Sequences
https://arxiv.org/abs/2007.14062
Apache License 2.0

What's the difference between bigbr_base and bigbr_base_tf2 at gs://bigbird-transformer/pretrain? #21

Open liuyang148 opened 3 years ago

liuyang148 commented 3 years ago

I found two bigbr_base pretrained weights in the Google Cloud Storage bucket: bigbr_base and bigbr_base_tf2. What is the difference between them? I checked with this script that their word embeddings are different, which means the two checkpoints differ in more than just the TF1/TF2 format.
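
For anyone who wants to repeat the comparison, here is a minimal sketch of such a check. It is not the script linked above. The helper names `load_word_embeddings` and `compare_embeddings`, the `name_hint` substring, and the checkpoint paths you pass in are all assumptions for illustration; the actual variable names inside each checkpoint may differ, so list them with `tf.train.list_variables` first.

```python
import numpy as np


def load_word_embeddings(ckpt_path, name_hint="word_embeddings"):
    """Load the word-embedding matrix from a checkpoint (hypothetical helper).

    `name_hint` is an assumed substring of the variable name; inspect
    tf.train.list_variables(ckpt_path) to confirm it for each checkpoint.
    """
    import tensorflow as tf  # imported lazily so the comparison works without TF

    # list_variables returns (name, shape) pairs for every stored variable.
    names = [n for n, _ in tf.train.list_variables(ckpt_path) if name_hint in n]
    if not names:
        raise KeyError(f"no variable containing {name_hint!r} in {ckpt_path}")
    # Heuristic: the shortest match is usually the weight itself rather than
    # an optimizer slot derived from it.
    name = min(names, key=len)
    return tf.train.load_variable(ckpt_path, name)


def compare_embeddings(a, b, atol=1e-6):
    """Report whether two embedding matrices match in shape and in value."""
    a, b = np.asarray(a), np.asarray(b)
    if a.shape != b.shape:
        return {"same_shape": False, "identical": False, "max_abs_diff": None}
    max_abs_diff = float(np.max(np.abs(a - b)))
    return {
        "same_shape": True,
        "identical": max_abs_diff <= atol,
        "max_abs_diff": max_abs_diff,
    }


# Usage (paths are placeholders, fill in the two checkpoint prefixes):
# emb_tf1 = load_word_embeddings("<path to bigbr_base checkpoint>")
# emb_tf2 = load_word_embeddings("<path to bigbr_base_tf2 checkpoint>")
# print(compare_embeddings(emb_tf1, emb_tf2))
```

If `identical` comes back `False` while `same_shape` is `True`, the two checkpoints hold different weights, not just a different serialization format.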