I want to train spanBert from scratch, for contextual span embedding. The architecture is quite large and I think only three layers will suffice for my task. Please help me with how can I reduce the size of the architecture and train it on the objective of spanbert. Thanks
I want to train spanBert from scratch, for contextual span embedding. The architecture is quite large and I think only three layers will suffice for my task. Please help me with how can I reduce the size of the architecture and train it on the objective of spanbert. Thanks