KhalilMrini / LAL-Parser

Neural Adobe-UCSD Parser, the current State of the Art in Constituency and Dependency Parsing.

How much CUDA memory does the model need when training? #13

Closed wangwang110 closed 4 years ago

wangwang110 commented 4 years ago

How much CUDA memory does the model need when training? And does it support multi-GPU training?

KhalilMrini commented 4 years ago

Hi, the best model was trained on a single 32GB GPU. I tried developing multi-GPU training, but PyTorch is not well geared towards that.
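For reference, the standard built-in route for splitting training across GPUs in PyTorch is `nn.DataParallel` (or, more robustly, `DistributedDataParallel`). This is a minimal generic sketch, not code from this repository; the `nn.Linear` stand-in and the tensor sizes are purely illustrative:

```python
# Sketch (not from LAL-Parser): single-process multi-GPU data parallelism.
import torch
import torch.nn as nn

model = nn.Linear(512, 512)      # hypothetical stand-in for the parser model
if torch.cuda.is_available():
    model = model.cuda()
model = nn.DataParallel(model)   # splits each batch across all visible GPUs;
                                 # on a CPU-only machine it falls back to the module

batch = torch.randn(8, 512)
if torch.cuda.is_available():
    batch = batch.cuda()
output = model(batch)            # outputs gathered back on the default device
print(output.shape)              # torch.Size([8, 512])
```

Note that `nn.DataParallel` replicates the full model on every GPU, so it reduces per-GPU batch memory but not the memory taken by the model's own parameters and optimizer state, which may be why it helped little here.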

Franck-Dernoncourt commented 4 years ago

The paper "Rethinking Self-Attention: Towards Interpretability in Neural Parsing" (Mrini et al., 2020) mentions:

Each English experiment is performed on a single 32GB GPU, while each Chinese experiment is performed on a single 12GB GPU.

Why does the GPU memory requirement differ between the Chinese experiment and English experiment?