Closed: meaningful96 closed this issue 5 months ago
Sorry for the late response.
If your GPU computing power is insufficient, it is normal for training to take this long, because the model fully fine-tunes RoBERTa-large. The FB15k-237 dataset is also much larger than WN18RR, so it will take longer.
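For reference, here is a minimal sketch of what fully fine-tuning RoBERTa-large implies in a standard Hugging Face/PyTorch setup. It is meant to illustrate the compute cost, not the repository's actual training code; the optimizer choice and learning rate shown are assumptions.

```python
import torch
from transformers import RobertaModel

# Illustrative sketch only (not the repo's training script): "full tuning"
# means every RoBERTa-large parameter receives gradients, which is what
# makes each training iteration expensive on a weaker GPU.
encoder = RobertaModel.from_pretrained("roberta-large")

# No layers are frozen, so the backward pass covers all 24 transformer
# layers for every batch.
trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
print(f"Trainable parameters: {trainable / 1e6:.0f}M")  # roughly 355M

# Hypothetical optimizer setup; the optimizer and learning rate actually
# used by the repository may differ.
optimizer = torch.optim.AdamW(encoder.parameters(), lr=1e-5)
```

If wall-clock time is the main concern, mixed-precision training (torch.cuda.amp) or gradient accumulation with a smaller batch can help, though whether the repository exposes such options is not confirmed here.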
Hello. I have a question about training the StAR model.
Training is currently in progress on the FB15k-237 and WN18RR datasets, each on a different GPU. However, for both datasets the process takes very long: for FB15k-237, a single training iteration took 50 hours, and for WN18RR each iteration takes about 15 hours. Is this normal?