Evaluate XLNetNER using new get_score.py

honghanhh / ner-combining-contextual-and-global-features

[ICADL] Named entity recognition architecture combining contextual and global features


Evaluate XLNetNER using new get_score.py #3

Closed jgmorenof closed 4 years ago

jgmorenof commented 4 years ago

Please evaluate the performance of XLNetNER using the corrected version of get_score.py.

There are some differences between the integrated evaluator in XLNetNER and get_score.py. The latter is preferred as it uses the official evaluator script.

honghanhh commented 4 years ago

Dear Prof. @jgmorenof, before I can evaluate XLNetNER, I still have a memory issue:

Traceback (most recent call last):
  File "train.py", line 164, in <module>
    train(model, train_iter, optimizer, criterion)
      ...
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 936, in dropout
    else _VF.dropout(input, p, training))
RuntimeError: CUDA out of memory. Tried to allocate 68.00 MiB (GPU 0; 11.17 GiB total capacity; 10.01 GiB already allocated; 58.81 MiB free; 10.74 GiB reserved in total by PyTorch)

It seems that increasing the RAM on Colab does not help when I try to save the XLNet checkpoints (last time I focused on the performance and results, so I have not saved them yet). My PC can only train and save models with a batch size of 8. May I have access to the laboratory's server now, please? Thank you very much!
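One common workaround when only a small batch fits in memory is gradient accumulation: run several micro-batches, add up their scaled gradients, and update the weights once. The sketch below is not taken from train.py; it is a pure-Python toy (a one-parameter linear model with hypothetical helper names `mse_grad` and `accumulated_grad`) that only checks the arithmetic, namely that four micro-batches of 8 give the same gradient as one batch of 32.

```python
import math
import random


def mse_grad(w, xs, ys):
    """Gradient of the mean squared error for the toy model y_hat = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Sum micro-batch gradients, each scaled by 1 / number of micro-batches."""
    if len(xs) % micro_batch_size != 0:
        raise ValueError("this sketch assumes equal-sized micro-batches")
    accum_steps = len(xs) // micro_batch_size
    grad = 0.0
    for i in range(accum_steps):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        # Scaling keeps the accumulated value equal to the full-batch mean.
        grad += mse_grad(w, xs[lo:hi], ys[lo:hi]) / accum_steps
    return grad


random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(32)]
ys = [3.0 * x + random.gauss(0.0, 0.1) for x in xs]
w = 0.5

full = mse_grad(w, xs, ys)  # one batch of 32
accum = accumulated_grad(w, xs, ys, micro_batch_size=8)  # four batches of 8
print(math.isclose(full, accum, rel_tol=1e-9))
```

In a PyTorch training loop the same idea usually amounts to dividing each micro-batch loss by the number of accumulation steps, calling `backward()` on every micro-batch, and calling `optimizer.step()` and `optimizer.zero_grad()` only once per group. Whether this fits the `train` function here is an assumption, since its body is not shown.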

nsidere commented 4 years ago

Access to servers is now granted.

honghanhh commented 4 years ago

@nsidere Dear Professors, since I use the newest version of PyTorch, it seems that it does not support the NVIDIA driver version that the server currently has:

The NVIDIA driver on your system is too old (found version 9010).
Please update your GPU driver by downloading and installing a new
version from the URL: http://www.nvidia.com/Download/index.aspx
Alternatively, go to: https://pytorch.org to install
a PyTorch version that has been compiled with your version
of the CUDA driver.

So I can think of two solutions:

1. Revert to an older version of PyTorch (there may be some bugs with Transformers that would take time to solve).
2. Upgrade CUDA.

I wonder whether it is OK to upgrade CUDA, since it may affect other users as well. Thank you very much!
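As a side note on the message above: as far as I understand, the number in "found version 9010" follows the CUDA convention of encoding a version as 1000 * major + 10 * minor, so it is the highest CUDA version the installed driver supports. A minimal sketch of the decoding, with a hypothetical helper name `decode_cuda_version`:

```python
def decode_cuda_version(code):
    """Split a CUDA version integer (1000 * major + 10 * minor) into a tuple."""
    major = code // 1000
    minor = (code % 1000) // 10
    return major, minor


def driver_supports(driver_code, required):
    """Return True if the driver's CUDA version is at least `required`."""
    return decode_cuda_version(driver_code) >= required


driver_code = 9010  # value reported in the error message above
print(decode_cuda_version(driver_code))
```

Under that assumption, the driver on the server supports CUDA up to 9.1, and any PyTorch build compiled against a newer CUDA version would be rejected in the same way.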

nsidere commented 4 years ago

Dear Hanh, unfortunately, I am afraid I cannot answer that. For technical issues, you should send an email to the mailing list (the address is in the email sent by Muzzamil, something like L3i-calcul@....). The superusers of the servers are on this list, or maybe some users have a solution.

Thank you

honghanhh commented 4 years ago

@nsidere @jgmorenof It seems that updating the NVIDIA driver is not an option (I have asked Prof. Muzzamil). Is it OK for me to revert to an older version of PyTorch, professors? I ask because in the previous meeting you mentioned using the latest one.

nsidere commented 4 years ago

Dear Hanh, do you know which version of CUDA you need? I understood that you have access to one server. Maybe the right version is installed on another server, and we could request access to it.

honghanhh commented 4 years ago

Dear Prof. @nsidere, currently I can only access the server with CUDA 9.0:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

However, according to the official PyTorch website, the newest version seems to require at least CUDA 9.2.

I wonder if it is possible for me to access another server with suitable CUDA.

Thank you very much for your support.
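For checking other servers quickly, the comparison above can be scripted. This is only a sketch: it parses the `nvcc --version` text pasted in this comment and compares the release against a minimum of 9.2, which is the requirement as I read it from the website. The names `parse_nvcc_release` and `meets_minimum` are hypothetical. Note that nvcc reports the installed toolkit, while the driver version is what the PyTorch binaries actually check, so this is a first indication rather than a guarantee.

```python
import re

# Output of `nvcc --version` on the server, copied from this comment.
NVCC_OUTPUT = """\
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
"""


def parse_nvcc_release(text):
    """Extract the (major, minor) CUDA release from `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no CUDA release found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(installed, required):
    """Compare (major, minor) tuples; tuples compare element by element."""
    return installed >= required


installed = parse_nvcc_release(NVCC_OUTPUT)
print(installed, meets_minimum(installed, (9, 2)))
```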