hulianyuyy / SEN_CSLR

Self-Emphasizing Network for Continuous Sign Language Recognition (AAAI2023 Oral)
Apache License 2.0

Train problem #7

Closed: xxxiaosong closed this issue 1 year ago

xxxiaosong commented 1 year ago

Hello! I ran into the following error during training. Could you give me some guidance? [image: screenshot of the error]

hulianyuyy commented 1 year ago

It seems that you haven't correctly linked the dataset to the required place. You can create a link as suggested in the readme.md.

xxxiaosong commented 1 year ago

Thank you for your reply! The dataset soft link seems to work, because I can preprocess the dataset.

hulianyuyy commented 1 year ago

According to the information in your screenshot, the model fails to read the sign language images and raises the error "list index out of range". To locate the issue, you can check the type of the input data in dataloader_video.py by adding 'print(type(video))' after line 107. You will most likely get a None output. Let me know if you have further problems.
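
A complementary check, independent of the repository code, is to confirm that the frame folder behind the dataset link actually yields readable files. The sketch below is only an illustration: the function name `check_frame_folder` and the glob pattern are hypothetical and are not taken from dataloader_video.py.

```python
import glob
import os


def check_frame_folder(pattern):
    """Return the sorted frame paths matching `pattern`, or raise a clear error.

    `pattern` is a hypothetical glob such as "<dataset_root>/<video_id>/*.png".
    """
    frame_paths = sorted(glob.glob(pattern))
    if not frame_paths:
        # An empty match is one way an "index out of range" error can arise later.
        raise FileNotFoundError(f"no frames matched {pattern!r}; check the dataset link")
    # A dangling symlink can still match the glob even though the file is unreadable.
    broken = [p for p in frame_paths if not os.path.isfile(p)]
    if broken:
        raise FileNotFoundError(f"{len(broken)} unreadable frame(s), e.g. {broken[0]!r}")
    return frame_paths
```

Calling it on the folder of one training video before starting a run separates path problems from model problems.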

xxxiaosong commented 1 year ago

Thank you for your reply! The problem has been resolved. I am training on an RTX 3080 with 10 GB of memory, so I set the batch size to 1 to prevent OOM. However, after the first epoch ended, the following error occurred, and I suspect it was OOM again. [image: screenshot of the error]

hulianyuyy commented 1 year ago

It seems that you don't have enough space on the disk.
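
One quick way to confirm or rule out a full disk is to query the free space on the filesystem that holds the output directory. This is a minimal sketch using only the Python standard library; the helper name `free_gib` is made up for illustration, and the path to check is an assumption (the thread only mentions work_dir as the place where checkpoints are saved).

```python
import shutil


def free_gib(path="."):
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.free / 2**30


if __name__ == "__main__":
    # Point this at the output directory in a real run; "." keeps the sketch runnable anywhere.
    print(f"free space: {free_gib('.'):.1f} GiB")
```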

xxxiaosong commented 1 year ago

Thank you for your reply! It was not due to insufficient disk space. The problem was resolved by following this issue: https://github.com/parlance/ctcdecode/issues/124

hulianyuyy commented 1 year ago

Many thanks. I hope you enjoy the work!

xxxiaosong commented 1 year ago

Hello! I am back again. I have switched to a new machine to run this project, but I have encountered some inexplicable errors. The error messages are as follows. [image] [image] Could you give me some guidance?

hulianyuyy commented 1 year ago

I don't have an exact answer, but I suspect this happens because the number of classes you set differs from that of the target dataset. The issue may be related to the number of classes.

xxxiaosong commented 1 year ago

Thank you for your reply. The problem has been resolved; it was a problem with sclite. I have another question: if my training process terminates unexpectedly, can I run the following command to continue training from the saved weights?

python main.py --load-weights work_dir/baseline_res18/dev_23.60_epoch15_model.pt

hulianyuyy commented 1 year ago

You may use "python main.py --load-checkpoints work_dir/baseline_res18/dev_23.60_epoch15_model.pt"
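
When a run stops unexpectedly, the newest checkpoint has to be located before the command above can be filled in. The sketch below picks the file with the highest epoch number and builds that command. The filename pattern is inferred from the single example quoted in this thread, and the helper names `latest_checkpoint` and `resume_command` are hypothetical, not part of the repository.

```python
import re
from pathlib import Path

# Pattern inferred from the one filename quoted in this thread
# (dev_23.60_epoch15_model.pt); other runs may name files differently.
CKPT_RE = re.compile(r"dev_(\d+(?:\.\d+)?)_epoch(\d+)_model\.pt$")


def latest_checkpoint(work_dir):
    """Return the checkpoint path with the highest epoch number, or None."""
    best = None
    for path in Path(work_dir).glob("*.pt"):
        match = CKPT_RE.search(path.name)
        if match is None:
            continue  # ignore files that do not follow the assumed pattern
        epoch = int(match.group(2))
        if best is None or epoch > best[0]:
            best = (epoch, path)
    return None if best is None else best[1]


def resume_command(work_dir):
    """Build the resume command from the reply above for the newest checkpoint."""
    ckpt = latest_checkpoint(work_dir)
    if ckpt is None:
        raise FileNotFoundError(f"no checkpoint found in {work_dir}")
    return f"python main.py --load-checkpoints {ckpt.as_posix()}"
```

For example, `resume_command("work_dir/baseline_res18")` returns a command string that can be pasted into the shell.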

xxxiaosong commented 1 year ago

Haha. Thank you. I used the wrong command.

youthxin commented 11 months ago

> Thank you for your reply! Not due to insufficient disk space. The problem has been resolved as follows: https://github.com/parlance/ctcdecode/issues/124

Hello, I have also encountered this problem, but the link is invalid. How did you resolve it?

hulianyuyy commented 11 months ago

You may refer to this issue and this.

youthxin commented 11 months ago

Thank you very much for taking the time to reply. I will try it out.