Closed mistel1225 closed 3 years ago
Hi,
- The import path is not correct, as you pointed out. I updated the code accordingly, thank you.
- It says it is an OOM error. How large is your GPU memory? B-CL has a capsule network, which needs a large amount of memory. With the settings in the example script, it takes around 13 GB. If your GPU memory is smaller, you may want to try decreasing the batch size or doing parallel training.
Thank you so much for your interest!
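As a quick check, you can see how much memory each visible GPU actually has free before launching (a minimal sketch, assuming nvidia-smi is available):

# Report total and free memory per GPU; the example settings need roughly 13 GB
nvidia-smi --query-gpu=index,name,memory.total,memory.free --format=csv

If the free memory is well below that, lowering --train_batch_size in the script is usually the simplest workaround.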
I changed CUDA_VISIBLE_DEVICES=1
to CUDA_VISIBLE_DEVICES=0
in ./commands/til_classification/asc/run_train_bert_adapter_capsule_mask_ncl.sh.
In my environment, cuda:0 is an RTX 3090 with 24 GB of VRAM and cuda:1 is a 1080 Ti with 11 GB, so it is strange that I can run it on the 1080 Ti but not on the 3090.
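To double-check which physical card each CUDA index maps to, something like the following helps (a minimal sketch, assuming nvidia-smi and PyTorch are installed):

# List the physical GPUs as the driver sees them
nvidia-smi -L
# Show which card PyTorch treats as device 0 under each CUDA_VISIBLE_DEVICES setting
CUDA_VISIBLE_DEVICES=0 python -c "import torch; print(torch.cuda.get_device_name(0))"
CUDA_VISIBLE_DEVICES=1 python -c "import torch; print(torch.cuda.get_device_name(0))"

Note that, by default, CUDA enumerates devices fastest-first while nvidia-smi lists them in PCI bus order, so index 0 may not refer to the same card in both; exporting CUDA_DEVICE_ORDER=PCI_BUS_ID makes the two orderings match.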
My script is as follows:
#!/bin/bash

if [ ! -d "OutputBert" ]; then
    mkdir OutputBert
fi

for id in 0 1 3
do
    CUDA_VISIBLE_DEVICES=0 python run.py \
        --bert_model 'bert-base-uncased' \
        --note random$id \
        --ntasks 19 \
        --task asc \
        --idrandom $id \
        --output_dir './OutputBert' \
        --scenario til_classification \
        --approach bert_adapter_capsule_mask_ncl \
        --experiment bert \
        --eval_batch_size 32 \
        --train_batch_size 16 \
        --num_train_epochs 10 \
        --apply_bert_output \
        --apply_bert_attention_output \
        --build_adapter_capsule_mask \
        --apply_one_layer_shared \
        --xusemeval_num_train_epochs 10 \
        --bingdomains_num_train_epochs 30 \
        --bingdomains_num_train_epochs_multiplier 3 \
        --semantic_cap_size 3
done

#TODO: check other number of capsules
# --apply_one_layer_shared for 1,0,3
# --apply_two_layer_shared for 2,4
I am currently training it on the 1080 Ti with a batch size of 4. What do you think about the performance difference between batch size 16 and batch size 4?
Sorry, I found that the OOM was our own problem: the device is broken, so it does not work. This issue can be closed!
Thanks so much!
Hi, I encounter an ImportError when executing the following script:
./commands/til_classification/asc/run_train_bert_adapter_capsule_mask_ncl.sh
The error message is as follows:
After I edited the import path from
from networks.classification.adapters import BertAdapterCapsuleMask
to
from networks.base.adapters import BertAdapterCapsuleMask
in my_transformers.py, the following message occurred:
It seems the second error message appeared because I changed
CUDA_VISIBLE_DEVICES=1
to
CUDA_VISIBLE_DEVICES=0
in ./commands/til_classification/asc/run_train_bert_adapter_capsule_mask_ncl.sh.
However, in my environment, cuda:0 is an RTX 3090 and cuda:1 is a 1080 Ti, so this does not make sense, and I wonder if the error comes from adapters.py. Thanks for your patience, best regards.
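A quick way to confirm the corrected import path resolves before re-running the full script (a minimal sketch, assuming it is executed from the repository root so that networks is importable):

# Import the class directly; any remaining path problem will fail here immediately
python -c "from networks.base.adapters import BertAdapterCapsuleMask; print('import OK')"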