nanguoshun / LSR

PyTorch implementation of our ACL 2020 paper "Reasoning with Latent Structure Refinement for Document-Level Relation Extraction"

Bert-LSR produces anomalous results #31

Open Zeyu-Liang opened 3 years ago

Zeyu-Liang commented 3 years ago

@nanguoshun Running the updated Bert-LSR, I found that the results it produces are anomalous. What could be the cause? Thank you for your work.

Zeyu-Liang commented 3 years ago

| epoch 0 | step 50 | ms/b 2511.34 | train loss 5.877 | NA acc: 0.94 | not NA acc: 0.00 | tot acc: 0.91
| epoch 0 | step 100 | ms/b 2568.21 | train loss 0.406 | NA acc: 0.97 | not NA acc: 0.00 | tot acc: 0.94
| epoch 0 | step 150 | ms/b 2528.33 | train loss 0.403 | NA acc: 0.98 | not NA acc: 0.00 | tot acc: 0.95
| epoch 0 | step 200 | ms/b 2541.94 | train loss 0.396 | NA acc: 0.99 | not NA acc: 0.00 | tot acc: 0.96
| epoch 0 | step 250 | ms/b 2476.36 | train loss 0.411 | NA acc: 0.99 | not NA acc: 0.00 | tot acc: 0.96
| epoch 0 | step 300 | ms/b 2550.52 | train loss 0.386 | NA acc: 0.99 | not NA acc: 0.00 | tot acc: 0.96
| epoch 1 | step 350 | ms/b 2462.27 | train loss 0.399 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 1 | step 400 | ms/b 2482.02 | train loss 0.390 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 1 | step 450 | ms/b 2594.16 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 1 | step 500 | ms/b 2545.52 | train loss 0.392 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 1 | step 550 | ms/b 2510.27 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 1 | step 600 | ms/b 2484.04 | train loss 0.392 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 650 | ms/b 2492.20 | train loss 0.374 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 700 | ms/b 2492.06 | train loss 0.373 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 750 | ms/b 2453.17 | train loss 0.396 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 800 | ms/b 2515.16 | train loss 0.376 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 850 | ms/b 2465.45 | train loss 0.408 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 900 | ms/b 2433.18 | train loss 0.398 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 950 | ms/b 2543.93 | train loss 0.386 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 1000 | ms/b 2550.19 | train loss 0.383 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 1050 | ms/b 2774.69 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 1100 | ms/b 2466.55 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 1150 | ms/b 2507.94 | train loss 0.397 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 1200 | ms/b 2414.12 | train loss 0.390 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1250 | ms/b 2502.15 | train loss 0.402 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1300 | ms/b 2483.49 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1350 | ms/b 2498.14 | train loss 0.380 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1400 | ms/b 2549.41 | train loss 0.377 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1450 | ms/b 2533.72 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1500 | ms/b 2538.75 | train loss 0.392 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1550 | ms/b 2453.19 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1600 | ms/b 2542.53 | train loss 0.388 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1650 | ms/b 2487.09 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1700 | ms/b 2537.71 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1750 | ms/b 2548.65 | train loss 0.370 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1800 | ms/b 2401.13 | train loss 0.401 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 1850 | ms/b 2538.25 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 1900 | ms/b 2532.69 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 1950 | ms/b 2538.84 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 2000 | ms/b 2591.51 | train loss 0.368 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 2050 | ms/b 2540.94 | train loss 0.388 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 2100 | ms/b 2449.79 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2150 | ms/b 2439.32 | train loss 0.373 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2200 | ms/b 2446.64 | train loss 0.372 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2250 | ms/b 2411.52 | train loss 0.371 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2300 | ms/b 2541.08 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2350 | ms/b 2516.03 | train loss 0.383 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2400 | ms/b 2455.05 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2450 | ms/b 2509.16 | train loss 0.397 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2500 | ms/b 2486.06 | train loss 0.379 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2550 | ms/b 2506.90 | train loss 0.374 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2600 | ms/b 2420.65 | train loss 0.397 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2650 | ms/b 2514.13 | train loss 0.388 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2700 | ms/b 2505.45 | train loss 0.371 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 2750 | ms/b 2516.60 | train loss 0.396 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 2800 | ms/b 2482.15 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 2850 | ms/b 2633.90 | train loss 0.373 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 2900 | ms/b 2547.70 | train loss 0.392 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 2950 | ms/b 2544.95 | train loss 0.381 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 3000 | ms/b 2514.13 | train loss 0.383 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 3050 | ms/b 2586.53 | train loss 0.379 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3100 | ms/b 2583.68 | train loss 0.380 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3150 | ms/b 2599.92 | train loss 0.405 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3200 | ms/b 2557.08 | train loss 0.390 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3250 | ms/b 2612.56 | train loss 0.371 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3300 | ms/b 2769.04 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3350 | ms/b 2666.34 | train loss 0.380 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3400 | ms/b 2565.08 | train loss 0.379 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3450 | ms/b 2633.72 | train loss 0.372 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3500 | ms/b 2746.64 | train loss 0.391 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3550 | ms/b 2613.45 | train loss 0.384 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3600 | ms/b 2663.32 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3650 | ms/b 2525.16 | train loss 0.405 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3700 | ms/b 2663.08 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3750 | ms/b 2771.10 | train loss 0.383 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3800 | ms/b 2650.00 | train loss 0.389 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3850 | ms/b 2600.85 | train loss 0.386 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3900 | ms/b 2694.03 | train loss 0.368 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3950 | ms/b 2651.42 | train loss 0.380 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4000 | ms/b 2643.70 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4050 | ms/b 2628.64 | train loss 0.394 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4100 | ms/b 2606.33 | train loss 0.390 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4150 | ms/b 2521.79 | train loss 0.394 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4200 | ms/b 2752.30 | train loss 0.374 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4250 | ms/b 2819.76 | train loss 0.367 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4300 | ms/b 2591.43 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4350 | ms/b 2666.80 | train loss 0.389 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4400 | ms/b 2693.22 | train loss 0.386 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4450 | ms/b 2760.62 | train loss 0.381 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4500 | ms/b 2727.32 | train loss 0.366 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4550 | ms/b 2629.03 | train loss 0.392 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4600 | ms/b 2662.55 | train loss 0.378 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4650 | ms/b 2617.05 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4700 | ms/b 2598.81 | train loss 0.379 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4750 | ms/b 2529.91 | train loss 0.397 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4800 | ms/b 2712.70 | train loss 0.405 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4850 | ms/b 2676.14 | train loss 0.388 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 4900 | ms/b 2732.74 | train loss 0.373 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 4950 | ms/b 2750.27 | train loss 0.379 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 5000 | ms/b 2618.98 | train loss 0.399 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 5050 | ms/b 2756.20 | train loss 0.370 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 5100 | ms/b 2676.40 | train loss 0.390 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 5150 | ms/b 2546.86 | train loss 0.384 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 17 | step 5200 | ms/b 2693.64 | train loss 0.380 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 17 | step 5250 | ms/b 2665.40 | train loss 0.391 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 17 | step 5300 | ms/b 2625.90 | train loss 0.374 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 17 | step 5350 | ms/b 2591.55 | train loss 0.374 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
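
For readers skimming the log: this is the classic signature of a model that has collapsed to always predicting the NA (no relation) class. Since the vast majority of entity pairs in DocRED carry no relation, a degenerate all-NA predictor reproduces exactly these numbers. A minimal sketch with hypothetical label counts chosen to match the ~0.97 NA prior implied by the log (not the repo's evaluation code):

```python
import torch

# 1000 entity pairs, ~3% of which hold a real relation (assumed ratio).
labels = torch.zeros(1000, dtype=torch.long)  # 0 = NA
labels[:30] = 1                               # a few non-NA relations
preds = torch.zeros(1000, dtype=torch.long)   # degenerate model: always NA

na_mask = labels == 0
print((preds[na_mask] == 0).float().mean())                   # NA acc     -> 1.00
print((preds[~na_mask] == labels[~na_mask]).float().mean())   # not NA acc -> 0.00
print((preds == labels).float().mean())                       # tot acc    -> 0.97
```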

Zeyu-Liang commented 3 years ago

All hyperparameters are identical to those in the updated version, with batch size = 10.

nanguoshun commented 3 years ago

Thanks for the feedback, @Zeyu-Liang. We will look into it and get back to you as soon as possible. Thanks!

nanguoshun commented 3 years ago

@Zeyu-Liang we have fixed the issue, and you can now try to reproduce the results with BERT. Note that batch size = 20 (for BERT, a batch size of 15+ is recommended for fine-tuning) and hidden_size = 216 (it must be divisible by 12 due to the constraint on the GCN-layer hyperparameters). In total, you may need about 50 GB of GPU memory with this setting for BERT-base. Thanks a lot!
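
As a quick illustration of the divisibility constraint mentioned above (a minimal sketch, not the repo's code): splitting a hidden dimension across 12 heads/blocks only yields an integer per-head size when hidden_size % 12 == 0, which is why 216 works.

```python
import torch

hidden_size, n_heads = 216, 12             # values suggested above
assert hidden_size % n_heads == 0          # 216 / 12 = 18 per head
head_dim = hidden_size // n_heads

# A batch of token representations: (batch, seq_len, hidden_size);
# batch size 20 as recommended in the comment above.
x = torch.randn(20, 512, hidden_size)
# view() only succeeds because head_dim is an integer -- the constraint.
x_heads = x.view(20, 512, n_heads, head_dim)
print(x_heads.shape)                       # torch.Size([20, 512, 12, 18])
```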

MingYangi commented 3 years ago

May I ask whether anyone else hit this error while reproducing the results: RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, but got 1D cuda:0 Long tensor?

stvhuang commented 3 years ago

@SeaYM Try downgrading your PyTorch version to 1.6.0.
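
For anyone who would rather stay on a newer PyTorch: this error comes from torch.nn.utils.rnn.pack_padded_sequence, which since PyTorch 1.7 requires its lengths argument to live on the CPU. A minimal sketch of the alternative fix, keeping the data on the GPU and moving only the lengths:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

seqs = torch.randn(4, 10, 8, device="cuda")            # (batch, time, feat)
lengths = torch.tensor([10, 9, 7, 4], device="cuda")   # per-sequence lengths

packed = pack_padded_sequence(
    seqs,
    lengths.cpu(),            # .cpu() is the key change for PyTorch >= 1.7
    batch_first=True,
    enforce_sorted=False,
)
```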