panzheyi / ST-MetaNet

The code and data for the paper "Urban Traffic Prediction from Spatio-Temporal Data Using Deep Meta Learning"
MIT License

Running error with sudden cancellation #12

Open Suppersine opened 3 years ago

Suppersine commented 3 years ago

I am deploying your model on Colab with several adaptations. The notebook is here: https://colab.research.google.com/drive/1BZ9PFWz-61KiKB91ET93iJkm35lklCr-?usp=sharing

The model ran smoothly with MXNet 1.4.0 and 1.5.1, but the run ended in a sudden cancellation (possibly due to a RAM surge). The output was:

Successfully loading the model st-metanet [epoch: 131]
seq2seq_ (
  Parameter seq2seq_encoder_c0_gru0_i2h_weight (shape=(192, 3), dtype=<class 'numpy.float32'>)
  Parameter seq2seq_encoder_c0_gru0_h2h_weight (shape=(192, 64), dtype=<class 'numpy.float32'>)
  Parameter seq2seq_encoder_c0_gru0_i2h_bias (shape=(192,), dtype=<class 'numpy.float32'>)
  Parameter seq2seq_encoder_c0_gru0_h2h_bias (shape=(192,), dtype=<class 'numpy.float32'>)
  Parameter seq2seq_encoder_c1_dense_z_w_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_z_w_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_z_w_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_z_w_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_z_w_dense2_weight (shape=(8192, 2), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_z_w_dense2_bias (shape=(8192,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_z_b_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_z_b_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_z_b_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_z_b_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_z_b_dense2_weight (shape=(1, 2), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_z_b_dense2_bias (shape=(1,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_w_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_w_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_w_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_w_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_w_dense2_weight (shape=(8192, 2), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_w_dense2_bias (shape=(8192,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_b_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_b_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_b_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_b_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_b_dense2_weight (shape=(1, 2), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_r_b_dense2_bias (shape=(1,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_w_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_w_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_w_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_w_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_w_dense2_weight (shape=(4096, 2), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_w_dense2_bias (shape=(4096,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_b_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_b_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_b_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_b_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_b_dense2_weight (shape=(1, 2), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_i2h_b_dense2_bias (shape=(1,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_w_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_w_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_w_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_w_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_w_dense2_weight (shape=(4096, 2), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_w_dense2_bias (shape=(4096,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_b_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_b_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_b_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_b_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_b_dense2_weight (shape=(1, 2), dtype=float32)
  Parameter seq2seq_encoder_c1_dense_h2h_b_dense2_bias (shape=(1,), dtype=float32)
  Parameter seq2seq_encoder_g0_graph_weight (shape=(1, 1), dtype=<class 'numpy.float32'>)
  Parameter seq2seq_encoder_g0_graph_mlp0_dense0_weight (shape=(16, 96), dtype=float32)
  Parameter seq2seq_encoder_g0_graph_mlp0_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_encoder_g0_graph_mlp0_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_encoder_g0_graph_mlp0_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_encoder_g0_graph_mlp0_dense2_weight (shape=(8192, 2), dtype=float32)
  Parameter seq2seq_encoder_g0_graph_mlp0_dense2_bias (shape=(8192,), dtype=float32)
  Parameter seq2seq_decoder_c0_gru0_i2h_weight (shape=(192, 3), dtype=<class 'numpy.float32'>)
  Parameter seq2seq_decoder_c0_gru0_h2h_weight (shape=(192, 64), dtype=<class 'numpy.float32'>)
  Parameter seq2seq_decoder_c0_gru0_i2h_bias (shape=(192,), dtype=<class 'numpy.float32'>)
  Parameter seq2seq_decoder_c0_gru0_h2h_bias (shape=(192,), dtype=<class 'numpy.float32'>)
  Parameter seq2seq_decoder_c1_dense_z_w_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_z_w_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_z_w_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_z_w_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_z_w_dense2_weight (shape=(8192, 2), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_z_w_dense2_bias (shape=(8192,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_z_b_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_z_b_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_z_b_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_z_b_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_z_b_dense2_weight (shape=(1, 2), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_z_b_dense2_bias (shape=(1,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_w_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_w_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_w_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_w_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_w_dense2_weight (shape=(8192, 2), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_w_dense2_bias (shape=(8192,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_b_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_b_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_b_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_b_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_b_dense2_weight (shape=(1, 2), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_r_b_dense2_bias (shape=(1,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_w_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_w_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_w_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_w_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_w_dense2_weight (shape=(4096, 2), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_w_dense2_bias (shape=(4096,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_b_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_b_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_b_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_b_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_b_dense2_weight (shape=(1, 2), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_i2h_b_dense2_bias (shape=(1,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_w_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_w_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_w_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_w_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_w_dense2_weight (shape=(4096, 2), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_w_dense2_bias (shape=(4096,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_b_dense0_weight (shape=(16, 32), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_b_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_b_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_b_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_b_dense2_weight (shape=(1, 2), dtype=float32)
  Parameter seq2seq_decoder_c1_dense_h2h_b_dense2_bias (shape=(1,), dtype=float32)
  Parameter seq2seq_decoder_g0_graph_weight (shape=(1, 1), dtype=<class 'numpy.float32'>)
  Parameter seq2seq_decoder_g0_graph_mlp0_dense0_weight (shape=(16, 96), dtype=float32)
  Parameter seq2seq_decoder_g0_graph_mlp0_dense0_bias (shape=(16,), dtype=float32)
  Parameter seq2seq_decoder_g0_graph_mlp0_dense1_weight (shape=(2, 16), dtype=float32)
  Parameter seq2seq_decoder_g0_graph_mlp0_dense1_bias (shape=(2,), dtype=float32)
  Parameter seq2seq_decoder_g0_graph_mlp0_dense2_weight (shape=(8192, 2), dtype=float32)
  Parameter seq2seq_decoder_g0_graph_mlp0_dense2_bias (shape=(8192,), dtype=float32)
  Parameter decoder0_proj_weight (shape=(2, 96), dtype=float32)
  Parameter decoder0_proj_bias (shape=(2,), dtype=float32)
  Parameter geo_encoder_dense0_weight (shape=(32, 989), dtype=float32)
  Parameter geo_encoder_dense0_bias (shape=(32,), dtype=float32)
  Parameter geo_encoder_dense1_weight (shape=(32, 32), dtype=float32)
  Parameter geo_encoder_dense1_bias (shape=(32,), dtype=float32)
)
NUMBER OF PARAMS: 268224
INFO:root:Processing 1000 timestamps
INFO:root:Processing 2000 timestamps
tcmalloc: large alloc 9179439104 bytes == 0x55fd96a46000 @ 0x7f54362f31e7 0x7f5433000cf1 0x7f5433065768 0x7f5433065883 0x7f5433105b5e 0x7f54331063c4 0x7f5433106512 0x55fcee9530a4 0x55fcee952da0 0x55fcee9c7868 0x55fcee9c2235 0x55fcee95473a 0x55fcee9c6f40 0x55fcee9c1c35 0x55fcee95473a 0x55fcee9c2b0e 0x55fcee95465a 0x55fcee9c2b0e 0x55fcee95465a 0x55fcee9c2b0e 0x55fcee9c1c35 0x55fcee9c1933 0x55fceea8b402 0x55fceea8b77d 0x55fceea8b626 0x55fceea63313 0x55fceea62fbc 0x7f54350ddbf7 0x55fceea62e9a
tcmalloc: large alloc 9179439104 bytes == 0x55ffba478000 @ 0x7f54362d5b6b 0x7f54362f5379 0x7f54064bde75 0x7f54064bdf0d 0x7f54064c622e 0x7f5405cdaa26 0x7f5405cdb298 0x7f5433825dae 0x7f543382571f 0x7f5433a395ac 0x7f5433a389e3 0x55fcee9537b2 0x55fcee9c76f2 0x55fcee9c1c35 0x55fcee95473a 0x55fcee9c2b0e 0x55fcee9c1c35 0x55fcee95473a 0x55fcee9c2b0e 0x55fcee9c1c35 0x55fcee95473a 0x55fcee9c393b 0x55fcee9c1c35 0x55fcee95473a 0x55fcee9c6f40 0x55fcee9c1c35 0x55fcee95473a 0x55fcee9c2b0e 0x55fcee95465a 0x55fcee9c2b0e 0x55fcee95465a
^C
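
For a rough reading of the numbers in this log, the sketch below converts the two `tcmalloc: large alloc` sizes to GiB and compares them with the memory the model weights themselves would need. The `LOG` string is a shortened excerpt of the output above (stack addresses omitted), the helper names are made up for illustration, and 4 bytes per parameter is an assumption based on the `float32` dtypes printed in the log. Whether both allocations are held at the same time cannot be told from the log alone.

```python
import re

# Shortened excerpt of the log above; the stack addresses are omitted.
LOG = (
    "NUMBER OF PARAMS: 268224 "
    "tcmalloc: large alloc 9179439104 bytes == 0x55fd96a46000 @ ... "
    "tcmalloc: large alloc 9179439104 bytes == 0x55ffba478000 @ ..."
)

GIB = 1024 ** 3
MIB = 1024 ** 2


def large_allocs(log):
    """Return the size in bytes of every 'tcmalloc: large alloc' entry."""
    return [int(n) for n in re.findall(r"large alloc (\d+) bytes", log)]


def param_bytes(log, bytes_per_param=4):
    """Memory for the weights, assuming float32 (4 bytes) per parameter."""
    match = re.search(r"NUMBER OF PARAMS: (\d+)", log)
    return int(match.group(1)) * bytes_per_param


allocs = large_allocs(LOG)
for size in allocs:
    print(f"large alloc: {size / GIB:.2f} GiB")
print(f"model weights: {param_bytes(LOG) / MIB:.2f} MiB")
```

Each reported allocation is about 8.55 GiB, while the 268224 parameters take only about 1 MiB, so the memory pressure comes from the data being processed rather than from the model itself.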

panzheyi commented 3 years ago

Hi, it seems the program crashed because of an out-of-memory (OOM) problem while it was loading the dataset. I guess increasing the memory could solve it.

Suppersine commented 3 years ago

> Hi, it seems the program crashed because of an out-of-memory (OOM) problem while it was loading the dataset. I guess increasing the memory could solve it.

How much RAM do I need?

panzheyi commented 3 years ago

32 GB is probably enough. When I ran this program, I used servers with 64 GB or 128 GB of RAM.
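
Before starting a run, it may help to check how much physical memory the machine actually has. The snippet below is a minimal sketch that assumes a Linux environment such as Colab, where `os.sysconf` exposes `SC_PAGE_SIZE` and `SC_PHYS_PAGES`. The function names and the `RECOMMENDED_GIB` constant are illustrative, and the 32 figure from the reply above is treated as GiB here.

```python
import os

# Figure suggested in the reply above; interpreting it as GiB is an assumption.
RECOMMENDED_GIB = 32


def total_ram_gib():
    """Total physical memory in GiB, read via sysconf (Linux/Unix only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / 1024 ** 3


def enough_ram(required_gib, available_gib):
    """True if the machine has at least the required amount of memory."""
    return available_gib >= required_gib


ram = total_ram_gib()
verdict = "meets" if enough_ram(RECOMMENDED_GIB, ram) else "is below"
print(f"Total RAM: {ram:.1f} GiB, which {verdict} the suggested {RECOMMENDED_GIB} GiB")
```

This reports total installed memory, not what is currently free, so it is only a first sanity check before loading the dataset.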