tech-srl / code2seq

Code for the model presented in the paper: "code2seq: Generating Sequences from Structured Representations of Code"
http://code2seq.org
MIT License
555 stars 164 forks

Problem with the Beam_width config #95

Closed DRMALEK closed 3 years ago

DRMALEK commented 3 years ago

Hi,

After I changed the beam width to 3 (because accuracy was low on the Funcom dataset), I started to get the following error:

    File "code2seq.py", line 29, in <module>
      model.train()
    File "/data/malekbaba_data/codes/code2seq_modified/modelrunner.py", line 167, in train
      results, precision, recall, f1, rouge = self.evaluate(release_test=False)
    File "/data/malekbaba_data/codes/code2seq_modified/modelrunner.py", line 238, in evaluate
      outputs, final_states = self.model.run_decoder(batched_contexts, input_tensors, is_training=False)
    File "/data/malekbaba_data/codes/code2seq_modified/model.py", line 134, in run_decoder
      is_training=is_training)
    File "/data/malekbaba_data/codes/code2seq_modified/model.py", line 206, in decode_outputs
      contexts_sum = tf.reduce_sum(batched_contexts * tf.expand_dims(valid_mask, -1),
    File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/ops/math_ops.py", line 902, in binary_op_wrapp$
      return func(x, y, name=name)
    File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/ops/math_ops.py", line 1201, in _mul_dispatch
      return gen_math_ops.mul(x, y, name=name)
    File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/ops/gen_math_ops.py", line 6122, in mul
      _ops.raise_from_not_ok_status(e, name)
    File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 6606, in raise_from_no$
      six.raise_from(core._status_to_exception(e.code, message), None)
    File "<string>", line 3, in raise_from
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [384,100,512] vs. [128,100,1] [Op:Mul] name: mul/
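The shapes in the error are suggestive: 384 = 128 × 3, i.e. BATCH_SIZE × BEAM_WIDTH, so `batched_contexts` appears to have been tiled for beam search while `valid_mask` was not. A minimal NumPy sketch of the mismatch (the names mirror the traceback, but this is illustrative code, not the repository's):

```python
import numpy as np

batch, contexts, dim, beam = 128, 100, 512, 3

# batched_contexts seems to have been tiled by the beam width: 128 * 3 = 384
contexts_tiled = np.zeros((batch * beam, contexts, dim))  # (384, 100, 512)
valid_mask = np.zeros((batch, contexts, 1))               # (128, 100, 1) - not tiled

# contexts_tiled * valid_mask would raise the same "Incompatible shapes" error.
# Repeating the mask along the batch axis by the beam width makes the
# elementwise multiply broadcast-compatible again:
mask_tiled = np.repeat(valid_mask, beam, axis=0)          # (384, 100, 1)
result = contexts_tiled * mask_tiled                      # (384, 100, 512)
print(result.shape)
```

If this diagnosis is right, the fix would be to tile every tensor the decoder consumes (including the mask) by the beam width, not just the contexts.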


My config file is as follows:

    config.NUM_EPOCHS = 3000 
    config.SAVE_EVERY_EPOCHS = 1 
    config.PATIENCE = 10
    config.BATCH_SIZE = 128
    config.READER_NUM_PARALLEL_BATCHES = 16
    config.SHUFFLE_BUFFER_SIZE = 10000
    config.CSV_BUFFER_SIZE = 100 * 1024 * 1024  # 100 MB
    config.MAX_CONTEXTS = 100
    config.SUBTOKENS_VOCAB_MAX_SIZE = 190000
    config.TARGET_VOCAB_MAX_SIZE = 27000
    config.EMBEDDINGS_SIZE = 128 * 4
    config.RNN_SIZE = 128 * 4  # Two LSTMs to embed paths, each of size 256
    config.DECODER_SIZE = 512
    config.NUM_DECODER_LAYERS = 2
    config.MAX_PATH_LENGTH = 8 + 1
    config.MAX_NAME_PARTS = 5           # Maximum subtokens in a token
    config.MAX_TARGET_PARTS = 30        # Maximum number of words in a comment (30 in the Deepcom paper)
    config.EMBEDDINGS_DROPOUT_KEEP_PROB = 0.75
    config.RNN_DROPOUT_KEEP_PROB = 0.5
    config.BIRNN = True
    config.RANDOM_CONTEXTS = True
    config.BEAM_WIDTH = 3
    config.USE_MOMENTUM = False 
urialon commented 3 years ago

Hi @DRMALEK , Does this config work when you use our datasets and models, without modifications?

Uri

DRMALEK commented 3 years ago

Actually, I trained your model on the Deepcom dataset and everything went fine (but without the beam_width change), so you are right: I should first retrain it there with the beam_width change, and then try to train it on the Funcom dataset. Sorry for my beginner mistakes!

urialon commented 3 years ago

Hmmm, that shouldn't matter - beam search is only applied at test time. So it doesn't matter if you didn't train it with beam search. You can try our trained model and our dataset and see whether beam search works.
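This train/test distinction can be sketched in plain Python (a hypothetical helper, not code from either repository): BEAM_WIDTH should only change tensor shapes in the evaluation-time decoder, where the batch dimension gets tiled by the beam width.

```python
def decoder_batch_size(batch_size, beam_width, is_training):
    """Effective batch dimension seen inside the decoder.

    Hypothetical illustration of the usual seq2seq pattern: training uses
    teacher forcing (no beam), so beam_width has no effect; at evaluation
    time the decoder inputs are tiled once per beam hypothesis.
    """
    if is_training or beam_width <= 1:
        return batch_size               # training: plain batch
    return batch_size * beam_width      # evaluation: tiled per beam

print(decoder_batch_size(128, 3, is_training=False))  # 384, as in the error
print(decoder_batch_size(128, 3, is_training=True))   # 128
```

Under this pattern, a model trained without beam search can still be decoded with any beam width, which is why retraining should not be necessary.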


DRMALEK commented 3 years ago

As far as I understand, the beam width config is also used in the training phase, though not in the original implementation (https://github.com/tech-srl/code2seq/) but in the https://github.com/Kolkir/code2seq implementation. Unfortunately, the issues section is not enabled there :disappointed:

urialon commented 3 years ago

Which implementation are you using?


DRMALEK commented 3 years ago

This implementation: https://github.com/Kolkir/code2seq

urialon commented 3 years ago

Oh, this is a good implementation that converted my work to TF2, but unfortunately I cannot provide support for errors in their repository.


DRMALEK commented 3 years ago

No problem. I will try to contact the repo author and ask whether they can enable the issues section in their repo; in the meantime, I will use your implementation.

Thanks for your understanding.