tech-srl / code2vec

TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"
https://code2vec.org
MIT License
1.11k stars, 286 forks

Terminal Closes and Model Folder is empty #61

Closed: shubhamrsangle closed this issue 4 years ago

shubhamrsangle commented 4 years ago

Hello,

I tried training the model with the command source train.sh, but the terminal closes as soon as I run it. The models directory is created, along with a subdirectory named after the type set in train.sh, but that subdirectory is empty.

Can you tell me why the terminal closes and why that folder is empty? This is my train.sh:

    #!/usr/bin/env bash
    ###########################################################
    # Change the following values to train a new model.
    # type: the name of the new model, only affects the saved file name.
    # dataset: the name of the dataset, as was preprocessed using preprocess.sh
    # test_data: by default, points to the validation set, since this is the set that
    #   will be evaluated after each training iteration. If you wish to test
    #   on the final (held-out) test set, change 'val' to 'test'.
    type=cpp
    dataset_name=AMNOI
    data_dir=data/${dataset_name}
    data=${data_dir}/${dataset_name}
    test_data=${data_dir}/${dataset_name}.val.c2v
    model_dir=models/${type}

    mkdir -p models/${model_dir}
    set -e
    python3 -u code2vec.py --data ${data} --test ${test_data} --save ${model_dir}/saved_model

[image: data]

The image shows that my data folder is not empty, along with its path.

urialon commented 4 years ago

Hi, can you redirect stdout and stderr to files so we can see what is written before the terminal closes?
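One way to capture both streams without depending on the terminal staying open is to run the script in a child process and write its output to files. The following is only a sketch: the helper name `run_and_capture` and the log file names are hypothetical, and it assumes the script is started with `bash train.sh` rather than sourced.

```python
import subprocess


def run_and_capture(cmd, out_path, err_path):
    """Run cmd in a child process, writing stdout and stderr to separate files.

    Returns the exit code of the child process.
    """
    with open(out_path, "w") as out, open(err_path, "w") as err:
        completed = subprocess.run(cmd, stdout=out, stderr=err)
    return completed.returncode


# Example call (hypothetical log file names):
# exit_code = run_and_capture(["bash", "train.sh"], "train.out.log", "train.err.log")
```

Running the script in a child process also matters because train.sh contains `set -e`. When a script is sourced, `set -e` applies to the interactive shell itself, so the first failing command makes that shell exit, which would be consistent with the terminal closing.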

shubhamrsangle commented 4 years ago

I did as you said and got this:

    Traceback (most recent call last):
      File "code2vec.py", line 1, in <module>
        from vocabularies import VocabType
      File "/home/shubham/Desktop/Windows/BTP/code2vec-master/vocabularies.py", line 45
        self.word_to_index: Dict[str, int] = {}
                          ^
    SyntaxError: invalid syntax

I also tried running vocabularies.py on its own, and I got the same error.

shubhamrsangle commented 4 years ago

data.zip

I am attaching the data folder here. You can extract it and try running it; it won't take long, as the data is very small (1-2 code files).

urialon commented 4 years ago

I think that your python version is too old. Can you run python --version?
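The line flagged in the traceback, `self.word_to_index: Dict[str, int] = {}`, uses a variable annotation, a syntax that was added in Python 3.6 (PEP 526); older interpreters reject it with `SyntaxError: invalid syntax`. A minimal sketch of a version check follows; the function name `supports_variable_annotations` is made up for illustration and is not part of code2vec.

```python
import sys

# Variable annotations (PEP 526) are available from Python 3.6 onwards.
MIN_VERSION = (3, 6)


def supports_variable_annotations(version_info=sys.version_info):
    """Return True if an interpreter with this version can parse `x: int = 0`."""
    return tuple(version_info[:2]) >= MIN_VERSION


# Example: check the interpreter that is running this snippet.
# print(sys.version.split()[0], supports_variable_annotations())
```

Note that `python` and `python3` can point to different interpreters, so it is worth checking the one that train.sh actually invokes (`python3`).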

shubhamrsangle commented 4 years ago

OK, will try.

On Mon 13 Jan, 2020, 8:01 PM Uri Alon, notifications@github.com wrote:

I think that your python version is too old. Can you run python --version?


shubhamrsangle commented 4 years ago

Issue solved by changing tensorflow version to 2.0.0 from 2.1.0

urialon commented 4 years ago

Great, I'm glad to hear!

shubhamrsangle commented 4 years ago

Thanks for your help. It would be great if the Readme mentioned that the Python version must be 3.6 or later.

It would also be great if you could help me with the following.

The Readme mentions this: If used with the --test flag, a file named .vectors will be saved in the same directory as . Each row in the saved file is the code vector of the code snipped in the corresponding row in .
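For reference, a file with one code vector per row could be read back roughly as follows. This is a sketch under an assumption that the quoted Readme text does not state: that the numbers in each row are separated by whitespace. The function name `load_code_vectors` is hypothetical.

```python
def load_code_vectors(path):
    """Read one vector per row from a text file.

    Assumes each row holds whitespace-separated floats (an assumption,
    not something the quoted Readme text confirms). Row i of the result
    corresponds to row i of the file.
    """
    vectors = []
    with open(path) as f:
        for line in f:
            vectors.append([float(value) for value in line.split()])
    return vectors
```

Because rows are kept in file order, the vector at index i can be matched to row i of the file that was passed to --test.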

Now, I trained the model for cpp, so can be any cpp file, right? Or does it also take a java file by default?

When I run this command:

python3 code2vec.py --load models/cpp/saved_model.release --export_code_vectors --test AMNOI.cpp

I get this error:

    Traceback (most recent call last):
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
        return fn(*args)
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
        target_list, run_metadata)
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Expect 201 fields but have 2 in record
        [[{{node IteratorGetNext}}]]

During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "code2vec.py", line 31, in <module>
        eval_results = model.evaluate()
      File "/home/shubham/Desktop/Windows/BTP/code2vec-master/tensorflow_model.py", line 159, in evaluate
        self.eval_original_names_op, self.eval_code_vectors],
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
        run_metadata_ptr)
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
        feed_dict_tensor, options, run_metadata)
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
        run_metadata)
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Expect 201 fields but have 2 in record
        [[node IteratorGetNext (defined at /home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1751) ]]

    Original stack trace for 'IteratorGetNext':
      File "code2vec.py", line 31, in <module>
        eval_results = model.evaluate()
      File "/home/shubham/Desktop/Windows/BTP/code2vec-master/tensorflow_model.py", line 122, in evaluate
        input_tensors = input_iterator.get_next()
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/data/ops/iterator_ops.py", line 426, in get_next
        name=name)
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_dataset_ops.py", line 2500, in iterator_get_next
        output_shapes=output_shapes, name=name)
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 793, in _apply_op_helper
        op_def=op_def)
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
        return func(*args, **kwargs)
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3360, in create_op
        attrs, op_def, compute_device)
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3429, in _create_op_internal
        op_def=op_def)
      File "/home/shubham/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1751, in __init__
        self._traceback = tf_stack.extract_stack()

And when I run this command:

python3 code2vec.py --load models/cpp/saved_model.release --export_code_vectors --predict

it opens input.java by default, but I trained the model for CPP.
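The message "Expect 201 fields but have 2 in record" in the first error suggests that the file passed to --test is parsed as delimited records with 201 fields each, which a raw source file such as AMNOI.cpp would not satisfy; a preprocessed file like the `.val.c2v` file referenced in train.sh seems the more likely expected input. The sketch below lists the rows of a file that do not have the expected field count. The function name `find_bad_rows` is hypothetical, the count 201 is taken from the error message, and the space delimiter is an assumption about the record format.

```python
def find_bad_rows(path, expected_fields=201, delimiter=" "):
    """Return (line_number, field_count) for rows with an unexpected field count.

    expected_fields=201 comes from the error message in this thread;
    the single-space delimiter is an assumption about the file format.
    """
    bad_rows = []
    with open(path) as f:
        for line_number, line in enumerate(f, start=1):
            field_count = len(line.rstrip("\r\n").split(delimiter))
            if field_count != expected_fields:
                bad_rows.append((line_number, field_count))
    return bad_rows


# Example call (file name taken from the command in this thread):
# print(find_bad_rows("AMNOI.cpp")[:5])
```

An empty result means every row has the expected number of fields; any entries point to the rows that would trigger the same kind of error.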