tech-srl / code2vec

TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"
https://code2vec.org
MIT License

ZeroDivisionError while preprocessing the java-small db #106

Closed HardikPrabhu closed 3 years ago

HardikPrabhu commented 3 years ago

I am running into the following error while preprocessing on the java-small dataset.

Error:

Creating histograms from the training data
2021-01-07 03:43:55.282291: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-01-07 03:43:55.283457: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
File: my_dataset.test.raw.txt
Traceback (most recent call last):
  File "preprocess.py", line 133, in
    num_examples = process_file(file_path=data_file_path, data_file_role=data_role, dataset_name=args.output_name,
  File "preprocess.py", line 69, in process_file
    print('Average total contexts: ' + str(float(sum_total) / total))
ZeroDivisionError: float division by zero
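The last frame of the traceback divides `sum_total` by `total`, so the exception means `total` was zero when that line ran, which is consistent with no examples having been read from the input file. A minimal sketch of a guarded average is shown below. The helper name `average_contexts` is hypothetical and is not part of preprocess.py; it only illustrates how the division could be protected.

```python
def average_contexts(sum_total, total):
    """Return the mean number of contexts per example.

    Hypothetical helper (not from preprocess.py): returns None instead of
    raising ZeroDivisionError when no examples were counted.
    """
    if total == 0:
        # No examples were read, so an average is undefined.
        return None
    return float(sum_total) / total


average = average_contexts(0, 0)
if average is None:
    print('No examples found; the input file may be empty.')
else:
    print('Average total contexts: ' + str(average))
```

A guard like this only changes how the problem is reported. An input file that yields zero examples still has to be regenerated.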

urialon commented 3 years ago

Hi @HardikPrabhu, thank you for your interest in code2vec!

It looks like you have a problem with CUDA and TensorFlow. Before preprocessing, please make sure that you can run the following command without errors (warnings are OK):

python3 -c 'import tensorflow as tf; print(tf.__version__)'

gOATiful commented 3 years ago

Hey hey, I had the same issue when creating a custom dataset. In my case, the error is caused by empty *.raw.txt files created by JavaExtractor/extract.py. I tracked it down to the wrong TMP_DIRs being used.

Some cleanup of the code is needed, but afterwards I will create an MR for it.
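One way to confirm this cause before running preprocess.py is to look for raw files that contain no data. The sketch below assumes the extracted files sit in a single directory and end in `.raw.txt`. The function name `find_empty_raw_files` is made up for illustration and does not exist in the code2vec repository.

```python
import os


def find_empty_raw_files(directory, suffix=".raw.txt"):
    """Return sorted paths of files ending in `suffix` with no non-blank lines.

    Hypothetical helper, not part of code2vec.
    """
    empty = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(suffix):
            continue
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        # A file counts as empty if every line is blank or whitespace only.
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            if not any(line.strip() for line in handle):
                empty.append(path)
    return empty
```

Calling `find_empty_raw_files(".")` in the directory that holds the extractor output returns the files that would produce zero examples. If the list is not empty, the extraction step is the place to look, not TensorFlow or CUDA.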

Greez

urialon commented 3 years ago

Closing due to inactivity, feel free to re-open.

spr593 commented 3 years ago

I had the same empty preprocessed file problem while running code2vec on a Mac (Big Sur, M1 chip) with TensorFlow 2.4 in a conda environment, and the pull request proposed by Greez-Torge fixed it. Thank you SO much!