Open 50417 opened 6 years ago
Hi Sohil, sure! I was actually working on a doc file to do exactly that. Unfortunately though I am taking the next three months off my PhD and won't be working on it until I'm back. In the meantime you're welcome to poke around in the code to do it yourself. The process is really quite straightforward:
$ bazel run //deeplearning/clgen -- --config=/path/to/the/config/file
Let me know how you get on!
Cheers, Chris
Thank you for the quick reply. I will try to experiment with the code tomorrow. I'll let you know if I have any questions or concerns.
No worries :) I'll actually keep this issue open as a reminder to myself, and in case anyone else wants something similar.
Cheers, Chris
Hello Chris, I have been trying to run CLgen on macOS. After debugging for some time, I am still unable to get it to train on the test corpus. I get the following error.
`clgen.py 176 ERROR invalid literal for int() with base 10: '' (ValueError)= stacktrace:
1 /private/var/tmp/_bazel_sohilshrestha/f8ee67cead6a3e5516303f5e0dd3d4e7/sandbox/darwin-sandbox/20/execroot/phd/bazel-out/darwin-py3-opt/bin/deeplearning/clgen/clgen_test.runfiles/phd/deeplearning/clgen/corpuses/corpuses.py:112 init()
2 /private/var/tmp/_bazel_sohilshrestha/f8ee67cead6a3e5516303f5e0dd3d4e7/sandbox/darwin-sandbox/20/execroot/phd/bazel-out/darwin-py3-opt/bin/deeplearning/clgen/clgen_test.runfiles/phd/deeplearning/clgen/models/models.py:66 init()
3 /private/var/tmp/_bazel_sohilshrestha/f8ee67cead6a3e5516303f5e0dd3d4e7/sandbox/darwin-sandbox/20/execroot/phd/bazel-out/darwin-py3-opt/bin/deeplearning/clgen/clgen_test.runfiles/phd/deeplearning/clgen/clgen.py:100 init()
4 /private/var/tmp/_bazel_sohilshrestha/f8ee67cead6a3e5516303f5e0dd3d4e7/sandbox/darwin-sandbox/20/execroot/phd/bazel-out/darwin-py3-opt/bin/deeplearning/clgen/clgen_test.runfiles/phd/deeplearning/clgen/clgen.py:244 DoFlagsAction()
5 /private/var/tmp/_bazel_sohilshrestha/f8ee67cead6a3e5516303f5e0dd3d4e7/sandbox/darwin-sandbox/20/execroot/phd/bazel-out/darwin-py3-opt/bin/deeplearning/clgen/clgen_test.runfiles/phd/deeplearning/clgen/clgen.py:205 RunContext()`
I ran the recommended tests for CLgen; 10 out of 21 passed. To run it on macOS, I created a virtualenv and ran the code there. The Python version used was 3.6.5. I also hit an issue with Bazel and came across https://github.com/tensorflow/tensorflow/issues/10436. The error encountered was:
ERROR: /private/var/tmp/_bazel_sohilshrestha/f8ee67cead6a3e5516303f5e0dd3d4e7/external/base/image/BUILD:6:1: Couldn't build file external/base/image/002.tar.gz.nogz.sha256: SHA256 external/base/image/002.tar.gz.nogz.sha256 failed (Exit 1): sha256 failed: error executing command (cd /private/var/tmp/_bazel_sohilshrestha/f8ee67cead6a3e5516303f5e0dd3d4e7/execroot/phd && \ exec env - \ bazel-out/host/bin/external/bazel_tools/tools/build_defs/hash/sha256 bazel-out/darwin-opt/bin/external/base/image/002.tar.gz.nogz bazel-out/darwin-opt/bin/external/base/image/002.tar.gz.nogz.sha256) Use --sandbox_debug to see verbose messages from the sandbox Traceback (most recent call last): File "bazel-out/host/bin/external/bazel_tools/tools/build_defs/hash/sha256", line 203, in <module> Main() File "bazel-out/host/bin/external/bazel_tools/tools/build_defs/hash/sha256", line 176, in Main raise AssertionError('Could not find python binary: ' + PYTHON_BINARY) AssertionError: Could not find python binary: python3.6
There are a few other errors as well: `path = PosixPath('/var/folders/68/_gxs799d0930bmq_170px4fw0000gn/T/clgen_abc_corpus_ghp3hxfo')
def GetDirectoryMTime(path: pathlib.Path) -> int:
  """Get the timestamp of the most recently modified file/dir in directory.
  Recursively checks subdirectory contents. This requires that the directory
  exists and is not empty.
  Params:
    abspath: The absolute path to the directory.
  Returns:
    The seconds since epoch of the last modification.
  """
  # Pure python implementation.
  # return int(max(
  #   max(os.path.getmtime(os.path.join(root, file)) for file in files) for
  #   root, _, files in os.walk(path)))
  # Faster implementation using UNIX tools. Requires GNU xargs, which supports
  # the '-d' argument, which is needed to support file names with spaces. On
  # macOS, this means having the homebrew findutils package installed, and
  # the following directory in your PATH:
  #   /usr/local/opt/findutils/libexec/gnubin
  output = subprocess.check_output(
      f"find '{path}' -type f | xargs -d'\n' stat -c '%Y:%n' | sort -t: -n | "
      "tail -1 | cut -d: -f1", universal_newlines=True, shell=True)
  return int(output)
E ValueError: invalid literal for int() with base 10: ''`
Hi there, sorry I'm writing this on my iPad so can't test the fix, but I think I see what the problem is. If you find the file which contains the function `def GetDirectoryMTime(`, you'll see in the comment "Pure python implementation", and then `return int(max(...`. If you uncomment that return statement, it should fix the error.
The problem is that I've hardcoded a reference to the GNU `xargs` command, and macOS ships with a BSD implementation. I'll fix up the docs / code to work around this. Thanks for reporting the issue!
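For anyone hitting the same error, here is a minimal sketch of what a portable, pure-Python version of that lookup could look like. It is not the code from the repository: the name `get_directory_mtime` is hypothetical, and the flattened loop and the explicit empty-directory check are assumptions added for illustration. It relies only on `os.walk` and `os.path.getmtime`, so it does not depend on GNU or BSD `xargs`.

```python
import os
import pathlib


def get_directory_mtime(path: pathlib.Path) -> int:
    """Return the newest file modification time under `path`.

    Hypothetical stand-in for GetDirectoryMTime. The result is in whole
    seconds since the epoch. Only regular files found by os.walk are
    considered, mirroring the `find -type f` in the shell pipeline.
    """
    mtimes = [
        os.path.getmtime(os.path.join(root, name))
        for root, _, files in os.walk(path)
        for name in files
    ]
    if not mtimes:
        # Assumption: fail loudly instead of passing '' to int(), which is
        # what produced the ValueError reported above.
        raise ValueError(f"no files found under {path}")
    return int(max(mtimes))
```

Calling `get_directory_mtime(pathlib.Path("/path/to/corpus"))` walks the tree once and handles file names with spaces without any shell quoting. It will likely be slower than the shell pipeline on very large corpora.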
Hi Chris and @50417,
I tried to run the code for creating a corpus for a language.
When I ran the first command,
bazel run //datasets/github/scrape_repos/scraper --clone_list $PWD/clone_list.pbtxt
it gave me an error about 'ERROR: Unrecognized option: --clone_list'
Do you know how to solve it?
Hi @JiajieZhang-Georgia, whoops, I'm sorry, I missed a `--` in the README. The command is:
bazel run //datasets/github/scrape_repos:scraper -- --clone_list $PWD/clone_list.pbtxt
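The bare `--` matters because it marks where the launcher's own options end and the target program's arguments begin. Without it, `--clone_list` is read as a launcher option, hence the "Unrecognized option" error. The sketch below only illustrates that general convention. It is not Bazel's actual argument parser, and `split_at_double_dash` is a hypothetical helper name.

```python
from typing import List, Tuple


def split_at_double_dash(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Split argv at the first bare "--".

    Returns (launcher_args, program_args). If there is no "--", everything
    is treated as launcher arguments and nothing is forwarded.
    """
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    return list(argv), []


# With the separator, the flag is forwarded to the program.
launcher, program = split_at_double_dash(
    ["run", "//datasets/github/scrape_repos:scraper", "--", "--clone_list", "clone_list.pbtxt"]
)
print(launcher)  # ['run', '//datasets/github/scrape_repos:scraper']
print(program)   # ['--clone_list', 'clone_list.pbtxt']
```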
Hi @ChrisCummins,
Are there any updates on this?
Is it possible to port CLgen to other OS environments, such as Windows or other Linux distributions?
Hey @50417, thanks for your patience! :-) I can see you've made good progress on adapting it to Simulink. If you're looking for specific help with your project I may be able to help out - I would also be interested in getting your work upstream. If you're interested in collaborating, shoot me an email at chrisc.101@gmail.com
Cheers, Chris
Hello everyone, I have created a bare-minimum CLgen using basic Python scripts (without the need for Bazel) here. Let me know if there are any issues, and whether this issue can be closed.
Interesting! What, in your experience, is the biggest issue for using this project that your fork overcomes?
The biggest issue was that I had to rebuild all of your projects in the phd project. Bazel also had a bit of a learning curve, it does not officially support Python, and the fact that it is still in beta was an issue when there were bugs.
In the paper, you mentioned that your RNN can easily be ported to any other programming language. We are trying to validate that claim. Could you provide a tutorial, if it is possible to use just the RNN model from the code base?