anapgh / pycefr

The tool failed when encountering a file without extension #3

Closed: cragkhit closed this issue 3 years ago

cragkhit commented 3 years ago

Hi. Thanks for creating the tool. Very useful! I tried running it in directory mode on the pytorch project and found that it fails when it encounters a file without an extension. It seems the tool treats such a file as a folder and tries to open it as one.

Steps to replicate

  1. Clone the pytorch project: git clone https://github.com/pytorch/pytorch.git
  2. Run python3 pycerfl.py directory ~/Downloads/pytorch
  3. The following error message is displayed:
Directory:
['CITATION', 'CODE_OF_CONDUCT.md', '.flake8', '.azure_pipelines', '.bazelversion', 'codecov.yml', 'tools', 'docker', 'CMakeLists.txt', 'pytest.ini', 'LICENSE', 'requirements.txt', 'test', 'cmake', 'docker.Makefile', '.cmakelintrc', 'Dockerfile', 'Makefile', 'caffe2', 'GLOSSARY.md', 'CODEOWNERS', 'WORKSPACE', '.clang-tidy', 'mypy-strict.ini', 'torch', 'MANIFEST.in', '.coveragerc', 'requirements-flake8.txt', 'docs', '.gdbinit', '.gitmodules', 'ios', 'NOTICE', 'README.md', 'mypy.ini', 'RELEASE.md', 'setup.py', '.dockerignore', 'third_party', '.gitignore', 'CONTRIBUTING.md', 'binaries', 'benchmarks', 'android', '.ctags.d', 'scripts', 'submodules', '.clang-format', '.github', '.gitattributes', 'ubsan.supp', 'aten', 'BUILD.bazel', 'c10', 'mypy_plugins', 'version.txt', '.git', '.vscode', 'modules', 'aten.bzl', '.circleci', '.bazelrc', '.jenkins']

Opening another directory...

Directory:
Traceback (most recent call last):
  File "pycerfl.py", line 220, in <module>
    choose_option()
  File "pycerfl.py", line 38, in choose_option
    read_Directory(option, repo)
  File "pycerfl.py", line 180, in read_Directory
    read_Directory(path2, directory[i])
  File "pycerfl.py", line 170, in read_Directory
    directory = os.listdir(path)
NotADirectoryError: [Errno 20] Not a directory: '/Users/chaiyong/Downloads/pytorch/CITATION'
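
The traceback suggests the recursion descends into every entry that is not recognised as a Python file. The actual read_Directory code is not shown here, so the following is only a minimal sketch of a traversal that checks each entry with os.path.isdir before recursing; the function name iter_python_files is hypothetical and not part of pycefr.

```python
import os


def iter_python_files(root):
    """Yield paths of .py files under root (hypothetical helper, not pycefr's API)."""
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isdir(path) and not os.path.islink(path):
            # Recurse only into real directories; symlinked directories are
            # skipped to avoid cycles.
            yield from iter_python_files(path)
        elif os.path.isfile(path) and name.endswith(".py"):
            # Extension-less files such as CITATION or LICENSE fall through
            # both branches and are simply ignored.
            yield path
```

With a guard like this, a call such as list(iter_python_files("pytorch")) would skip CITATION instead of passing it to os.listdir.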

After removing the CITATION file, the tool could continue. It then failed again at another point, with a different error message.

Directory:
['CODE_OF_CONDUCT.md', '.flake8', '.azure_pipelines', '.bazelversion', 'codecov.yml', 'tools', 'docker', 'CMakeLists.txt', '.DS_Store', 'pytest.ini', 'LICENSE', 'requirements.txt', 'test', 'cmake', 'docker.Makefile', '.cmakelintrc', 'Dockerfile', 'Makefile', 'caffe2', 'GLOSSARY.md', 'CODEOWNERS', 'WORKSPACE', '.clang-tidy', 'mypy-strict.ini', 'torch', 'MANIFEST.in', '.coveragerc', 'requirements-flake8.txt', 'docs', '.gdbinit', '.gitmodules', 'ios', 'NOTICE', 'README.md', 'mypy.ini', 'RELEASE.md', 'setup.py', '.dockerignore', 'third_party', '.gitignore', 'CONTRIBUTING.md', 'binaries', 'benchmarks', 'android', '.ctags.d', 'scripts', 'submodules', '.clang-format', '.github', '.gitattributes', 'ubsan.supp', 'aten', 'BUILD.bazel', 'c10', 'mypy_plugins', 'version.txt', '.git', '.vscode', 'modules', 'aten.bzl', '.circleci', '.bazelrc', '.jenkins']

Opening another directory...

Directory:
['clang_format_utils.py', 'clang_format_all.py', 'vscode_settings.py', 'gdb', 'extract_scripts.py', 'clang_format_ci.sh', 'test', 'clang_tidy.py', 'translate_annotations.py', 'fast_nvcc', 'mypy_wrapper.py', 'git_add_generated_dirs.sh', 'config', 'autograd', 'explicit_ci_jobs.py', 'run_shellcheck.sh', 'print_test_stats.py', 'amd_build', 'setup_helpers', '__init__.py', 'actions_local_runner.py', 'git-pre-commit', 'export_slow_tests.py', 'shared', 'codegen', 'README.md', 'render_junit.py', 'trailing_newlines.py', 'pytorch.version', 'build_variables.bzl', 'build_libtorch.py', 'jit', 'stats_utils', 'lite_interpreter', 'rules', 'pyi', 'coverage_plugins_package', 'git-clang-format', 'generate_torch_version.py', 'clang_format_hash', 'code_analyzer', 'code_coverage', 'test_history.py', 'download_mnist.py', 'flake8_hook.py', 'build_pytorch_libs.py', 'nightly.py', 'generated_dirs.txt', 'git_reset_generated_dirs.sh']
Python File: clang_format_utils.py
Python File: clang_format_all.py
Python File: vscode_settings.py

Opening another directory...

Directory:
['pytorch-gdb.py']
Python File: pytorch-gdb.py
Python File: extract_scripts.py

Opening another directory...

Directory:
['test_trailing_newlines.py', 'test_test_history.py', 'test_stats.py', 'test_extract_scripts.py', 'test_actions_local_runner.py', 'test_mypy_wrapper.py', 'test_translate_annotations.py']
Python File: test_trailing_newlines.py
Python File: test_test_history.py
Python File: test_stats.py
Python File: test_extract_scripts.py
Traceback (most recent call last):
  File "pycerfl.py", line 220, in <module>
    choose_option()
  File "pycerfl.py", line 38, in choose_option
    read_Directory(option, repo)
  File "pycerfl.py", line 180, in read_Directory
    read_Directory(path2, directory[i])
  File "pycerfl.py", line 180, in read_Directory
    read_Directory(path2, directory[i])
  File "pycerfl.py", line 176, in read_Directory
    read_File(pos, repo)
  File "pycerfl.py", line 189, in read_File
    iterate_List(tree, pos, repo)
  File "pycerfl.py", line 197, in iterate_List
    deepen(tree, attrib, pos, repo)
  File "pycerfl.py", line 203, in deepen
    object = IterTree(tree, attrib, file, repo)
  File "/Users/chaiyong/Desktop/pycefrl/ClassIterTree.py", line 32, in __init__
    self.locate_Tree()
  File "/Users/chaiyong/Desktop/pycefrl/ClassIterTree.py", line 41, in locate_Tree
    levels.levels(self)
  File "/Users/chaiyong/Desktop/pycefrl/levels.py", line 31, in levels
    level_Dict(self)
  File "/Users/chaiyong/Desktop/pycefrl/levels.py", line 124, in level_Dict
    if 'ast.List' in str(self.node.values[i].values):
AttributeError: 'Constant' object has no attribute 'values'
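
This second error looks like the code assumes every value of a dict literal is itself a dict (an ast.Dict node has a values attribute, an ast.Constant does not). What level_Dict is meant to detect is not shown here, so the sketch below is an assumption: it checks whether any value of a dict literal is a nested dict that contains a list, and uses isinstance to skip plain constants. The function name nested_dict_has_list is hypothetical.

```python
import ast


def nested_dict_has_list(node):
    """Return True if a value of dict literal `node` is a dict containing a list.

    Hypothetical helper; the intent of pycefr's level_Dict is assumed.
    """
    for value in node.values:
        # Only ast.Dict nodes have a .values attribute; constants, names,
        # calls and so on are skipped instead of raising AttributeError.
        if isinstance(value, ast.Dict):
            if any(isinstance(inner, ast.List) for inner in value.values):
                return True
    return False


# The first value is a Constant, which would trigger the error above
# if .values were accessed on it unconditionally.
tree = ast.parse("d = {'a': 1, 'b': {'c': [1, 2]}}")
outer = tree.body[0].value
print(nested_dict_has_list(outer))  # prints True
```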

cragkhit commented 3 years ago

I've also tried the repo-url mode and found that it fails with the same issue.

pycefrl % python3 pycerfl.py repo-url https://github.com/cragkhit/GitHub-Crawler.git
Analyzing repository languages...

Python: 4288

Python 50% OK

Run url...
Cloning into 'GitHub-Crawler'...
remote: Enumerating objects: 22, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 22 (delta 0), reused 1 (delta 0), pack-reused 19
Receiving objects: 100% (22/22), 18.43 KiB | 539.00 KiB/s, done.
Resolving deltas: 100% (7/7), done.
The directory is: GitHub-Crawler
This script absolute path is  /Users/chaiyong/Desktop/pycefrl/GitHub-Crawler
Directory:
['LICENSE', 'README.md', '.git', 'getDataFromGitHub.py']

Opening another directory...

Directory:
Traceback (most recent call last):
  File "pycerfl.py", line 220, in <module>
    choose_option()
  File "pycerfl.py", line 40, in choose_option
    request_url()
  File "pycerfl.py", line 60, in request_url
    check_lenguage(option, protocol, type_git, user, repo)
  File "pycerfl.py", line 97, in check_lenguage
    run_url(url)
  File "pycerfl.py", line 111, in run_url
    get_directory(url)
  File "pycerfl.py", line 151, in get_directory
    get_path(name_directory)
  File "pycerfl.py", line 162, in get_path
    read_Directory(absFilePath, name_directory)
  File "pycerfl.py", line 180, in read_Directory
    read_Directory(path2, directory[i])
  File "pycerfl.py", line 170, in read_Directory
    directory = os.listdir(path)
NotADirectoryError: [Errno 20] Not a directory: '/Users/chaiyong/Desktop/pycefrl/GitHub-Crawler/LICENSE'

anapgh commented 3 years ago

Hi! Thank you for your interest in my tool. I fixed the errors you reported yesterday, but I forgot to push the new version to GitHub. The latest version is now uploaded, so you can try again.

I hope you can now use it successfully.

cragkhit commented 3 years ago

Hi @anapgh. I've pulled the new update and found the problem has been fixed. Great. Thanks!