pre-commit / identify

File identification library for Python
MIT License

Identify named pipes and sockets #72

Open mitzkia opened 5 years ago

mitzkia commented 5 years ago

The current version (v1.4.0) cannot identify named pipes or socket files. I am not sure whether identify is meant to handle these file types, but I can show a reproduction and suggest a fix.

Reproduction for named pipes (the last command hangs):

$ mkfifo /tmp/my-custom-named-pipe
$ file /tmp/my-custom-named-pipe
/tmp/my-custom-named-pipe: fifo (named pipe)
$ identify-cli /tmp/my-custom-named-pipe

Backtrace (after interrupting the hung command with Ctrl-C):

^CTraceback (most recent call last):
  File "/usr/local/bin/identify-cli", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/identify/cli.py", line 23, in main
    tags = sorted(func(args.path))
  File "/usr/local/lib/python3.6/dist-packages/identify/identify.py", line 66, in tags_from_path
    if file_is_text(path):
  File "/usr/local/lib/python3.6/dist-packages/identify/identify.py", line 128, in file_is_text
    with open(path, 'rb') as f:
KeyboardInterrupt

Reproduction for socket files:

$ python -c "import socket as s; sock = s.socket(s.AF_UNIX); sock.bind('/tmp/my-custom-socket')"
$ file /tmp/my-custom-socket 
/tmp/my-custom-socket: socket
$ identify-cli /tmp/my-custom-socket

Backtrace:

Traceback (most recent call last):
  File "/usr/local/bin/identify-cli", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/identify/cli.py", line 23, in main
    tags = sorted(func(args.path))
  File "/usr/local/lib/python3.6/dist-packages/identify/identify.py", line 59, in tags_from_path
    shebang = parse_shebang_from_file(path)
  File "/usr/local/lib/python3.6/dist-packages/identify/identify.py", line 172, in parse_shebang_from_file
    with open(path, 'rb') as f:
OSError: [Errno 6] No such device or address: '/tmp/my-custom-socket'

My suggestion for fixing these issues is to use Python's stat module. With its help, identify could detect the other special file types as well. URL: https://docs.python.org/3/library/stat.html

If you are open to it, I would be happy to propose a PR.
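
A minimal sketch of what such a stat-based check could look like. The function name `special_file_tag` and the tag strings are hypothetical and not part of identify's API. The idea is to inspect the file mode before any `open()` call, since opening a named pipe for reading blocks until a writer appears and opening a UNIX socket fails with an OSError.

```python
import os
import stat


def special_file_tag(path):
    """Return a tag for a special file, or None for a regular file.

    Hypothetical helper: the name and the tag strings are assumptions
    made for illustration, not identify's actual API.
    """
    # lstat reads metadata only; it never opens the file, so it cannot
    # hang on a FIFO or fail on a socket the way open() does.
    mode = os.lstat(path).st_mode
    if stat.S_ISFIFO(mode):
        return 'fifo'
    if stat.S_ISSOCK(mode):
        return 'socket'
    if stat.S_ISLNK(mode):
        return 'symlink'
    if stat.S_ISDIR(mode):
        return 'directory'
    if stat.S_ISCHR(mode):
        return 'character-device'
    if stat.S_ISBLK(mode):
        return 'block-device'
    # Regular file: the caller can go on to open and inspect its contents.
    return None
```

A caller such as tags_from_path could run a check like this first and return early when it yields a tag, so the content-based checks only run on regular files. os.lstat is used rather than os.stat so that a symlink is reported as a symlink instead of being followed.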

asottile commented 4 years ago

sounds fine to me! would you like to propose a PR?