github-linguist / linguist

Language Savant. If your repository's language is being reported incorrectly, send us a pull request!
MIT License
12.28k stars, 4.25k forks

Failure to recognize bash scripts #853

Closed nima closed 9 years ago

nima commented 10 years ago

Given a file, or a directory of files, with no .sh or any other extension but with the magic and interpreter path #!/bin/bash in place, linguist fails to recognize the language as bash:

S:(linguist:master:/|✔)=0/0$ head -n1 ../../module/cpf
#!/bin/bash
S:(linguist:master:/|✔)=0/0$ bundle exec linguist ../../module/cpf
../../module/cpf: 403 lines (358 sloc)
  type:      Text
  mime type: text/plain
  language:  

tnm commented 10 years ago

Is the file mode marked as executable?

nima commented 10 years ago

No. Only one bash file in the entire repo is executable; all the other bash files are simply sourced as required.
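
For reference, the kind of check I have in mind, reading the interpreter from the #! line whether or not the executable bit is set, could be sketched roughly like this. The function name interpreter_of is made up for illustration, and this is only an assumption about what a detector could do, not how linguist actually works:

```shell
# Hypothetical helper (not part of linguist): print the interpreter named on a
# file's "#!" line. It deliberately ignores the executable bit.
# The body runs in a subshell, so "set -f" and the variables stay local.
interpreter_of() (
    set -f                      # no filename globbing while word-splitting
    first_line=
    IFS= read -r first_line < "$1" || [ -n "$first_line" ] || exit 1
    case $first_line in
        '#!'*) ;;               # has the magic, carry on
        *) exit 1 ;;            # no shebang: nothing to report
    esac
    # Drop the "#!" and split the remainder into words.
    set -- ${first_line#'#!'}
    [ "$#" -gt 0 ] || exit 1
    interp=${1##*/}             # /bin/bash -> bash
    # For "#!/usr/bin/env bash" the interpreter is the next word.
    if [ "$interp" = env ] && [ "$#" -gt 1 ]; then
        interp=${2##*/}
    fi
    printf '%s\n' "$interp"
)

# Demo on a non-executable file with no extension.
demo=$(mktemp)
printf '#!/bin/bash\necho hello\n' > "$demo"
chmod a-x "$demo"
interpreter_of "$demo"          # prints: bash
rm -f "$demo"
```

A file without a #! line makes the function return a non-zero status and print nothing, so a caller could fall back to whatever other heuristics it has.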

nima commented 10 years ago

I should note that, given that linguist utilizes statistical analysis, there's a good chance my function names are throwing it off, as I use the colon character far more than you would find in a typical shell script. As you may or may not know, a colon is a valid character in bash function names, and I've used it to segregate my bash functions into pseudo-modules, for example:

function module:function() {
    ...
}

I once wrote a proof-of-concept language detector myself, which used a 2-layer neural network and, like linguist, could classify a .h file vs a .c file. Something like this would definitely have confused it. I don't know how linguist does its magic, but I thought I should share this in case it helps :)

tnm commented 10 years ago

Thanks. Can you provide a link to the repo/file if it's open source?

nima commented 10 years ago

Of course. Silly of me not to do so in my first post, in fact, given that it's hosted on GitHub :/

https://github.com/nima/site

nima commented 10 years ago

I've submitted a pull request (my first one ever, in fact!) with a small patch to allow a language/interpreter override using a keyword anywhere in the first 5 lines; issue #894.
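
The general idea can be sketched as below. Note that the marker text lang-override: and the function name language_override are placeholders invented for this illustration; the keyword and implementation in the actual patch may well differ:

```shell
# Hypothetical sketch: look for an override marker in the first 5 lines.
# "lang-override:" is an invented keyword, not the one used by the patch.
language_override() {
    head -n 5 "$1" |
        sed -n 's/.*lang-override:[[:space:]]*\([A-Za-z0-9_+-]\{1,\}\).*/\1/p' |
        head -n 1
}

# Demo: a sourced library with no shebang and no extension.
demo=$(mktemp)
printf '# helper library\n# lang-override: bash\nfoo() { :; }\n' > "$demo"
language_override "$demo"       # prints: bash
rm -f "$demo"
```

If no marker appears within the first 5 lines, the function prints nothing, so normal detection would apply.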

arfon commented 9 years ago

https://github.com/github/linguist/pull/1515 should fix this.