github-linguist / linguist

Language Savant. If your repository's language is being reported incorrectly, send us a pull request!
MIT License

Prolog interpreted as Perl #435

Closed maebert closed 10 years ago

maebert commented 11 years ago

.pl files containing Prolog code are often classified as Perl.

Related to #394, which basically makes the page listing the most popular Prolog repos completely pointless as exactly none of the projects listed there contain any Prolog source files...

tnm commented 11 years ago

Yep, known issue. Hope to have some fixes here and in related mis-categorizations soon.

pmoura commented 10 years ago

Linguist is once again misclassifying Prolog files as Perl files. This is a quite recent regression, possibly introduced today.

pmoura commented 10 years ago

The problem seems to be that all samples of Prolog source code have been deleted from the repository!

pmoura commented 10 years ago

Culprit is commit f6034b85fbc78d3f332e52db03441cbced05fefb. Allow me to make this crystal clear: nobody in the Prolog programming community is using .prolog as an extension for Prolog source files. The de facto standard extension for Prolog source files is .pl. Thus, all repositories with Prolog code will start being reclassified as Perl as soon as their language statistics are recalculated using the current linguist version. It is already bad enough that Perl developers fucked up big time by overtaking a file extension, .pl, used by Prolog, a language that was already 15 years old by the time Perl appeared. Why undo the work done in linguist to disambiguate between the two languages?!?

tnm commented 10 years ago

Please take a look at relevant commits over the past two days. Thanks.

pmoura commented 10 years ago

The removed samples provided a much more comprehensive coverage of the Prolog language compared with the two samples that replaced them. That said, only recalculating the language statistics of a few repositories with Prolog source code will show the existence of possible issues. I can point out some of these repositories for testing if necessary.

pmoura commented 10 years ago

Is the new code already in use? All the repositories with Prolog code and zero Perl code are still misclassified. See e.g. https://github.com/LogtalkDotOrg/logtalk3 and https://github.com/pdt-git/public

pmoura commented 10 years ago

A couple of examples of repositories with no Perl code, only Prolog code, and showing wrong language statistics are https://github.com/Anniepoo/prolog-examples and https://github.com/ganeshkumaruk/Prolog

z5h commented 10 years ago

+1 please identify .pl files as Prolog when they are Prolog.

MonkeyIsNull commented 10 years ago

Bump up for Prolog files still being incorrectly identified. Here are some examples: https://gist.github.com/MonkeyIsNull/8797442 https://gist.github.com/MonkeyIsNull/7575854 https://gist.github.com/MonkeyIsNull/6772168

wouterbeek commented 10 years ago

+1

Anniepoo commented 10 years ago

Please sir, may we have an extension? Or at least revert f6034b8. Seriously, we're fighting an uphill battle to be taken seriously as a language community. Fixing this would be appreciated.

luxe commented 10 years ago

+1 please identify .pl files as Prolog when they are Prolog. Even the file extension ".prolog" shows up as "Other" when identified.

jansegre commented 10 years ago

+1 also experiencing this on my repository: https://github.com/jansegre/jwar which contains some prolog and no perl at all.

luiscleto commented 10 years ago

+1 here as well. Two of my repositories contain only Prolog code and no Perl, but the code got identified as Perl. In https://github.com/luiscleto/feup-plog-hanjie-solver, some files got detected as Prolog, but most didn't.

guenterk commented 10 years ago

+1 On our https://github.com/pdt-git/public repository we have only Prolog and Java code. Most of the Prolog code is misclassified as Perl although all files use the same extension: the language statistics count Prolog as Perl.

Anniepoo commented 10 years ago

Just to let you know, we're still frustrated by this.

Anniepoo commented 10 years ago

And it's still an issue. It looks like my newer code is being correctly classified, but my older code is still listed as Perl.

pmoura commented 10 years ago

Maybe the language statistics are only recalculated on new commits?

Enerccio commented 8 years ago

I don't think this is fixed...

arfon commented 8 years ago

Can you link to a file that's being incorrectly classified?

Enerccio commented 8 years ago

https://github.com/jansegre/jwar this one still reports perl

arfon commented 8 years ago

Ah ok. It hasn't been updated for over a year and language stats are only updated when a push is made to the code.

jansegre commented 8 years ago

I'll make a dummy push then. Thanks.

On Sun, Nov 8, 2015, 12:24 Arfon Smith wrote:

Ah ok. It hasn't been updated for over a year and language stats are only updated when a push is made to the code.


Anniepoo commented 8 years ago

we're accumulating a bunch of them that are misclassified again

https://github.com/search?l=perl6&p=5&q=put_attr&type=Code&utf8=%E2%9C%93

arfon commented 8 years ago

@Anniepoo - looks like the class keyword is causing trouble in the Perl6 regex code: http://rubular.com/r/t6eSZ338M1

Anniepoo commented 8 years ago

Thanks for the quick analysis! Yes, both class and module are used by Prolog as well - module is particularly problematic, as module and use_module are the Prolog equivalents of Java's package and import statements.
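
To illustrate why keyword matching is fragile here, the following is a minimal sketch and not Linguist's actual heuristic. Both regular expressions, the sample strings, and the function name are hypothetical. The point is only that a Prolog `:- module(...)` directive contains the same word a Perl 6 detector may key on, and that requiring the keyword to open a declaration avoids the collision.

```python
import re

# Hypothetical, deliberately naive pattern: any line mentioning the
# keyword `class` or `module` is treated as Perl 6.
NAIVE_PERL6 = re.compile(r"\b(?:class|module)\b")

# Hypothetical stricter pattern: the keyword must open the line
# (optionally after `my` or `unit`) and be followed by an identifier,
# so a Prolog directive such as `:- module(...)` no longer qualifies.
STRICT_PERL6 = re.compile(r"^\s*(?:(?:my|unit)\s+)?(?:class|module)\s+[A-Za-z_]")


def looks_like_perl6(source: str, pattern: re.Pattern) -> bool:
    """Return True if any line of `source` matches `pattern`."""
    return any(pattern.search(line) for line in source.splitlines())


# Illustrative inputs, not taken from any repository in this thread.
PROLOG_SAMPLE = ":- module(lists_demo, [last/2]).\n:- use_module(library(lists)).\n"
PERL6_SAMPLE = "class Point {\n    has $.x;\n}\n"
```

With these inputs the naive pattern flags the Prolog sample as Perl 6, while the stricter pattern flags only the Perl 6 sample. A real classifier would combine several such signals rather than rely on a single expression.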

Anniepoo commented 8 years ago

got a recruit to do a fix, stay tuned

batchmcnulty commented 6 years ago

Great. Now it thinks one of my perl programs is a Prolog program! >:-(

Anniepoo commented 6 years ago

@batchmcnulty , can you point us at the file(s)?

Should we open a new bug? This one's closed

batchmcnulty commented 6 years ago

Thanks - was happening to uberscan at

https://github.com/batchmcnulty/uberscan

but I've fixed it by adding the .gitattributes file containing this text:

*.pl linguist-language=Perl

which seems to fix the problem. phew
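
For readers wondering which files a line like this covers, here is a rough sketch of the lookup. It is an assumption-laden simplification, not Git's or Linguist's implementation: the function name `language_override` is hypothetical, and real gitattributes pattern matching has more rules than the plain glob matching used here.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional


def language_override(gitattributes: str, path: str) -> Optional[str]:
    """Return the linguist-language value that applies to `path`, if any.

    Simplification: patterns without a slash are matched against the file
    name only, other patterns against the whole path.
    """
    result = None
    for raw in gitattributes.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attrs = line.split()
        target = path if "/" in pattern else PurePosixPath(path).name
        if not fnmatchcase(target, pattern):
            continue
        for attr in attrs:
            if attr.startswith("linguist-language="):
                # A later matching line overrides an earlier one.
                result = attr.split("=", 1)[1]
    return result
```

Under these assumptions, `language_override("*.pl linguist-language=Perl", "src/scan.pl")` yields `"Perl"`, while a path such as `README.md` yields `None` and is left to normal detection.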

batchmcnulty commented 6 years ago

It's not a Prolog program, so unless I'm missing something, breaking it for Prolog isn't a problem. It's a Perl program so it only runs on Perl. If you tried to run it as a Prolog program, you'd be out of luck anyway, right? O.o

Anniepoo commented 6 years ago

Oh, neat! I didn't know about this. 8cD will spread in prolog community.

Anniepoo commented 6 years ago

and re 'its not a Prolog program, so...' , yeah , I misunderstood

egryaznov commented 6 years ago

+1. After 5 (sic!) years this is still relevant: https://github.com/mxw/vim-prolog

jansegre commented 6 years ago

@egryaznov from my understanding, the repository needs to receive a push to trigger linguist to run, so that one will stay like that if left untouched.

lildude commented 6 years ago

@egryaznov from my understanding, the repository needs to receive a push to trigger linguist to run, so that one will stay like that if left untouched.

Correct. We do not go back and update the cached language stats whenever we update Linguist, as it would be prohibitively resource intensive to do this for every repo on GitHub.com. That repo hasn't been touched in over 5 years and thus shows the results of the last analysis.

Linguist has moved on quite a bit in that time and if you fork that repo, a new analysis will be performed on the fork and you'll see the .pl file is correctly identified as Prolog now.

vagman commented 3 years ago

I still get this! Only workaround is to call your files .prolog but the problem here is that Visual Studio Code reads it as Plain Text then...the Prolog plugin won't work properly.

lildude commented 3 years ago

@vagman can you please open a new issue linking to your repo experiencing this as Prolog uses the .pl extension as the default now:

https://github.com/github/linguist/blob/7c2adbdb15d4efd25d92cd5ed20d0025f0d32d28/lib/linguist/languages.yml#L4262-L4275

... so if it's not detecting your files correctly, something else may be at play here.

ProphetPX commented 3 years ago

It's now May 2021!!!! Hello is anything being done to fix this? My PERL files are showing up as Prolog just simply because they have a .pl filename extension! :( https://github.com/ProphetPX/PF1eCharBuilder/

pmoura commented 3 years ago

@ProphetPX Poetic justice? Given that Prolog was using the .pl extension long before Perl was created? 😛 Kidding aside, a quick fix is for you to add a .gitattributes to the root of your repo with the following contents:

*.pl  linguist-language=Perl

You may need to push some commits for this solution to take effect, however. Hope this helps.

lildude commented 3 years ago

It's now May 2021!!!! Hello is anything being done to fix this?

It is indeed May 2021 and Linguist still relies on the community for constructive contributions for discrepancies like this.

That said, you kinda made it harder for Linguist than it really needs to be, even before the extension is taken into account: your first line is not a valid shebang and looking for the shebang is the first thing Linguist does.

Fixing your shebang will solve your issue. I've sent you a PR for the fix 😁.
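
The thread does not show the offending first line, so the following is only a sketch of what a shebang check might look like, not Linguist's code. The function name `interpreter_from_shebang` and the regular expression are hypothetical. It treats a first line as a valid shebang only if it starts with `#!` at the very first character, and it handles the common `env` form.

```python
import re
from typing import Optional

# A shebang must start at the very first character of the file with "#!".
SHEBANG = re.compile(r"^#!\s*(\S+)(?:\s+(\S+))?")


def interpreter_from_shebang(first_line: str) -> Optional[str]:
    """Return the interpreter named by a shebang line, or None if invalid."""
    match = SHEBANG.match(first_line)
    if match is None:
        return None
    command, argument = match.group(1), match.group(2)
    # Keep only the last path component, e.g. "/usr/bin/perl" -> "perl".
    name = command.rsplit("/", 1)[-1]
    # `#!/usr/bin/env perl` names the interpreter in the first argument.
    if name == "env" and argument:
        name = argument.rsplit("/", 1)[-1]
    return name
```

In this sketch both `#!/usr/bin/perl -w` and `#!/usr/bin/env perl` resolve to `perl`, whereas a line with a leading space or a gap between `#` and `!` yields `None`, leaving the decision to later signals such as the file extension.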