duckduckgo / zeroclickinfo-spice

DuckDuckGo Instant Answers based on JavaScript (JSON) APIs
https://duckduckhack.com/

CodeSearch Issue #65

Closed boyter closed 12 years ago

boyter commented 12 years ago

https://github.com/duckduckgo/zeroclickinfo-spice/blob/master/lib/DDG/Spice/CodeSearch.pm

I just noticed a bug that has crept into this. The "lang:" prefix should only be applied to language words.

E.g.

code python foreach TO lang:python foreach
code php $this-> TO $this-> lang:php
code goto TO goto lang:perl

As it is you end up with queries like,

code foreach TO lang:foreach

Which returns nothing useful at all, as the lang syntax is trying to restrict to the language "foreach", which is nonsensical. The syntax is copied from the original Google Code Search: http://en.wikipedia.org/wiki/Google_Code_Search

Noticed when going through some logs. An example that's on the live site would be,

http://duckduckgo.com/?q=code+foreach
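
A minimal sketch of the difference between the observed and the intended rewrite, in Python rather than the module's Perl. The function names, the trigger list and the language list are hypothetical, and the buggy version is only inferred from the queries seen in the logs, not taken from CodeSearch.pm. What to send when no language is found (here: the bare terms) is also an assumption.

```python
# Hypothetical trigger words and language list, for illustration only.
TRIGGERS = ("code", "example")
KNOWN_LANGUAGES = {"python", "php", "perl", "java", "c"}


def strip_trigger(query):
    """Drop a leading trigger word and return the remaining words."""
    words = query.split()
    if words and words[0].lower() in TRIGGERS:
        words = words[1:]
    return words


def buggy_rewrite(query):
    """Inferred current behaviour: 'lang:' is glued onto whatever follows."""
    return "lang:" + " ".join(strip_trigger(query))


def intended_rewrite(query):
    """Only prefix 'lang:' when the first word is a known language."""
    words = strip_trigger(query)
    if len(words) > 1 and words[0].lower() in KNOWN_LANGUAGES:
        return "lang:" + words[0].lower() + " " + " ".join(words[1:])
    # Assumption: with no recognised language, send the terms unfiltered.
    return " ".join(words)
```

With these definitions, `buggy_rewrite("code foreach")` gives `lang:foreach`, while `intended_rewrite("code python foreach")` gives `lang:python foreach`.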

boyter commented 12 years ago

Had a dig and found the following examples of searches which are hitting the api, but are probably not useful.

lang:ehcache
lang:battle field 3 multiplayer ps3
lang:pthread_mutex
lang:ehcache
lang:csv file
lang:att wireless coupon
lang:mediawiki inline
lang:chmod for read write execute
lang:ehcache.xml
lang:fire emblem 8 cheat
lang:xfetchname
lang:android doubletwist retail
lang:yacc source
lang:search for copy and paste
lang:pthread_mutex_t
lang:plantronics 360 bluetooth pairing

Some would produce useful results EG,

pthread_mutex_t

I have modified the searchcode end to ignore nonsensical language lookups, e.g.

http://duckduckgo.com/?q=example+pthread_mutex
http://duckduckgo.com/?q=example+pthread_mutex_t
http://duckduckgo.com/?q=example+ehcache

now return something which makes sense.
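
Roughly, the server-side fallback could look like the Python sketch below. This is not the searchcode implementation: the function name and the language list are made up for illustration, and keeping the unknown name as an ordinary search term is an assumption based on queries like pthread_mutex_t still returning results.

```python
# Hypothetical list; the real service has its own set of languages.
KNOWN_LANGUAGES = {"python", "php", "perl", "java", "c"}


def drop_unknown_lang_filter(query):
    """Keep 'lang:x' only when x is a known language.

    For an unknown x the 'lang:' prefix is removed and x is kept
    as a plain search term (an assumption, see the note above).
    """
    terms = []
    for token in query.split():
        if token.lower().startswith("lang:"):
            name = token[len("lang:"):]
            if name.lower() in KNOWN_LANGUAGES:
                terms.append(token)   # valid filter, leave untouched
            elif name:
                terms.append(name)    # nonsensical filter, keep the word
        else:
            terms.append(token)
    return " ".join(terms)
```

So `lang:pthread_mutex_t` would be searched as `pthread_mutex_t`, while `lang:python foreach` passes through unchanged.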

moollaza commented 12 years ago

@boyter sorry for the delayed response, and thanks for letting me know about this bug. I think the bigger problem is that the spice triggers simply on "code" or "example" and follows through with a result even when no language is present. I thought I had written the spice to require one of your keywords, but my code isn't doing that, so I believe there is an error in my regex. I'll fix it up and we can go from there.

EDIT: Also I think I'll use the regex optimizer I previously used for the ExpandURL Spice to make the regex a little more efficient

boyter commented 12 years ago

No problem at all. I just fixed it on my end as much as possible so the results shouldn't be too bad now. The experience was really crappy before I made the modification as it was always showing a make file for all results before. Looks bad for DDG and for myself, so I fixed it, and made it more resilient anyway.

In some cases it actually works out better such as the examples above as they do give what I would expect.

moollaza commented 12 years ago

@boyter I think I figured it out! Small mistake: the list of words in the $words variable wasn't surrounded by '\b', so the "|c|" alternative matched any word containing the letter c. Hence "example foreach" triggered but "example testing" didn't.

Now it will force your keywords to be present. Better?
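
The effect of the missing word boundaries can be reproduced with a small Python sketch. The keyword list here is a made-up subset, not the real $words, and the patterns are applied to the text after the trigger word.

```python
import re

# Hypothetical subset of the language keywords.
WORDS = "c|python|php|perl|java"

# Without \b, the single-letter alternative "c" matches inside other words.
LOOSE = re.compile("(?:" + WORDS + ")", re.IGNORECASE)

# With \b on both sides, only whole words count as language keywords.
STRICT = re.compile(r"\b(?:" + WORDS + r")\b", re.IGNORECASE)


def has_language_loose(text):
    return LOOSE.search(text) is not None


def has_language_strict(text):
    return STRICT.search(text) is not None
```

Here `has_language_loose("foreach")` is true because of the "c" in "foreach", and `has_language_loose("testing")` is false. The strict version rejects both and still accepts "c pointers" or "python foreach".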

boyter commented 12 years ago

@moollaza Yep, with the only exception being that you are assuming the language keyword is at the beginning of the query, which might not be the case. E.g.

example csv validator java TO lang:csv validator java

Would try to match on the language csv which does not exist. It's taken care of on the searchcode side though as I strip out the lang: portion where there is a language which matches nothing. I might modify it on that side anyway to search for keywords like java in the case of there being no lang syntax present and restrict it that way.

moollaza commented 12 years ago

@boyter I had thought about that right after I made my earlier fix. I changed the regex so it now strips the language keyword out of the query, prefixes it with "lang:", then appends the rest of the search query. I pushed it to GitHub, feel free to give it a look.
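
A Python sketch of that approach, under the assumption that the first whole-word language match anywhere in the query is used. The function name and language list are hypothetical and this is not the code that was pushed.

```python
import re

# Hypothetical language list, for illustration only.
KNOWN_LANGUAGES = ("python", "php", "perl", "java", "c")
LANG_RE = re.compile(r"\b(" + "|".join(KNOWN_LANGUAGES) + r")\b", re.IGNORECASE)


def rewrite_anywhere(remainder):
    """Pull the language keyword out of the query and put 'lang:' in front.

    `remainder` is the query with the trigger word already removed.
    Returns None when no language keyword is present, meaning the
    spice should not trigger.
    """
    match = LANG_RE.search(remainder)
    if match is None:
        return None
    # Everything except the matched keyword, with whitespace normalised.
    rest = (remainder[:match.start()] + remainder[match.end():]).split()
    return " ".join(["lang:" + match.group(1).lower()] + rest)
```

For the query above, `rewrite_anywhere("csv validator java")` gives `lang:java csv validator` instead of `lang:csv validator java`.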

boyter commented 12 years ago

Looks good to me. I have been playing a bit and it certainly seems to pick things up much better now.