duckduckgo / zeroclickinfo-spice

DuckDuckGo Instant Answers based on JavaScript (JSON) APIs
https://duckduckhack.com/

add explainxkcd.com to XKCD zero-click #21

Closed gghh closed 10 years ago

gghh commented 12 years ago

Randall often writes comics that relate closely to the US cultural background, difficult to grasp for non-US folks. That's why the website http://www.explainxkcd.com popped up.

Would it be possible to include those explanations in the XKCD zero-click result? I suggest adding a button like "didn't get the catch?" so that the explanation is optional but still available.

Cheers,

dhruvbird commented 12 years ago

+1

hunterlang commented 12 years ago

@gghh cool idea, care to implement? :smile:

gghh commented 12 years ago

@hunterlang I'd love to. I am reading your docs for Spice plugins.

A few questions/remarks:

1) explainxkcd.com (EXKCD) is a WordPress blog and provides RSS feeds. I guess this would be the way to retrieve the data, but you might have better ideas. Do you?

2) I observe that the EXKCD guy consistently gives his posts the same title as Randall's XKCD comic (but without the numbers). If my spice plugin becomes a reality, I suppose I'll drop a mail to EXKCD kindly asking him not to break this convention, otherwise the eXKCDplanations are impossible to retrieve.

3) A GET on HTML pages from EXKCD isn't really a good idea, I suppose, since the URLs include the date, e.g. http://www.explainxkcd.com/2012/04/27/emotion/ and there is no guarantee that dates match across the two sites. Moreover, parsing the HTML via JavaScript in the user's browser can only lead to disaster, I think.

4) I see that in your code for the XKCD spice (file lib/DDG/Spice/Xkcd.pm) you refer to the url http://dynamic.xkcd.com/ -- what is that?

cheers,

hunterlang commented 12 years ago

@gghh a clean way to do it would be to just have an "explanation" link along with the "Prev" links. See spice.js for how the Prev link is implemented. The xkcd API includes the month, year, and title of the comic, so if you normalize the title you can easily create links to the explainxkcd blog (provided they continue to use a consistent naming scheme). The only things I'm worried about here are the clutter vs. usability tradeoff, as we try to keep the ZCI box as clean as possible, and the broken links we'd have if they change their naming scheme over on the wordpress blog.

To answer your question, http://dynamic.xkcd.com/ is xkcd's API url. We call it to get JSON results that are easily parse-able in JavaScript, as opposed to the HTML that's served from their regular site.
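
The title-normalization approach described above could look roughly like the following. This is only a sketch under stated assumptions: the helper names `slugifyTitle`, `pad2`, and `explainBlogUrl` are hypothetical, the slug rules are a guess at how the blog derives its post slugs, the `day` field is assumed to be present in the API response alongside `month`, `year`, and `title`, and the explanation post is assumed to carry the same date as the comic, which is not guaranteed.

```javascript
// Hypothetical helper: lowercase the title and turn every run of
// non-alphanumeric characters into a single hyphen (a guess at the blog's slug rules).
function slugifyTitle(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse punctuation and whitespace into one hyphen
    .replace(/^-+|-+$/g, "");    // trim hyphens left at either end
}

// Zero-pad a month or day to two digits, e.g. "4" -> "04".
function pad2(value) {
  return String(value).padStart(2, "0");
}

// Build a blog-style URL from a comic object.
// ASSUMPTION: the post is dated the same day as the comic.
function explainBlogUrl(comic) {
  return (
    "http://www.explainxkcd.com/" +
    comic.year + "/" + pad2(comic.month) + "/" + pad2(comic.day) + "/" +
    slugifyTitle(comic.title) + "/"
  );
}

// Illustrative values only, chosen to reproduce the URL quoted earlier in the thread.
console.log(explainBlogUrl({ year: "2012", month: "4", day: "27", title: "Emotion" }));
// "http://www.explainxkcd.com/2012/04/27/emotion/"
```

Any link built this way breaks as soon as the blog changes its naming scheme or publishes on a different day, which is the broken-link risk mentioned above.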

gghh commented 12 years ago

@hunterlang thanks for the prompt reply. I see your points: no data retrieval, just a link. Tomorrow I'll send an email to the explainxkcd site curator, just to see how s/he feels about this. Without his/her cooperation, this thing can't go anywhere! About usability vs. clutter: I'll see what I can do, but I have zero experience with fonts and button sizes.

gghh commented 12 years ago

@hunterlang the email has been sent. I am waiting for the answer; I will keep you posted!

gghh commented 12 years ago

@hunterlang the explainxkcd site curator, Jeff, just replied to me. He likes the idea and he's happy to collaborate.

He confirmed that he names posts consistently with the comics.

gghh commented 12 years ago

Hello @hunterlang,

You can see I am quite high latency on this. I can't go full speed because of tight deadlines on my job.

I'd like to share my latest findings on the ingredients necessary for this plugin.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
TL;DR: my enhancement to the XKCD spice is gonna need the Fathead infrastructure, I guess. We need a hashmap COMIC_TITLE --> EXKCD_URL. This can be bootstrapped by downloading the RSS XML files with

for i in {1..49}; do curl "http://www.explainxkcd.com/feed/?paged=$i" > "p$i.xml"; done

and then DDG should subscribe to the EXKCD RSS feeds (it's a WordPress blog).
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

== Long Version ==

I considered an "automatic" URL construction, without reading the RSS feeds, since the URLs in EXKCD are like http://www.explainxkcd.com/2012/05/18/klout/

But this would require the assumption that Jeff (EXKCD) always did, and always will, write the explanation on the same day as the comic comes out. This would be a big mistake; this assumption is a strong constraint for everybody and it's not gonna work.

So I believe that retrieving and caching/indexing the RSS feeds is the way to go. Now I would need some advice on how to proceed; am I allowed to exit the Spice perimeter and go into a Fathead thing?
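
The COMIC_TITLE --> EXKCD_URL hashmap could be sketched like this. It assumes the RSS pages have already been parsed into plain `{ title, link }` objects by some other step; the names `normalizeComicTitle`, `buildTitleIndex`, and `lookupExplanation` are hypothetical, and the normalization rule (trim, lowercase, collapse whitespace) is only one possible choice.

```javascript
// Hypothetical normalizer so that "Klout", " klout " and "KLOUT" share one key.
function normalizeComicTitle(title) {
  return title.trim().toLowerCase().replace(/\s+/g, " ");
}

// Build the title -> URL index from already-parsed feed items.
// ASSUMPTION: each item has the shape { title: string, link: string }.
function buildTitleIndex(feedItems) {
  const index = new Map();
  for (const item of feedItems) {
    const key = normalizeComicTitle(item.title);
    // Keep the first entry seen for a title; later duplicates are ignored.
    if (!index.has(key)) {
      index.set(key, item.link);
    }
  }
  return index;
}

// Return the explanation URL for a comic title, or null when none is indexed.
function lookupExplanation(index, comicTitle) {
  const url = index.get(normalizeComicTitle(comicTitle));
  return url === undefined ? null : url;
}

// Sample items built from the two post URLs quoted in this thread.
const sampleIndex = buildTitleIndex([
  { title: "Klout", link: "http://www.explainxkcd.com/2012/05/18/klout/" },
  { title: "Emotion", link: "http://www.explainxkcd.com/2012/04/27/emotion/" },
]);
console.log(lookupExplanation(sampleIndex, "klout"));
// "http://www.explainxkcd.com/2012/05/18/klout/"
```

Matching on titles still depends on the two sites keeping their titles in sync, so a missing entry has to be treated as "no explanation link" rather than as an error.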

FYI: as of today, 49 pages of feeds are available (they go back to 2006; the first post appears to be from Sun, 01 Jan 2006 04:12:02. I'll ask Jeff to confirm this).

Moreover, I noticed that Google Reader doesn't cache that much; the oldest feed item available via the Google Reader webapp is from 2009.

cheers

sdball commented 12 years ago

Just saw this issue (I'm the dude who wrote the XKCD spice to start with). dynamic.xkcd.com is the server that provides XKCD's jsonp api. It was apparently created just to support the XKCD CLI interface but proved to be very easy to hook into.

As for hooking in this functionality, it's a neat idea. I think if the explainxkcd guy is amenable to some modification/support then the absolute best thing would be to have something like explainxkcd.com/$xkcd_id do a 301 redirect to his actual page. It's been a while since I did WordPress programming so I don't know how feasible it is to add custom routes.

Parsing the RSS feed and trying to sync up the posts to the comics by title sounds like a brittle solution and I wouldn't recommend it.

Hmm, even more interesting is that he's apparently converting to a wiki. Maybe we could ask him to be sure to set up XKCD#-friendly URLs?

sdball commented 12 years ago

Ah, it looks like it's going to work out of the box.

http://www.explainxkcd.com/wiki/index.php?title=1090

That 1090 is the XKCD#. So generating links to add to the xkcd results should be trivial for comics going forward (and it looks like he's asking for community support to backfill the older comics).
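
A minimal sketch of that link generation, assuming the wiki keeps resolving `index.php?title=<number>` as in the URL above. The function name `explainWikiUrl` and the input validation are illustrative and are not taken from the existing spice code.

```javascript
// Base of the wiki URL quoted above.
const EXPLAIN_WIKI_BASE = "http://www.explainxkcd.com/wiki/index.php";

// Build the explanation link for a comic number.
// Accepts a number or a numeric string and rejects anything else.
function explainWikiUrl(comicNumber) {
  const n = Number(comicNumber);
  if (!Number.isInteger(n) || n <= 0) {
    throw new RangeError("comic number must be a positive integer: " + comicNumber);
  }
  return EXPLAIN_WIKI_BASE + "?title=" + n;
}

console.log(explainWikiUrl(1090));
// "http://www.explainxkcd.com/wiki/index.php?title=1090"
```

Because the link depends only on the comic number, it does not need the title or the publication date to match across the two sites.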

gghh commented 12 years ago

Thank you @sdball for the heads up. That's absolutely better than the RSS thing.

jagtalon commented 10 years ago

@gghh Still interested in this? :)

gghh commented 10 years ago

@jagtalon sorry, no time. Feel free to close the issue.

jagtalon commented 10 years ago

@gghh Thanks--we'll ask the community to see if anyone is interested. :)

mattr555 commented 10 years ago

@jagtalon I'm interested...would a link in the "More at" bar be appropriate? This is what I was thinking: [screenshot] (this will be a lot easier than before because explainxkcd.com is now a wiki)

elebow commented 10 years ago

This is actually much easier now that Explain xkcd has moved to a wiki. You only need to generate a link to e.g. http://www.explainxkcd.com/wiki/index.php/1392

jagtalon commented 10 years ago

@mattr555 I like that idea. @chrismorast What do you think of this idea? https://github.com/duckduckgo/zeroclickinfo-spice/issues/21#issuecomment-49678545

jagtalon commented 10 years ago

@elebow Oh good!

chrismorast commented 10 years ago

@mattr555 @jagtalon, I think that would work but we should match it to the format you would see in Wikipedia, with the vertical line rather than the bullet: [screenshot]

mattr555 commented 10 years ago

@chrismorast cool, check out #982.

chrismorast commented 10 years ago

:+1:

jagtalon commented 10 years ago

It's live! @gghh @elebow @mattr555