sourcemash / Sourcemash

Read Your News Faster.
GNU General Public License v2.0

Categorization bug #317

Closed swglad closed 9 years ago

swglad commented 9 years ago

Traceback (most recent call last):
  File "manage.py", line 180, in <module>
    manager.run()
  File "/Users/scottgladstone/Sourcemash/flask/lib/python2.7/site-packages/flask_script/__init__.py", line 412, in run
    result = self.handle(sys.argv[0], sys.argv[1:])
  File "/Users/scottgladstone/Sourcemash/flask/lib/python2.7/site-packages/flask_script/__init__.py", line 383, in handle
    res = handle(*args, **config)
  File "/Users/scottgladstone/Sourcemash/flask/lib/python2.7/site-packages/flask_script/commands.py", line 216, in __call__
    return self.run(*args, **kwargs)
  File "manage.py", line 119, in feed_seed
    categorize_feed_articles(feed, categorizer)
  File "/Users/scottgladstone/Sourcemash/worker/scraper.py", line 49, in categorize_feed_articles
    categories = categorizer.categorize_item(item.title, text_only)
  File "/Users/scottgladstone/Sourcemash/worker/categorize.py", line 96, in categorize_item
    self._memoize_related_articles(keyword_candidates.keys())
  File "/Users/scottgladstone/Sourcemash/worker/categorize.py", line 171, in _memoize_related_articles
    self._scrape_wiki_links(ngrams)
  File "/Users/scottgladstone/Sourcemash/worker/categorize.py", line 203, in _scrape_wiki_links
    data = json.loads(resp.text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 383, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

On this item:

{
item: {
author: "Google Blogs (noreply@blogger.com)",
categories: [ ],
feed: {
description: "Insights from Googlers into our products, technology, and the Google culture.",
id: 12,
image_url: "http://fineprintnyc.com/images/blog/history-of-logos/google/google-logo.png",
last_updated: "Tue, 19 May 2015 01:07:02 -0000",
subscribed: true,
title: "The Official Google Blog",
topic: "Technology",
unread: true,
url: "http://googleblog.blogspot.com/atom.xml"
},
id: 238,
image_url: "http://feeds.feedburner.com/~r/blogspot/MKuf/~4/Dw2p04o4XtI",
last_updated: "Tue, 05 May 2015 21:00:01 -0000",
link: "http://feedproxy.google.com/~r/blogspot/MKuf/~3/Dw2p04o4XtI/doing-more-on-diversity.html",
saved: false,
summary: "When we <a href="http://googleblog.blogspot.com/2014/05/getting-to-work-on-diversity-at-google.html">released</a> the composition of our workforce almost a year ago, it confirmed what many people suspected: the tech industry needs to do a lot more when it comes to diversity. Since then, the question I get asked most is—so what are you doing about it? <br /><br />You may have heard about some of the work we’ve been doing: <a href="http://bigstory.ap.org/article/4312d33e1cb8454a9885f230d35f0eb1/google-embeds-engineers-professors">embedding engineers</a> at Historically Black Colleges and Universities; <a href="http://www.usatoday.com/story/tech/2015/03/18/google-abc-disney-pair-up-to-promote-images-of-girls-and-computer-science/24903551/">partnering with Hollywood</a> to inspire girls to pursue careers in computer science; building <a href="http://www.postandcourier.com/article/20141003/PC1213/141009834/1003/">local initiatives</a> to introduce coding to high school students from diverse communities; and expanding our employee unconscious bias <a href="http://googleblog.blogspot.com/2014/09/you-dont-know-what-you-dont-know-how.html">training</a>. <br /><br />But these programs represent only a sampling of all the work that is going on behind the scenes. If we’re really going to make an impact, we need a holistic plan. Today, we want to <a href="http://www.usatoday.com/story/tech/2015/05/05/google-raises-stakes-diversity-spending/26868359/">share</a> our diversity strategy, which is focused on four key areas: <br /><br /><b>Hire diverse Googlers: </b>In the past, our university-focused hiring programs have relied heavily on a relatively small number of schools. But, we know those schools aren't always the most diverse. 
For example, while <a href="http://nces.ed.gov/fastfacts/display.asp?id=98">14% of Hispanic college enrollment is at 4-year schools</a>, Hispanics make up just <a href="https://www.insidehighered.com/news/2014/06/18/new-book-discusses-diversity-strategies-dont-consider-race#sthash.PJ6995ZV.HyygIdnv.dpbs">7% at the 200 most selective schools</a>. In the past two years, we've doubled the number of schools where we recruit, to promote student diversity. This year, nearly 20 percent of the hires we make from a university are from these new campuses.<br /><br /><b>Foster a fair and inclusive culture:</b> We want to ensure that we have an environment where all Googlers can thrive. We’ve raised awareness around <a href="http://googleblog.blogspot.com/2014/09/you-dont-know-what-you-dont-know-how.html">unconscious bias</a>—half of all Googlers have participated in our unconscious bias workshops—and we’ve now rolled out a hands-on workshop that provides practical tips for addressing bias when we see it. We’re also drawing on the idea of <a href="http://googleblog.blogspot.com/2006/05/googles-20-percent-time-in-action.html">20 percent</a> time to enable employees to use their time at work to focus on diversity projects. In 2015, more than 500 Googlers will participate in Diversity Core, a formal program in which employees contribute—as part of their job—to the company’s diversity efforts. <br /><br /><b>Expand the pool of technologists:</b> Making computer science (CS) education accessible and available to everyone is one of our most important initiatives. Our <a href="http://www.cs-first.com/">CS First</a> program is designed to help anyone—a teacher, a coach, or volunteer—teach kids the basics of coding. 
And since <a href="https://docs.google.com/a/google.com/file/d/0B-E2rcvhnlQ_a1Q4VUxWQ2dtTHM/edit">research</a> tells us that to inspire more girls, we need to show them that computer science isn’t just for boys, we started <a href="http://madewithcode.com/">Made with Code</a>—and we’re working with the entertainment industry to <a href="http://www.usatoday.com/story/tech/2015/03/18/google-abc-disney-pair-up-to-promote-images-of-girls-and-computer-science/24903551/">change the perceptions around CS</a> and what it means to be a computer scientist.<br /><br /><b>Bridge the digital divide:</b> We also want more underrepresented communities, including women and minorities, to share the benefits of the web, and to have access to the economic engine it provides. The <a href="http://accelerate.withgoogle.com/">Accelerate with Google Academy</a> helps business owners get online, grow and drive economic impact. <br /><br />With an organization of our size, meaningful change will take time. From one year to the next, bit by bit, our progress will inch forward. More importantly, our industry will become more inclusive, and the opportunities for currently underrepresented groups will grow. We’ll share our updated diversity data for 2015 soon. We’re gradually making progress across these four areas, and we’re in it for the long term.<br /><br /><span class="byline-author">Posted by Nancy Lee, Vice President, People Operations</span><img alt="" height="1" src="http://feeds.feedburner.com/~r/blogspot/MKuf/~4/Dw2p04o4XtI" width="1" />",
title: "Doing more on diversity",
unread: true,
vote: 0,
voteSum: 0
}
}
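
The traceback ends in `json.loads(resp.text)` raising `ValueError`, which suggests the response body was not JSON for this item (an empty body or an HTML error page would both produce this error, though the issue does not say which). A minimal sketch of a guard, assuming the caller can treat an undecodable response as "no data" and move on to the next article. The helper name `parse_json_or_none` is hypothetical and is not part of Sourcemash:

```python
import json


def parse_json_or_none(text):
    """Return the decoded JSON value, or None if `text` is not valid JSON.

    Hypothetical helper for illustration; not taken from worker/categorize.py.
    """
    try:
        return json.loads(text)
    except ValueError:
        # Python 2.7 raises a plain ValueError here (as in the traceback);
        # Python 3's json.JSONDecodeError is a ValueError subclass, so the
        # same handler covers both.
        return None


# Usage: skip the item instead of crashing the whole seeding run.
for body in ['{"query": {"pages": {}}}', "", "<html>Service Unavailable</html>"]:
    data = parse_json_or_none(body)
    if data is None:
        print("skipping: response was not JSON")
    else:
        print("decoded keys: %s" % sorted(data.keys()))
```

Returning `None` keeps the failure local to one article. Whether skipping, retrying, or logging the raw response is the right policy depends on why the request failed, which the traceback alone does not show.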