ScienceGist / sciencegist

ScienceGist - Science for everyone
http://www.sciencegist.com
MIT License

Pull in existing summaries from various APIs #7

Open jure opened 11 years ago

jure commented 11 years ago

PLOS, eLife, and BMC all publish short summaries of their papers. We should try to pull those into ScienceGist.

Ben Miles (https://twitter.com/bennmiles) started a version at #hack4ac, which goes a little like this:

require 'json'
require 'nokogiri'
require 'open-uri'

# This is a short script to get demo content to populate an initial set of 'science gists'.
# We started by trying to call the eLife API directly, but it was bad;
# then we tried to use content negotiation, but that is also flaky.
# Finally, we fetch the XML version of the paper, parse it with Nokogiri,
# and extract the XML node '<abstract abstract-type="executive-summary">'.

  class Digest
    attr_accessor :gist, :doi

    # The gist is always fetched from eLife, so the unused default gist argument is dropped
    def initialize(doi)
      @doi = doi
      @gist = doi2gist
    end

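    def doi2gist
      # Strip the resolver host so only the bare DOI (e.g. 10.7554/eLife.00782) remains
      strippeddoi = @doi.gsub(/dx\.doi\.org\//, '')
      uri = "http://elife.elifesciences.org/elife-source-xml/#{strippeddoi}"
      puts strippeddoi
      puts uri

      # Kernel#open no longer accepts URLs in Ruby 3.0+; open-uri provides URI.open
      elife_raw_xml = URI.open(uri).read
      digest_xml_path = 'abstract[abstract-type="executive-summary"]'

      xml = Nokogiri::XML(elife_raw_xml)
      digest = xml.css(digest_xml_path)
      digesttext = digest.text
      # Drop the leading object id and "eLife digest" heading.
      # Regexp.escape makes the dots in the DOI match literally.
      digesttext.gsub(/abstract-2#{Regexp.escape(strippeddoi)}\.002eLife digest/, '')
    end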
  end

#test doi1: 'dx.doi.org/10.7554/eLife.00782'
#test doi2: 'dx.doi.org/10.7554/eLife.00800'

digest = Digest.new('dx.doi.org/10.7554/eLife.00782')

puts digest.doi
puts digest.gist

jure commented 11 years ago

A first attempt at this was made in https://github.com/ScienceGist/sciencegist/commit/7a043ab3d7b1c3525dde6e70ef524880e1925274. It uses the paper_summary gem, which for now only knows how to get summaries for eLife papers. It's a start, but I'll leave this open since we want to import or link to much more than just eLife.
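
One way to grow beyond eLife could be to pick the source from the DOI's registrant prefix. This is only a sketch and not how the paper_summary gem works: SUMMARY_SOURCES and source_for are hypothetical names, and only the eLife selector comes from the script earlier in this thread. The PLOS and BMC prefixes are the ones those publishers use for their DOIs, but their selectors are left as nil placeholders rather than guessed.

```ruby
# Maps DOI registrant prefixes to a source name and the CSS selector for
# that source's lay summary in its article XML.
# nil selectors are placeholders that still need to be worked out.
SUMMARY_SOURCES = {
  '10.7554' => { name: 'eLife', selector: 'abstract[abstract-type="executive-summary"]' },
  '10.1371' => { name: 'PLOS',  selector: nil },
  '10.1186' => { name: 'BMC',   selector: nil }
}.freeze

# Returns the source entry for a DOI, or nil when the publisher is unknown.
# Accepts bare DOIs and ones prefixed with a resolver host such as dx.doi.org/.
def source_for(doi)
  bare = doi.sub(%r{\A(?:https?://)?(?:dx\.)?doi\.org/}, '')
  prefix = bare.split('/', 2).first
  SUMMARY_SOURCES[prefix]
end

puts source_for('dx.doi.org/10.7554/eLife.00782')[:name]
```

An unknown prefix returns nil, so callers can fall back to linking to the paper instead of importing a summary.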

AnneTheAgile commented 10 years ago

I see that multiple summaries can be submitted for a paper, but is there a way to tag them by the source's reputation? Maybe the upvote will be enough. This idea immediately reminded me of my MS/PhD classes, in which we were assigned to read, summarize, and report on papers. +1 for such a great project!