OscarPDR / labman_ud

Web app to manage all the data related to one or more Research Units. It lets you describe researchers, projects, publications, funding programs, etc. in order to display them clearly and to create interactive graphs that analyse the performance of the unit(s).
http://morelab.deusto.es
GNU General Public License v3.0

Show "citations" on the publications section #68

Open porduna opened 10 years ago

porduna commented 10 years ago

It would be cool to show "citations" in each publication, so you can see the publications which cite each publication. This would be a simple link to Google Scholar pointing to this publication.

Doing so semi-automatically is not too difficult, but it requires changing the model to add a new field, such as "scholar_id", so that the visualization can provide the link.
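As a rough illustration of what the new field would enable, here is a minimal sketch using a plain dataclass. `Publication` and `citations_url` are hypothetical names, not the project's actual model, and the URL prefix is the Google Scholar "cites" link format described in this comment.

```python
from dataclasses import dataclass
from typing import Optional

# Link format taken from this issue; the rest is illustrative.
SCHOLAR_CITES_URL = "http://scholar.google.es/scholar?cites="


@dataclass
class Publication:
    """Hypothetical stand-in for the real publication model."""
    title: str
    scholar_id: Optional[str] = None  # proposed new field, empty until looked up

    @property
    def citations_url(self) -> Optional[str]:
        # Without an identifier there is nothing to link to.
        if not self.scholar_id:
            return None
        return SCHOLAR_CITES_URL + self.scholar_id
```

A template could then render the link only when `citations_url` is not `None`.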

One way to do this would be to query Google Scholar with the quoted title of the paper followed by its authors:

scholar.google.com/scholar?q=%22TITLE_OF_THE_PAPER%22author1%20author2
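A minimal Python 3 sketch of building that query URL follows. The function name `build_scholar_query_url` is an assumption, and so is the space inserted between the quoted title and the author names.

```python
from urllib.parse import quote


def build_scholar_query_url(title, authors):
    """Build a Scholar search URL: the exact title in quotes, then the authors."""
    # Assumption: a single space separates the quoted title from the authors.
    query = '"{}" {}'.format(title, " ".join(authors)).strip()
    # safe="" percent-encodes everything reserved, so '"' becomes %22
    # and spaces become %20, as in the pattern above.
    return "http://scholar.google.com/scholar?q=" + quote(query, safe="")
```

Passing an empty author list yields a query containing only the quoted title.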

In the returned page, find the first match of the regular expression "cache:([^:]+):", which comes from a link like the following:

<a href="http://scholar.googleusercontent.com/scholar?q=cache:bPrZe0NyrG0J:scholar.google.com/+Learning+Analytics+on+federated+remote+laboratories:+tips+and+techniques&amp;hl=es&amp;as_sdt=0,5" class="gs_nvi">Versión en HTML</a>

The captured group is the identifier, encoded in base64. You can then use:

http://scholar.google.es/scholar?cites=bPrZe0NyrG0J

to show the publications that cite it.
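The extraction step can be tried offline against the link shown above. This is only a sketch in Python 3: `extract_scholar_id` and `cites_url` are hypothetical helper names, and returning `None` when nothing matches is an assumption about how a missing paper should be handled.

```python
import re

CACHE_TOKEN_RE = re.compile(r"cache:([^:]+):")


def extract_scholar_id(html):
    """Return the first cache token found in the page, or None if there is none."""
    match = CACHE_TOKEN_RE.search(html)
    return match.group(1) if match else None


def cites_url(scholar_id):
    """Build the 'cited by' link for a given identifier."""
    return "http://scholar.google.es/scholar?cites=" + scholar_id


# The sample link quoted in this issue.
sample = (
    '<a href="http://scholar.googleusercontent.com/scholar?q=cache:bPrZe0NyrG0J:'
    'scholar.google.com/+Learning+Analytics+on+federated+remote+laboratories:'
    '+tips+and+techniques&amp;hl=es&amp;as_sdt=0,5" class="gs_nvi">Versión en HTML</a>'
)

token = extract_scholar_id(sample)
if token is not None:
    print(cites_url(token))
```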

Example:


import re
import urllib2

req = urllib2.Request("http://scholar.google.es/scholar?q=%22Towards+federated+interoperable+bridges+for+sharing+educational+remote+laboratories%22", headers = { 'User-Agent' : 'Google Chrome' })

urlobj = urllib2.urlopen(req)
contents = urlobj.read()

regex = re.compile(r"cache:([^:]+):")
token = regex.findall(contents)[0]
print token

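# This prints: bPrZe0NyrG0J
# To convert it to the numeric form: decode the base64 token, reverse the
# bytes, drop the first byte of the reversed string, and read the rest as hex.
hex_token = ""
for byte in token.decode('base64')[::-1][1:]:
    hex_token += hex(ord(byte)).split('0x')[1].zfill(2)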

print int(hex_token, 16)

# Both identifiers (the base64 token and the number) work in:
#
# "http://scholar.google.es/scholar?cites=" + identifier
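The snippet above is Python 2 (`urllib2`, `print` statement, `str.decode('base64')`). As a sketch, the numeric conversion could be ported to Python 3 as follows. `scholar_token_to_int` is a hypothetical name. Reversing the bytes and dropping the first one is the same as dropping the last decoded byte and reading the rest as a little-endian integer.

```python
import base64


def scholar_token_to_int(token):
    """Convert the base64 cache token into its numeric form."""
    raw = base64.b64decode(token)
    # Same result as the loop above: drop the trailing byte,
    # then interpret the remaining bytes as little-endian.
    return int.from_bytes(raw[:-1], "little")


print(hex(scholar_token_to_int("bPrZe0NyrG0J")))  # 0x6dac72437bd9fa6c
```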

If the paper is not yet in Google Scholar, no citations link should be added.

porduna commented 10 years ago

@OscarPDR could you please add a "scholar_id" field in the abstract publication you're working on?