angelmoratilla / digressit

Automatically exported from code.google.com/p/digressit

Blank pages on upgrade from v2 #113

Closed by GoogleCodeExporter 8 years ago

GoogleCodeExporter commented 8 years ago
Could it be because of the use of shortcodes?

http://linkeddata.jiscpress.org/examples-of-success/

should display this:

The early examples of publishing Linked Data tended to be undertaken as
experiments, or as part of the work of academics researching the
Semantic Web. This work was valuable, and taught the community much
about the issues that would need to be overcome. More recently, large
organisations have recognised the potential value of Linked Data, and
they have begun to publish their own content in this way.
<h2>BBC</h2>
[caption id="attachment_59" align="aligncenter" width="416" caption="BBC
reuses data from Wikipedia and MusicBrainz to build pages for every
artist or band"]<a
href="http://www.bbc.co.uk/music/artists/a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432"><
img
class="size-full wp-image-59  " title="BBC Music Artists"
src="http://linkeddata.jiscpress.org/files/2010/02/BBC.png" alt=""
width="416" height="325" /></a>[/caption]

The BBC recognises the value of Linked Data[1], and puts these
principles to work in a number of recent initiatives, including their
Programmes[2] and Music[3] sites. The same approaches are currently
being applied to the corporation’s Natural History content[4], with
discrete identifiers for animals, species, habitats etc.

In each case, concepts (an episode, a series, a performer, a track, an
animal) are assigned unique and persistent web URIs (Berners-Lee’s first
and second rules). Human-readable content is available, as well as
representations in RDF, XML, JSON, etc., that are intended for
interpretation by software tools (Berners-Lee’s third rule). The Music
site re-uses identifiers created by MusicBrainz[5], and displays
descriptive content provided by contributors to Wikipedia and
MusicBrainz. BBC editorial enhancements are contributed back to
MusicBrainz, improving the quality of content available there. Rather
than following the more traditional model of specifying, procuring and
validating all content in-house, the BBC is actively exploring the
opportunities offered by participating in community efforts to build and
maintain valuable resources.
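
To make the third rule concrete, the artist URI shown above can be dereferenced and asked for a machine-readable representation via HTTP content negotiation. The sketch below uses Python's standard library; whether the endpoint still honours an RDF Accept header in this way is an assumption rather than something verified here.

```python
# Sketch: dereference a BBC Music artist URI and request an RDF
# representation via content negotiation. The Accept handling is assumed
# and may have changed since this was written.
import urllib.request

ARTIST_URI = ("http://www.bbc.co.uk/music/artists/"
              "a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432")

req = urllib.request.Request(
    ARTIST_URI,
    headers={"Accept": "application/rdf+xml"},  # ask for the machine-readable form
)
with urllib.request.urlopen(req) as resp:
    print(resp.headers.get("Content-Type"))  # ideally application/rdf+xml
    print(resp.read()[:500])                 # first few hundred bytes of the graph
```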

The approach being taken by the BBC makes it easier for them to refer in
a fine-grained manner to their own content across different properties,
and enables them to benefit from externally sourced content such as that
on MusicBrainz. The approach also has the added benefit of exposing
valuable BBC resources to third party application developers in a manner
that makes it straightforward to build products incorporating BBC
content. One early example of this is fanhu.bz[6], which builds
communities of interest around BBC programmes discussed on Twitter.
<p style="text-align: center;"></p>

[caption id="attachment_60" align="aligncenter" width="416"
caption="Fanhu.bz, displaying data from Twitter and the BBC about Doctor
Who"]<a href="http://fanhu.bz/b006q2x0"><img class="size-full
wp-image-60  " title="Fanhu.bz"
src="http://linkeddata.jiscpress.org/files/2010/02/fanhu.png" alt=""
width="416" height="325" /></a>[/caption]
<p style="text-align: center;"></p>

<h2>New York Times</h2>
Earlier this year, the <em>New York Times</em> announced its
intention[7] to enable access to its thesaurus of more than a million
terms describing people, places, organisations, subjects and creative
works reported in the paper.

In October, the paper released the first set of data[8]: 5,000 personal
names mapped to additional data from Freebase and DBpedia.

As with the BBC examples, data from the <em>New York Times</em> is made
available in both human readable[9] and machine readable[10] form,
simplifying the process of exposing data to browsers visiting a web page
and to software aggregating data for some third party application.
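
As an illustration, the machine-readable form cited in footnote [10] can be loaded straight into an RDF graph. rdflib is an illustrative choice rather than anything the New York Times prescribes, and the URL may no longer resolve:

```python
# Sketch: parse the RDF/XML description of a New York Times subject
# heading (footnote [10]) and print a handful of its triples.
from rdflib import Graph  # third-party package: pip install rdflib

NYT_RDF = "http://data.nytimes.com/N66220017142656459133.rdf"

g = Graph()
g.parse(NYT_RDF, format="xml")        # the .rdf form is RDF/XML
for subj, pred, obj in list(g)[:10]:  # a few triples are enough to show the shape
    print(subj, pred, obj)
```
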
<h2>Thomson Reuters</h2>
With Open Calais[11], Thomson Reuters offers a free web service that may
be used to identify and extract named entities, facts and events from
text submitted to it. The service accepts unstructured text submitted in
HTML, XML and related formats, and returns a version of the text
enriched with additional structure.

Given the heritage of Thomson Reuters, the service tends to be most
relevant to business and financial applications, but it succeeds in
adding value to a wide range of resource types by extracting meaning
from text, adding structure and context, and offering links to a wealth
of supporting data from within Thomson Reuters’ databases and the third
party content of Freebase and others. A passing reference to ‘IBM’ in
text submitted to the Open Calais web service, for example, would be
recognised and create the possibility for enrichment with any or all of
the additional information known to Thomson Reuters[12] (financial
filings, board members, competitors, etc.) or any of the third party
services with which Calais shares a common identifier.

In a simple illustration, I copied the first paragraph of UKOLN’s
‘About’ page,[13] stripped out the URLs, and pasted it into the Calais
Viewer tool to achieve the result below.
<p style="text-align: center;"></p>

[caption id="attachment_61" align="aligncenter" width="416" caption="A
paragraph of text from the UKOLN web site, analysed by Open Calais"]<a
href="http://viewer.opencalais.com/"><img class="size-full wp-image-61
" title="Open Calais"
src="http://linkeddata.jiscpress.org/files/2010/02/opencalais.png"
alt="" width="416" height="325" /></a>[/caption]

The true potential lies in automated use of the API, rather than manual
pasting of demonstration text into a web page, and there is clear scope
for individual Higher Education institutions to exploit the connections
that a tool such as this identifies.
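
A minimal sketch of that automated use follows. The endpoint and header names reflect the Calais REST interface as it was documented at the time and should be read as assumptions; a free API key is required, and the placeholder text stands in for whatever paragraph is being analysed.

```python
# Sketch: submit plain text to the Open Calais REST API instead of pasting
# it into the Viewer page. Endpoint and header names are assumptions here.
import urllib.request

CALAIS_ENDPOINT = "http://api.opencalais.com/tag/rs/enrich"  # assumed endpoint
API_KEY = "YOUR-CALAIS-API-KEY"                              # placeholder key

text = "Paste the paragraph to analyse here, e.g. the UKOLN 'About' text."

req = urllib.request.Request(
    CALAIS_ENDPOINT,
    data=text.encode("utf-8"),
    headers={
        "x-calais-licenseID": API_KEY,  # assumed header name for the key
        "Content-Type": "text/raw",     # plain, unstructured text input
        "Accept": "application/json",   # ask for a JSON response
    },
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))  # entities, facts and events found
```
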
<h2>Freebase</h2>
San Francisco-based Freebase[14] is a community-maintained ‘free
database of the world’s information,’ backed by significant venture
capital.[15] Built upon proprietary database infrastructure, the site
offers straightforward tools for expressing rich semantics and structure
without directly using the specifications of W3C’s Semantic Web stack[16].

Towards the end of 2008 Freebase launched[17] a new RDF service[18] that
enabled responses to API calls to be returned in RDF, making Freebase
content available to those building Linked Data applications.
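
A rough sketch of consuming that service is given below; the /rdf/<topic id> URL pattern and the example topic id are assumptions for illustration, and the service may no longer be available.

```python
# Sketch: fetch RDF for a Freebase topic from the RDF service (footnote [18]).
# The URL pattern and topic id are illustrative assumptions.
import urllib.request

TOPIC = "en.tim_berners-lee"                  # hypothetical topic id
url = "http://rdf.freebase.com/rdf/" + TOPIC  # assumed URL pattern

req = urllib.request.Request(url, headers={"Accept": "application/rdf+xml"})
with urllib.request.urlopen(req) as resp:
    print(resp.read()[:500])                  # start of the returned description
```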
<p style="text-align: center;"></p>

[caption id="attachment_62" align="aligncenter" width="416"
caption="Freebase"]<a href="http://www.freebase.com/"><img
class="size-full wp-image-62  " title="Freebase"
src="http://linkeddata.jiscpress.org/files/2010/02/freebase.png" alt=""
width="416" height="325" /></a>[/caption]
<p style="text-align: center;"></p>

<h2>UK Government</h2>
Prime Minister Gordon Brown announced[19] in June that the UK Government
intended to make far more of their data easily available online for use
and re-use. Sir Tim Berners-Lee was drafted in to help and, far from
simply being a figurehead, became actively involved in working with a
range of Government departments to make data available online.

Some 3,000 data sets are already available on the Government’s data
site, data.hmg.gov.uk, and last month’s <em>Putting the frontline
first</em> document reiterates the promise that plenty more will follow.
As well as simplifying access to previously ‘available’ data, there has
also been success in changing attitudes to data from Ordnance Survey,
the Post Office and other agencies that previously charged significant
fees for access.
<p style="text-align: center;"></p>

[caption id="attachment_63" align="aligncenter" width="416" caption="The
UK Government data site, which went public on 21 January 2010"]<a
href="http://data.gov.uk/"><img class="size-full wp-image-63  "
title="UK Government Data"
src="http://linkeddata.jiscpress.org/files/2010/02/datagov.png" alt=""
width="416" height="325" /></a>[/caption]

In contrast to the United States’ data.gov site, which simply provides
access to raw data (Excel spreadsheets, PDF files, and more), the UK is
adhering closely to Berners-Lee’s Linked Data rules and making data
available in formats such as RDF where feasible.
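
For developers, the RDF-oriented datasets are typically exposed through SPARQL endpoints that can be queried over plain HTTP. In the sketch below the endpoint URL is a placeholder rather than a documented address; individual datasets publish their own endpoints.

```python
# Sketch: run a simple SPARQL query against a (placeholder) data.gov.uk
# endpoint using the standard SPARQL protocol over HTTP.
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://services.data.gov.uk/example/sparql"  # hypothetical endpoint
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"    # any ten triples

url = ENDPOINT + "?" + urllib.parse.urlencode({"query": QUERY})
req = urllib.request.Request(url, headers={"Accept": "application/sparql-results+json"})
with urllib.request.urlopen(req) as resp:
    results = json.load(resp)

for row in results["results"]["bindings"]:
    print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```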

-------------------
[1] http://blogs.talis.com/nodalities/2009/01/building-coherence-at-bbccouk.php
[2] http://www.bbc.co.uk/programmes/developers
[3] http://www.bbc.co.uk/music/developers
[4] http://derivadow.com/2009/07/28/opening-up-the-bbcs-natural-history-archive/
[5] http://musicbrainz.org/
[6] http://fanhu.bz/
[7] http://open.blogs.nytimes.com/2009/06/26/nyt-to-release-thesaurus-and-enter-linked-data-cloud/
[8] http://open.blogs.nytimes.com/2009/10/29/first-5000-tags-released-to-the-linked-data-cloud/
[9] http://data.nytimes.com/N66220017142656459133.html
[10] http://data.nytimes.com/N66220017142656459133.rdf
[11] http://opencalais.com/
[12] http://d.opencalais.com/er/company/ralg-tr1r/9e3f6c34-aa6b-3a3b-b221-a07aa7933633.html
[13] http://www.ukoln.ac.uk/about/
[14] http://www.freebase.com/
[15] http://web2innovations.com/money/2008/01/18/massive-second-round-of-funding-for-freebase-42-million/
[16] http://en.wikipedia.org/wiki/Semantic_Web_Stack
[17] http://blog.freebase.com/2008/10/30/introducing_the_rdf_service/
[18] http://rdf.freebase.com/
[19] http://www.guardian.co.uk/technology/2009/jun/10/berners-lee-downing-street-web-open

Original issue reported on code.google.com by jossw...@gmail.com on 14 Feb 2011 at 9:07

GoogleCodeExporter commented 8 years ago
The current solution gives line numbers of errors to properly privileged users.

Original comment by eddie.tejeda on 22 Feb 2011 at 7:37

GoogleCodeExporter commented 8 years ago
Eddie, I've just updated to r221 and something isn't right. 

I'm now seeing lots of poorly parsed pages, complaining of the exact same issue:

Fatal Error: Entity 'rdquo' not defined Line: 10

Previously, these pages parsed fine. With the example above, when I go to line 10, there's nothing there to fix. &rdquo; refers to a right double quotation mark, but there are no quote marks to even look at.
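
A strict XML parser only predefines five entities, so an HTML-only name such as &rdquo; (often introduced by WordPress's smart-quote filter rather than typed by hand) triggers exactly this error. A minimal Python sketch of the mechanism and a common workaround, purely illustrative since digress.it itself is PHP/WordPress:

```python
# Illustration: an XML parser rejects HTML named entities like &rdquo;,
# but resolving them to real characters first lets the fragment parse.
import html
import xml.etree.ElementTree as ET

snippet = "<p>He said &ldquo;hello&rdquo; and left.</p>"

try:
    ET.fromstring(snippet)          # raises: undefined entity &ldquo;
except ET.ParseError as err:
    print("parse error:", err)

clean = html.unescape(snippet)      # named entities become real quote marks
print(ET.fromstring(clean).text)    # parses fine
```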

Original comment by jossw...@gmail.com on 23 Feb 2011 at 5:53

GoogleCodeExporter commented 8 years ago
ooops! try r223!

Original comment by eddie.tejeda on 23 Feb 2011 at 5:55

GoogleCodeExporter commented 8 years ago
Thanks. That's fixed it ;-)

Original comment by jossw...@gmail.com on 23 Feb 2011 at 6:02