walidazizi / rdflib

Automatically exported from code.google.com/p/rdflib

Add a resource oriented access to data #166

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
The attached module implements a simple Description interface to a graph. From 
the docstring:

----
It contains methods directly corresponding to those methods of the Graph 
interface that relate to reading and writing data. The difference is that a 
Description also binds a current subject, making it possible to work without 
tracking both the graph and a current subject. This makes Description "resource 
oriented", as compared to the triple orientation of the Graph API.

Resulting generators are also wrapped so that any resource reference values are 
in turn wrapped in Descriptions.
----
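
To make the quoted docstring concrete, here is a minimal, self-contained sketch of the idea. It deliberately does not use RDFLib: the graph is a plain set of (subject, predicate, object) tuples, and the names `Ref`, `Description`, `add`, `objects` and `value` are assumptions made for this illustration, not the attached module's actual code.

```python
class Ref(str):
    """Hypothetical stand-in for a resource reference (URI or blank node)."""


class Description:
    """Binds a graph (here: a set of triples) to one current subject."""

    def __init__(self, graph, subject):
        self.graph = graph
        self.subject = subject

    def _wrap(self, node):
        # Resource references become Descriptions; literals pass through.
        return Description(self.graph, node) if isinstance(node, Ref) else node

    def add(self, predicate, obj):
        self.graph.add((self.subject, predicate, obj))

    def objects(self, predicate):
        # A generator, mirroring the generator-returning style of the Graph API.
        for s, p, o in self.graph:
            if s == self.subject and p == predicate:
                yield self._wrap(o)

    def value(self, predicate, default=None):
        # First matching object, or the default when there is none.
        return next(self.objects(predicate), default)


graph = set()
alice = Description(graph, Ref("urn:example:alice"))
alice.add("name", "Alice")
alice.add("knows", Ref("urn:example:bob"))
Description(graph, Ref("urn:example:bob")).add("name", "Bob")

friend = alice.value("knows")       # a Description bound to bob
friend_name = friend.value("name")  # no need to pass the graph or subject again
```

The point of the sketch is the last two lines: once a subject is bound, navigation from one resource to the next needs neither the graph nor the subject to be passed around explicitly.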

The module docstring also exemplifies (i.e. tests) usage of all the methods. 
(The documentation is almost twice the lines of the implementation, and both 
should be easy to understand.)

I propose to add this to core RDFLib, since it's quite common to work in this 
way. For instance the popular (Java-based) Jena API has a Resource class for 
working in a similar manner. Current W3C work on a standard RDF API (taking 
place in the RDFa working group) also moves in this direction (with its 
"Projection" interface).

While there are other implementations (Sparta, Oort, RDFAlchemy) supporting 
various ways to work in an even more object-oriented manner ("O/R mapper" 
style), this implementation is very thin and (IMHO) adheres to the general 
RDFLib design (by honoring the Graph API).

If this is accepted, I will add this implementation to the repository, and do 
any updates/refactorings you deem necessary.

Original issue reported on code.google.com by lindstr...@gmail.com on 5 Apr 2011 at 9:59

GoogleCodeExporter commented 8 years ago
Looks good to me! 

Why not add a "def description(self, resource)" method to Graph to create one 
easily? 

And how about some unittests? 

Why not create a (named) branch!? :) 

Original comment by gromgull on 5 Apr 2011 at 12:58

GoogleCodeExporter commented 8 years ago
Great!

I was actually thinking exactly that; adding a description method to Graph 
would be neat. That would be a bit more intrusive (not in terms of code, but 
since it couples the Graph interface with this new feature), so I think strong 
consensus is mandated for it.

I think that the doctest in the description module covers the needed testing 
of it? Some unittests make sense if we continue to add the Graph.description 
method as well though.

A named branch might be good, if we have a flow for merging it in before the 
next(?) release. At least if we want to work a lot with branches for features? 
I suggest looking at e.g. "hg flow" and consider if we should adopt that 
workflow... Let's discuss that, and branch policy in general, on the list.

As a minimal start, I could just add rdflib.description though 
(self-contained with a doctest as it is). A feature branch makes a lot of 
sense if we subsequently want to tie it to the Graph as well of course. What 
would you prefer?

Original comment by lindstr...@gmail.com on 5 Apr 2011 at 2:31

GoogleCodeExporter commented 8 years ago
This looks very good, but perhaps some closer attention should be paid to the 
behaviour of blank nodes? When I've done similar things, I use a blank-node closure 
(bounded description) that returns a sub-graph that "has to do with" the 
subject I'm interested in.
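
As a rough illustration of the blank-node closure mentioned above, the following sketch collects every triple whose subject is the starting node and then keeps following objects that are blank nodes. It is a simplified variant under stated assumptions: triples are plain tuples, and `BNode` and `bounded_description` are hypothetical names for this example, not code from ordf or RDFLib.

```python
class BNode(str):
    """Hypothetical marker type for blank nodes in this sketch."""


def bounded_description(triples, subject):
    """Return the triples about `subject`, expanding blank-node objects."""
    result = set()
    seen = set()
    todo = [subject]
    while todo:
        node = todo.pop()
        if node in seen:
            continue  # guards against cycles between blank nodes
        seen.add(node)
        for triple in triples:
            s, p, o = triple
            if s == node:
                result.add(triple)
                # Only blank nodes are expanded; named resources stay references.
                if isinstance(o, BNode):
                    todo.append(o)
    return result


cv = BNode("cv")
job = BNode("job")
triples = {
    ("urn:example:alice", "hasCV", cv),
    (cv, "workHistory", job),
    (job, "employer", "urn:example:acme"),
    ("urn:example:acme", "name", "ACME"),
}
subgraph = bounded_description(triples, "urn:example:alice")
```

Here `subgraph` holds the three triples reachable through the blank nodes, while the statement about the named resource `urn:example:acme` is left out.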

For more high-level things, I use a variant of InfixOWL together with some 
special "predicate" datatypes that give a more ORM-like feel (see 
https://bitbucket.org/okfn/ordf/src/tip/ordf/vocab/owl.py and 
http://packages.python.org/ordf/odm.html). I think that your implementation is 
cleaner, mostly because it gives an objectish feel without also having OWL 
algebra and serialisations mixed in. Once it is better understood I might 
deprecate owl.py in favour of this.

Returning generators has disoriented some of our developers. I think 
generators are good and there is definitely a very good reason for having 
them there, but sometimes people are more comfortable with lists. For this 
purpose I've just made a lazy-list implementation that hopefully will make 
everybody happy... or please nobody... or something 
(https://bitbucket.org/okfn/ordf/src/tip/ordf/llist.py). It may be useful to wrap 
results in this construct for ease of use.
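
For readers unfamiliar with the idea, a lazy list can be sketched as a wrapper that pulls items from an iterator only when they are asked for and caches them for later list-style access. The class below is an illustrative assumption (the name `LazyList` and its behaviour are made up here), not the contents of the linked llist.py.

```python
class LazyList:
    """List-like view over an iterator; items are pulled and cached on demand."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._cache = []

    def _fill(self, count=None):
        # Pull items until `count` are cached, or until exhaustion if count is None.
        while count is None or len(self._cache) < count:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                break

    def __getitem__(self, index):
        if isinstance(index, slice) or index < 0:
            self._fill()  # slices and negative indices need every item
        else:
            self._fill(index + 1)
        return self._cache[index]

    def __len__(self):
        self._fill()
        return len(self._cache)

    def __iter__(self):
        i = 0
        while True:
            self._fill(i + 1)
            if i >= len(self._cache):
                return
            yield self._cache[i]
            i += 1


pulled = []


def numbers():
    for n in range(5):
        pulled.append(n)  # records how far the generator has been consumed
        yield n


items = LazyList(numbers())
first = items[0]                 # pulls only one item from the generator
pulled_after_first = list(pulled)
total = len(items)               # forces the remaining items
```

The trade-off is memory: unlike a bare generator, everything that has been consumed stays cached, which is what makes repeated iteration and indexing possible.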

Also, the behaviour of label in the presence of language tags is not something 
I'm terribly familiar with; I need to look more closely at this...

Original comment by wwai...@gmail.com on 5 Apr 2011 at 2:48

GoogleCodeExporter commented 8 years ago
Sounds good. But is there something missing for blank nodes in Description? The 
example uses two (the CV and workHistory) and all works as expected as far as I 
see it.

Description is definitely intended to be simple and very much in line with how 
RDFLib (Graphs) already works. This is because anything beyond that (e.g. ORM 
stuff) leads to lots of different design approaches. I hope it will make 
working with RDF in Python both a bit more pythonic and still retain the 
RDF-centric model which RDFLib provides.

(And I expect both myself and others to continue exploring designs beyond this. 
Things like projections and profiles are very interesting for instance. And OWL 
algebra of course. But this is beyond RDFLib; at least until e.g. the W3C RDF 
API takes a more final shape.)

I can understand how generators can be confusing at first, but newish Python, 
especially Python 3, relies heavily on them, so I expect awareness will grow. 
And since RDFLib Graph already uses them, Description should follow suit (as it 
does). (Not that a lazy list might not come in handy; I just think that it's 
beyond the scope of this feature.)

Original comment by lindstr...@gmail.com on 5 Apr 2011 at 4:25

GoogleCodeExporter commented 8 years ago
Why do you call it "Description"?

I think surfrdf (http://code.google.com/p/surfrdf/) does something similar with 
its Resource class 
(http://code.google.com/p/surfrdf/source/browse/trunk/surf/resource/__init__.py).

Simple and good contribution though!

Original comment by danielmr...@gmail.com on 11 Apr 2011 at 3:43

GoogleCodeExporter commented 8 years ago
That's a good question. I went for Description since I've seen "Resource" used 
for various things in other RDF APIs. In e.g. Sesame, it is (perhaps oddly) the 
base class of BNode and URI (i.e. a reference, not the resource referenced).

But it might be more pedagogic to call it Resource (i.e. 
rdflib.resource.Resource). What do others think?

Original comment by lindstr...@gmail.com on 16 Apr 2011 at 12:19

GoogleCodeExporter commented 8 years ago
By the way: this code now lives in the "feature/resource-oriented-api" branch 
(since last weekend), for everyone to try out. I'd like to merge it as soon as 
we're all comfortable with it.

I'll also add a 'description' method to Graph for creating Descriptions if 
no one objects (or a 'resource' method creating Resources, if we go with that 
name instead).

Original comment by lindstr...@gmail.com on 16 Apr 2011 at 12:19

GoogleCodeExporter commented 8 years ago
I have used this for a while now, and it feels solid to work with. I just 
pushed some improvements:

* The module and class are now rdflib.resource.Resource (as danielmr suggested). 
This makes sense and made the code feel more natural.
* The uriref or bnode of a Resource is now called "identifier" (was "subject"), 
just like in Graph.
* Added a Graph.resource utility method (suggested by gromgull).
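
The shape of these changes can be sketched without RDFLib itself. In the toy code below, `TinyGraph` is a hypothetical stand-in for a graph (not RDFLib's Graph), and the `Resource` class only mirrors the naming described in the list above: the bound node is exposed as `identifier`, and the graph offers a `resource` factory method.

```python
class Resource:
    """Pairs a graph with one node, exposed as `identifier`."""

    def __init__(self, graph, identifier):
        self.graph = graph
        self.identifier = identifier

    def add(self, predicate, obj):
        self.graph.add((self.identifier, predicate, obj))

    def value(self, predicate, default=None):
        for s, p, o in self.graph.triples:
            if s == self.identifier and p == predicate:
                return o
        return default


class TinyGraph:
    """Hypothetical stand-in for a graph; not RDFLib's Graph."""

    def __init__(self):
        self.triples = set()

    def add(self, triple):
        self.triples.add(triple)

    def resource(self, identifier):
        # Convenience factory: callers need not construct Resource themselves.
        return Resource(self, identifier)


g = TinyGraph()
doc = g.resource("urn:example:doc")
doc.add("title", "Hello")
title = doc.value("title")
```

The factory keeps the coupling one-directional in use: code that already holds a graph can obtain a resource-oriented view of any node with a single call.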

I believe this feature is ready for merging.

Original comment by lindstr...@gmail.com on 1 May 2011 at 2:16

GoogleCodeExporter commented 8 years ago
There is still "_descriptions" in there. I'd be interested to hear what kind of 
application you use it for.

Original comment by danielmr...@gmail.com on 17 May 2011 at 10:35

GoogleCodeExporter commented 8 years ago
Long overdue update: I fixed the '_descriptions' remnant some months ago, and just 
pushed an addition to the docstring about how to subclass Resource.

An example of how I've used this in practice is at 
<http://python.court.googlecode.com/hg/court/data/coin.py>.

I'll happily merge this unless anyone objects.

Original comment by lindstr...@gmail.com on 16 Sep 2011 at 10:52

GoogleCodeExporter commented 8 years ago
This issue was closed by revision 3249203b05d3.

Original comment by lindstr...@gmail.com on 19 Oct 2011 at 9:28