HTTP-APIs / hydra-python-agent

Python Hydra smart client and test console
MIT License

Considering implementing (TPF) fragments #3

Open Mec-iS opened 6 years ago

Mec-iS commented 6 years ago

https://github.com/HTTP-APIs/hydrus/issues/175

(Fragments are not supported by the official Hydra spec)

NOTE: resolving this issue would be off-specs but actually useful for inner working of the client

ashwani99 commented 6 years ago

@Mec-iS @xadahiya @chirag-jn Can you give me some pointers on which features of Linked Data Fragments need to be added to the client? An implementation of Triple Pattern Fragments already exists (https://github.com/HTTP-APIs/python-hydra-agent/blob/master/hydra/tpf.py). Do we need to support other types of Linked Data Fragments, such as data dumps (for local queries) or subject pages, given that TPF has the best balance between efficient client queries and server cost?

Mec-iS commented 6 years ago

I don't know yet whether the client should be allowed to query by fragment, like http://localhost/api/someClass/<id>#someProperty, and translate that into a server-supported query like http://localhost/api/someClass/<id>?prop=someProperty. But it is a possibility.

For now we just implement the basic functionality: the client needs to subset the dataset, so it defines a procedure to query the server for that subset. See hydra-py for an implementation.
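As a rough sketch of the fragment-to-query translation mentioned above (not code from hydra-python-agent or hydrus): the function name `fragment_to_query` is hypothetical, and the query parameter name `prop` is taken from the example URL rather than from any agreed protocol.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def fragment_to_query(iri: str, param: str = "prop") -> str:
    """Rewrite `<object>#<property>` as `<object>?prop=<property>`.

    Hypothetical helper: `param` defaults to "prop" only because the
    example URL in this thread uses it.
    """
    parts = urlsplit(iri)
    if not parts.fragment:
        # Nothing to translate, return the IRI unchanged.
        return iri
    # Keep any query parameters that were already on the IRI.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, parts.fragment))
    # Rebuild the URL with the new query and an empty fragment.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), "")
    )


if __name__ == "__main__":
    print(fragment_to_query("http://localhost/api/someClass/42#someProperty"))
```

The fragment part of a URL is never sent to the server by an HTTP client, which is why some translation of this kind would be needed on the client side.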

ashwani99 commented 6 years ago

@Mec-iS Thanks for the reply. So, to query specific fragments, instead of translating the query and sending it to the server, we can just load the data along with the vocab on the client machine and query it there. For example, we can write functions like find_property() and serialize() for the Class class to find the required SupportedProperty object and serialize it.
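A minimal sketch of what such helpers could look like, assuming the class documentation is already loaded as a plain dict with a `supportedProperty` list whose entries carry a `title` key. The dict layout and both function bodies are assumptions for illustration, not the client's actual Class API.

```python
import json
from typing import Optional


def find_property(class_doc: dict, title: str) -> Optional[dict]:
    """Return the supported property with the given title, or None."""
    for supported in class_doc.get("supportedProperty", []):
        if supported.get("title") == title:
            return supported
    return None


def serialize(supported_property: dict) -> str:
    """Serialize one supported property as a JSON string."""
    return json.dumps(supported_property, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical class documentation, loaded locally on the client.
    some_class = {
        "@id": "vocab:someClass",
        "@type": "hydra:Class",
        "supportedProperty": [
            {"title": "someProperty", "property": "vocab:someProperty"},
            {"title": "otherProperty", "property": "vocab:otherProperty"},
        ],
    }
    found = find_property(some_class, "someProperty")
    if found is not None:
        print(serialize(found))
```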

However, loading the whole dataset into memory just for querying would be highly inefficient (the current implementation loads each graph node lazily). One solution is to expand the JSON-LD with a processor like pyld, use regular expressions to find the required IRI, and then dynamically load it into memory to answer user queries and interact with the object. That way the client stays light on memory and can still find the required information quickly using regex.

Can you please tell me whether my approach is correct?

Mec-iS commented 6 years ago

[EDITED]

Take hydra-py as a reference, but also consider that the HYDRA Draft has evolved considerably since then. hydra-py is great because it gives the basics of how RDF works.

Don't confuse Fragments with Triple Pattern Fragments (TPF). This issue is only about Fragments (the possibility of sub-setting an object by addressing one of its properties). We want to find out whether it is useful and how to implement Fragments in hydrus. We have an entire task (the Querying task) about TPF.

The architecture for this Fragments feature is supposed to work like:

  1. The client represents a subset of the dataset as a fragment, http://localhost/api/someClass/<id>#someProperty (it may receive a request like this from another machine-client or from a JS client)
  2. The client passes the query to the server in a server-understandable format, http://localhost/api/someClass/<id>?prop=someProperty
  3. The server parses the query into its internal querying language (the Querying task we are going to work on this summer)
  4. The server responds with a JSON-LD document that includes the @id and @type of the object, plus someProperty with all its linked metadata and data.

Steps 3-4 are meant to be implemented by https://github.com/HTTP-APIs/hydrus/issues/174
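To make the server side of the list above (steps 3 and 4) more concrete, here is a small sketch that reads the `prop` parameter from the request URL and returns only `@id`, `@type` and the requested property. It works on an in-memory dict and skips the internal querying language entirely; `subset_object` is a hypothetical name, not a hydrus function.

```python
from urllib.parse import parse_qs, urlsplit


def subset_object(obj: dict, request_url: str, param: str = "prop") -> dict:
    """Return the JSON-LD keywords of `obj` plus the requested properties.

    Hypothetical stand-in for steps 3-4: a real server would translate
    the query into its own querying language instead of slicing a dict.
    """
    # A parameter may be repeated, so parse_qs gives a list of values.
    wanted = parse_qs(urlsplit(request_url).query).get(param, [])
    # Always keep the identifying keywords when they are present.
    subset = {k: obj[k] for k in ("@context", "@id", "@type") if k in obj}
    for name in wanted:
        if name in obj:
            subset[name] = obj[name]
    return subset


if __name__ == "__main__":
    stored = {
        "@id": "/api/someClass/42",
        "@type": "someClass",
        "someProperty": {"@id": "/api/otherClass/7", "@type": "otherClass"},
        "otherProperty": "not requested",
    }
    url = "http://localhost/api/someClass/42?prop=someProperty"
    print(subset_object(stored, url))
```

Properties that were not requested are left out of the response, which is the sub-setting behaviour this issue is about.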

Always remember that in hydrus the client and server work on the same machine to respond to requests from other clients and issue requests to other servers. It is always a network of "clients-servers" that ask each other to retrieve the right data, so all these interactions need protocols at different levels to work.

See this link for more definitions.

py-ranoid commented 6 years ago

Would this involve adding an IRI Template to the API (Class) Documentation? And hence additions to doc_writer.py?

Mec-iS commented 6 years ago

~This is closed as not relevant. Nice discussion though.~

We are back to considering TPF for GSoC 2019.