Claudenw / PA4RDF

functionality on top of an RDF store while accounting for and exploiting the fundamental differences between graph storage and relational storage. PA4RDF introduces three (3) annotations that map an RDF triple (subject, predicate, object) to a Plain Old Java Object (POJO) using Java's dynamic proxy capabilities.

Question: how to convert a RDF TTL to a POJO? #4

Open BoraBak opened 3 years ago

BoraBak commented 3 years ago

I have a file in TTL format that I want to convert/unmarshal into its relevant POJOs. Does this capability exist?

Claudenw commented 3 years ago

That is a rather broad question. However, I will make some assumptions.

  1. In your TTL there are subjects with multiple properties.
  2. You can identify subjects in the TTL that you consider an instance of an object.

The way this library works is you read the TTL into a Jena Model. You can consider this a graph with nodes connected by arcs. It should not be confused with the Jena Graph object. I will refer to the deserialised TTL file as a graph, but remember that it is a Jena Model. In the graph there are nodes that correspond to the subject nodes in your TTL. In fact they have the same URI.

You create a Java interface with methods to get/set the properties on the subject nodes. You annotate the interface with the @Subject annotation. You annotate the setters with the @Predicate annotation.

Setters come in 2 types: setX() and addX(). The setX() form indicates that there should only be one value for X associated with the subject. The addX() form indicates that there can be more than one value. If you use addX() you probably want to implement removeX() to delete values as well.
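
To make the setX()/addX() distinction concrete, here is a minimal sketch using only java.lang.reflect.Proxy. It is not PA4RDF's implementation: the Person interface, the ProxySketch class, and the Map standing in for the graph are all hypothetical, and no PA4RDF annotations are involved. It only illustrates how a dynamic proxy can route interface calls to property values stored per subject.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical interface for illustration; not part of PA4RDF.
interface Person {
    void setName(String name);        // single-valued: replaces any existing value
    String getName();
    void addNickname(String nick);    // multi-valued: appends another value
    void removeNickname(String nick); // deletes one value
    List<String> getNickname();
}

class ProxySketch {
    // The map stands in for the graph: property name -> values on one subject.
    static Person wrap(Map<String, List<String>> graph) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (name.startsWith("set")) {
                List<String> single = new ArrayList<>();
                single.add((String) args[0]);
                graph.put(name.substring(3), single); // overwrite: at most one value
                return null;
            }
            if (name.startsWith("add")) {
                graph.computeIfAbsent(name.substring(3), k -> new ArrayList<>())
                     .add((String) args[0]);          // keep existing values
                return null;
            }
            if (name.startsWith("remove")) {
                List<String> values = graph.get(name.substring(6));
                if (values != null) {
                    values.remove(args[0]);
                }
                return null;
            }
            if (name.startsWith("get")) {
                List<String> values = graph.getOrDefault(name.substring(3), new ArrayList<>());
                if (method.getReturnType() == List.class) {
                    return new ArrayList<>(values);   // multi-valued getter
                }
                return values.isEmpty() ? null : values.get(0);
            }
            throw new UnsupportedOperationException(name);
        };
        return (Person) Proxy.newProxyInstance(
                Person.class.getClassLoader(), new Class<?>[] { Person.class }, handler);
    }
}
```

Calling setName twice leaves a single value, while addNickname accumulates values until removeNickname deletes one. All reads and writes go straight to the backing map, which mirrors the idea that the proxy object holds no state of its own.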

Let's call your interface MyInterface. Once you have completed MyInterface you do the following.

  1. Read the TTL into the Jena Model (graph).
  2. Create a PA4RDF EntityManager (EntityManagerFactory.getEntityManager())
  3. Locate a subject resource in the graph by creating that resource in the model. Since the model already contains the resource, it will return an instance of the resource (Resource r = model.createResource( "http://example.com/my/resource/uri" )).
  4. Tell the EntityManager to construct a MyInterface object from the resource (MyInterface myThing = EntityManager.read( r, MyInterface.class )).
  5. Call the functions on the myThing object like you would any instance of MyInterface.
    • All reads read from the graph (Jena Model).
    • All writes update or create new data in the graph.
  6. When you want to save the data, write the graph back out to TTL or another graph store.

It has been a while since I used this code. I did start to update it for the new Jena releases, but did not complete it. It should be fairly easy to finish. I will work on it if you want to use the library.

BoraBak commented 3 years ago

First of all, thanks for the detailed reply. I'll need to read it several more times to make sure I understood your explanation correctly. Second, it would be really great if you did that (work on it). I've been searching the web for a solution to my use case, but nothing offers a full end-to-end solution, i.e. input TTL file -> output POJOs.

BoraBak commented 3 years ago

I'm attaching a small example of a TTL file. In reality, it's much larger.

#@prefix ns3:    <http://xmlns.com/foaf/0.1/> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix ns0:    <http://example.org/prop#> .

ns0:Host_48d491d1-7998-59d6-b3ed-07a0868d7536_531
    a                                 ns0:Host ;
    ns0:Exploitable                   "XXE", "PEC";
    ns0:Account                       ns0:Account_0abb0c18-9fb7-57f2-8bcd-2cb07ffbe865_161 ;
    ns0:HostSignificance              "3" ;
    ns0:ID                            "48d491d1-7998-59d6-b3ed-07a0868d7536_531" .

ns0:Account_0abb0c18-9fb7-57f2-8bcd-2cb07ffbe865_161
    a              ns0:Account ;
    ns0:GroupName  "Administrators" ;
    ns0:ID         "0abb0c18-9fb7-57f2-8bcd-2cb07ffbe865_161" ;
    ns0:Privileges "SeRemoteInteractiveLogonRight" ;
    ns0:UserId     "161" ;
    ns0:Username   "administrator" .

ns0:Host
    a owl:ObjectProperty, owl:Class .

ns0:Account
    a owl:ObjectProperty, owl:DatatypeProperty, owl:Class .

ns0:Exploitable
    a owl:DatatypeProperty .

ns0:GroupName
    a owl:DatatypeProperty .

ns0:HostSignificance
    a owl:DatatypeProperty .

ns0:ID
    a owl:DatatypeProperty .

ns0:IP
    a owl:DatatypeProperty .

ns0:LocalAccountTokenFilterPolicy
    a owl:DatatypeProperty .

ns0:LsaRunAsPPL
    a owl:DatatypeProperty .

ns0:Privileges
    a owl:DatatypeProperty .

ns0:UserId
    a owl:DatatypeProperty .

ns0:Username
    a owl:DatatypeProperty .

Claudenw commented 3 years ago

You need to create an interface something like

@Subject( namespace="http://example.org/prop#" )
interface Account {

  @Predicate
  void setGroupName( String name );
  String getGroupName();

  @Predicate
  void setID( String ID );
  String getID();

  @Predicate
  // and so on.  Note that your UserId is a string and not an integer.  You have saved it in the TTL that way.

}

And then you load the model and execute

Resource r = model.createResource( "ns0:Account_0abb0c18-9fb7-57f2-8bcd-2cb07ffbe865_161" );
EntityManager manager = EntityManagerFactory.getEntityManager();
Account account = manager.read( r, Account.class );
// then call the Account methods on the account object

BoraBak commented 3 years ago

Hi @Claudenw, thanks again for the reply - much appreciated!

I'm not sure why, but all the fields of the (Host & Account) POJOs seem to be null. Here's the code I wrote for the aforementioned .ttl file:

Pa4rdf.java

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.Resource;
import org.apache.jena.riot.RDFDataMgr;
import org.xenei.jena.entities.EntityManager;
import org.xenei.jena.entities.EntityManagerFactory;
import org.xenei.jena.entities.MissingAnnotation;

public class Pa4rdf {
    public static void main(String[] args) throws MissingAnnotation {
        foo();
    }

    static void foo() throws MissingAnnotation {
        Model model = RDFDataMgr.loadModel("src/main/java/hosts-test.ttl");
        EntityManager manager = EntityManagerFactory.getEntityManager();
        Resource r = model.createResource("ns0:Account_0abb0c18-9fb7-57f2-8bcd-2cb07ffbe865_161");

//        Host host = manager.read( r, Host.class );
        Account account = manager.read(r, Account.class);

        System.out.println(account.getID());
    }
}

Host.java

import org.xenei.jena.entities.annotations.Predicate;
import org.xenei.jena.entities.annotations.Subject;

@Subject(namespace = "http://example.org/prop#")
public interface Host {

    @Predicate
    void setID(String ID);

    String getID();

    @Predicate
    void setExploitable(String name);

    String getExploitable();

    @Predicate
    void setHostSignificance(String hostSignificance);

    String getHostSignificance();

//    @Predicate
//    void setAccount(String account);
//
//    String getAccount();
}

Account.java

import org.xenei.jena.entities.annotations.Predicate;
import org.xenei.jena.entities.annotations.Subject;

@Subject(namespace = "http://example.org/prop#")
public interface Account {
    @Predicate
    void setID(String ID);

    String getID();

    @Predicate
    void setGroupName(String name);

    String getGroupName();

    @Predicate
    void setPrivileges(String privileges);

    String getPrivileges();

    @Predicate
    void setUserId(String userId);

    String getUserId();

    @Predicate
    void setUsername(String username);

    String getUsername();
}

Claudenw commented 3 years ago

The problem may be

Resource r = model.createResource("ns0:Account_0abb0c18-9fb7-57f2-8bcd-2cb07ffbe865_161");

Change it to

Resource r = model.createResource("http://example.org/prop#Account_0abb0c18-9fb7-57f2-8bcd-2cb07ffbe865_161");

If that doesn't work, you should probably set a property and then print out the graph. You should see the property attached to a resource. This should point you to a solution where you can find the property in the graph.
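
The difference between the two strings above is that "ns0:..." is a prefixed name that only has meaning inside the TTL file, while the resource in the graph is identified by the full URI. As a rough illustration of that expansion, here is a minimal sketch in plain Java; the PrefixSketch class and its expand method are hypothetical helpers, not part of Jena or PA4RDF. Jena's PrefixMapping interface, which Model extends, offers expandPrefix for the same purpose, though whether the prefixes from your TTL are registered on the loaded model is something to verify.

```java
import java.util.Map;

// Hypothetical helper for illustration only.
class PrefixSketch {
    // Expands "prefix:local" to a full URI using the given prefix map.
    // Returns the input unchanged when there is no colon or the prefix is unknown,
    // so a string that is already a full URI passes through untouched.
    static String expand(Map<String, String> prefixes, String name) {
        int colon = name.indexOf(':');
        if (colon < 0) {
            return name;
        }
        String namespace = prefixes.get(name.substring(0, colon));
        return namespace == null ? name : namespace + name.substring(colon + 1);
    }
}
```

With "ns0" mapped to "http://example.org/prop#", expanding "ns0:Account_0abb0c18-9fb7-57f2-8bcd-2cb07ffbe865_161" gives the full URI used in the corrected createResource call.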

BoraBak commented 3 years ago

Hi @Claudenw, I printed the graph and extracted the property (debugged it), yet the POJO is empty every time. I tried lots of variations, each with a different resource string value, but nothing worked :\ WDYT?

BoraBak commented 3 years ago

BTW, I just realized that my TTL files won't contain the metadata for the objects and their properties, i.e. the following information will be missing:

ns0:Host
    a owl:ObjectProperty, owl:Class .

ns0:Account
    a owl:ObjectProperty, owl:DatatypeProperty, owl:Class .

ns0:Exploitable
    a owl:DatatypeProperty .

ns0:GroupName
    a owl:DatatypeProperty .

ns0:HostSignificance
    a owl:DatatypeProperty .

ns0:ID
    a owl:DatatypeProperty .

ns0:IP
    a owl:DatatypeProperty .

ns0:LocalAccountTokenFilterPolicy
    a owl:DatatypeProperty .

ns0:Privileges
    a owl:DatatypeProperty .

ns0:UserId
    a owl:DatatypeProperty .

ns0:Username
    a owl:DatatypeProperty .

Claudenw commented 3 years ago

What version of the library are you using?

Claudenw commented 3 years ago

You need to set the upcase attribute on your Predicate annotations to true. This is because your property names start with an upper-case letter (e.g. ns0:GroupName, not ns0:groupName).

so

@Predicate( upcase=true ) instead of @Predicate
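
As an illustration of the naming rule, here is a minimal sketch of how a predicate URI could be derived from an accessor name with and without upcase. This is an assumption about the behaviour described above, not PA4RDF's actual code; the PredicateNameSketch class and predicateUri method are hypothetical.

```java
// Hypothetical helper for illustration only.
class PredicateNameSketch {
    // Derives a predicate URI from an accessor name such as "setGroupName".
    static String predicateUri(String namespace, String methodName, boolean upcase) {
        // Drop the accessor prefix when it is followed by an upper-case letter.
        String local = methodName.replaceFirst("^(set|get|add|remove)(?=[A-Z])", "");
        if (local.isEmpty()) {
            throw new IllegalArgumentException("empty method name");
        }
        // upcase=true keeps an upper-case first letter; otherwise it is lower-cased.
        char first = upcase
                ? Character.toUpperCase(local.charAt(0))
                : Character.toLowerCase(local.charAt(0));
        return namespace + first + local.substring(1);
    }
}
```

Under this assumption, setGroupName without upcase maps to http://example.org/prop#groupName, which does not exist in the TTL above, while upcase=true yields http://example.org/prop#GroupName, which does. That would explain why every getter returned null before the change.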

BoraBak commented 3 years ago

Great, this helped! Not sure how I could have known this by myself :| Which brings me to another question, which is kind of related to your solution. In case I have an array/list, how can I get all its values?

  1. For a list of literals (strings), I used type= RDFList.class. In the below example it's ns0:Exploitable
  2. For a list of objects, I also used type= RDFList.class. In the below example it's ns0:Account.

For example:

    ns0:Host_48d491d1-7998-59d6-b3ed-07a0868d7536_531
        a                                 ns0:Host ;
        ns0:Exploitable                   "XXE", "PEC";
        ns0:Account                       ns0:Account_161, ns0:Account_162 ;
        ...

    @Subject(namespace = "http://example.org/prop#")
    public interface Host {

        @Predicate(upcase=true, type= RDFList.class)
        void setExploitable(List<String> name);
        List<String> getExploitable();

        @Predicate(upcase=true, type= RDFList.class)
        void setAccount(List<Account> account);
        List<Account> getAccount();
        ...

The problem is, inside the list, when I get an element from there (in my case it's Account), it doesn't recognize its methods: host.getAccount().get(0).getID() -outputs-> No such instance method: 'getID'.

When I Evaluate it during Debugging, I can see 2 things:

  1. List<Account> - evaluating the following command host.getAccount().get(1) outputs http://example.org/prop#Account_162. To be more precise, it's a list of ResourceImpl, which I'm not sure how to handle.
  2. Account - evaluating the following command host.getAccount() outputs class org.xenei.jena.entities.impl.SubjectInfoImpl[http://example.org/prop#Account_162]

BoraBak commented 3 years ago

Hi @Claudenw, any thoughts regarding the list of objects?

Claudenw commented 3 years ago

I have not had time to look, but I think that the docs say that it does not support lists. However, I think I implemented it on one of the branches a while back.

BoraBak commented 3 years ago

OK, I understand. Well, if you have time to look for it in your branches, that would be great.

Claudenw commented 3 years ago

The simplest way to retrieve the list in the current implementation is to have the method return the Jena Resource and then call r.as( RDFList.class ) to get an instance of the RDFList object. From that you can get the Java List (RDFList.asJavaList()).