cyberborean / rdfbeans

Java persistence with RDF

Bad performance of method create() #36

Closed twallmey closed 5 years ago

twallmey commented 5 years ago

Hi,

I've encountered a serious performance problem within the following code snippet:

Resource beanResource = this.manager.getResource(iri, beanType, subgraphResources); // search for requested bean in subgraph(s)
result = this.manager.create(beanResource, beanType, subgraphResources); // this takes up to two seconds

Can you explain why calling create() for my annotated interface takes so much time? Is there a solution or am I doing something wrong?

Thanks in advance,

Thorben

cyberborean commented 5 years ago

Hi,

It's weird. Running your code snippet 1 million times in a loop takes 1.9 sec with MemoryStore and 2.4 sec with NativeStore (RDF4J 2.4).

I believe other factors in your application code and/or triplestore configuration are involved. Is it possible to isolate the issue in a unit test that reconstructs the context in which it can be reproduced?

twallmey commented 5 years ago

Hi,

Thanks a lot for your quick response. I've broken the issue down further. You are right: creating (proxy) beans within a MemoryStore is not a big issue. Although I'm running everything on a 'normal laptop', creating a bean never takes longer than 5 ms. In fact, the problem seems to arise from setting the bean properties. I have to populate a bean by calling 192 setters, which takes about 5 seconds on average. Is this normal, or could you give me a hint on how to increase performance significantly?
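
One way to separate the cost of create() from the cost of the setters is to time each phase on its own. The following is a minimal sketch, not part of RDFBeans: `PhaseTimer` and its methods are hypothetical names, and it relies only on `System.nanoTime()` from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical helper: records the wall-clock duration of named phases. */
final class PhaseTimer {
    private final List<String> names = new ArrayList<>();
    private final List<Long> nanos = new ArrayList<>();

    /** Runs the step, records how long it took, and returns its result. */
    <T> T time(String name, Callable<T> step) {
        long start = System.nanoTime();
        try {
            return step.call();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            // Checked exceptions are wrapped so callers need no throws clause.
            throw new IllegalStateException("phase '" + name + "' failed", e);
        } finally {
            // The duration is recorded even if the step throws.
            names.add(name);
            nanos.add(System.nanoTime() - start);
        }
    }

    /** Elapsed milliseconds of the first phase with this name, or -1 if absent. */
    long millis(String name) {
        int i = names.indexOf(name);
        return i < 0 ? -1 : nanos.get(i) / 1_000_000;
    }

    /** Number of phases recorded so far. */
    int phaseCount() {
        return names.size();
    }
}
```

In the scenario above, the `create(...)` call would go into one phase, for example `timer.time("create", () -> manager.create(beanResource, beanType, subgraphResources))`, and the 192 setter calls into another, so that `timer.millis("create")` and `timer.millis("setters")` can be compared directly.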

cyberborean commented 5 years ago

I think the poor setter performance is related to this issue:

https://github.com/eclipse/rdf4j/issues/1425 (Improve performance of statement removal in Native Store)

Fortunately, some improvements were reportedly made in RDF4J 2.5.2, so it would make sense to test how it behaves with the updated version.

cyberborean commented 5 years ago

I updated the RDF4J dependency version in the develop branch.

twallmey commented 5 years ago

Just a stupid question: this version is not available via Maven, correct? At least I am not aware of the correct Maven dependency/artifact_id.

So I assume I have to pull the source code of this version, compile it, and then link the resulting jar file into my project manually? In that case I'm going to run into trouble, because I've linked an older version of RDF4J via my pom.xml.

Could you give me a hint on how best to resolve this conflict?

cyberborean commented 5 years ago

2.3-SNAPSHOT has been updated in the Maven repository.

twallmey commented 5 years ago

Even after updating to 2.3-SNAPSHOT, setting all properties of my bean (192 setters) within a NativeStore takes up to 15 seconds. That's a pity. Do you have any further ideas?

Update: the timing above refers to a process that does not use any custom transaction handling. When I start and commit a transaction for each bean to be imported, the runtime drops significantly, to 5 seconds.
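
A plausible reason for this difference is that, without an explicit transaction, each update on an RDF4J connection is committed on its own, so 192 setters can mean 192 separate commits. Below is a minimal sketch of grouping all writes for one bean into a single transaction. `Tx` and `BatchedWrites` are hypothetical names introduced here; `Tx` only stands in for the `begin()`, `commit()` and `rollback()` methods that RDF4J's `RepositoryConnection` provides, so the sketch compiles without RDF4J on the classpath.

```java
/** Hypothetical stand-in for the transaction calls of a repository connection. */
interface Tx {
    void begin();
    void commit();
    void rollback();
}

/** Hypothetical helper that runs a batch of writes in one transaction. */
final class BatchedWrites {
    private BatchedWrites() {}

    /** Begins a transaction, runs the writes, then commits; rolls back on failure. */
    static void inOneTransaction(Tx tx, Runnable writes) {
        tx.begin();
        try {
            writes.run();   // e.g. all setter calls for one bean
            tx.commit();    // one commit for the whole batch
        } catch (RuntimeException | Error e) {
            tx.rollback();  // discard the partial batch, then rethrow
            throw e;
        }
    }
}
```

With a real connection, the `Tx` implementation would simply delegate to the connection's own `begin()`, `commit()` and `rollback()`, and the `Runnable` would contain the setter calls for one bean.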

cyberborean commented 5 years ago

Please check that you have updated your RDF4J NativeStore version as well.

The fix is NativeStore-specific and RDFBeans does not have a dependency on it.

twallmey commented 5 years ago

Unfortunately, using the RDF4J NativeStore permanently is not an option. We are using the free version of graphdb.

Nevertheless, I've now moved our graphdb instance to a dedicated server. This slightly decreased the time for persisting a single bean instance (with its 192 properties) once more. It now takes ~3 seconds on average. That's slower than I would expect, but I'm fine with that for the moment.

If you have any additional hints, I'm looking forward to your response. Otherwise we can close this issue.

cyberborean commented 5 years ago

It makes sense to check which RDF4J version your graphdb instance is built upon. It might be affected by the same NativeStore performance issue fixed in 2.5.2.

twallmey commented 5 years ago

Thanks for your hint. Our graphdb (8.9) seems to be built on top of RDF4J 2.4.6:

./lib/rdf4j-queryresultio-text-2.4.6.jar
./lib/rdf4j-rio-api-2.4.6.jar
./lib/rdf4j-rio-turtle-2.4.6.jar
./lib/rdf4j-repository-contextaware-2.4.6.jar
./lib/rdf4j-repository-manager-2.4.6.jar
./lib/rdf4j-rio-trig-2.4.6.jar
./lib/rdf4j-queryresultio-sparqljson-2.4.6.jar
./lib/rdf4j-util-2.4.6.jar
./lib/rdf4j-sail-api-2.4.6.jar
./lib/rdf4j-queryparser-sparql-2.4.6.jar
./lib/rdf4j-queryrender-2.4.6.jar
./lib/rdf4j-rio-binary-2.4.6.jar
./lib/rdf4j-rio-ntriples-2.4.6.jar
./lib/rdf4j-sail-model-2.4.6.jar
./lib/rdf4j-console-2.4.6.jar
./lib/rdf4j-sail-inferencer-2.4.6.jar
./lib/rdf4j-rio-rdfjson-2.4.6.jar
./lib/rdf4j-queryalgebra-geosparql-2.4.6.jar
./lib/rdf4j-rio-n3-2.4.6.jar
./lib/rdf4j-repository-http-2.4.6.jar
./lib/rdf4j-config-2.4.6.jar
./lib/rdf4j-queryparser-api-2.4.6.jar
./lib/rdf4j-query-2.4.6.jar
./lib/rdf4j-spin-2.4.6.jar
./lib/rdf4j-rio-datatypes-2.4.6.jar
./lib/rdf4j-queryalgebra-model-2.4.6.jar
./lib/rdf4j-http-server-spring-2.4.6.jar
./lib/rdf4j-rio-languages-2.4.6.jar
./lib/rdf4j-queryresultio-sparqlxml-2.4.6.jar
./lib/rdf4j-http-protocol-2.4.6.jar
./lib/rdf4j-queryalgebra-evaluation-2.4.6.jar
./lib/rdf4j-sail-memory-2.4.6.jar
./lib/rdf4j-rio-trix-2.4.6.jar
./lib/rdf4j-sail-base-2.4.6.jar
./lib/rdf4j-client-2.4.6.jar
./lib/rdf4j-sail-nativerdf-2.4.6.jar
./lib/rdf4j-rio-jsonld-2.4.6.jar
./lib/rdf4j-queryresultio-api-2.4.6.jar
./lib/rdf4j-repository-api-2.4.6.jar
./lib/rdf4j-sail-spin-2.4.6.jar
./lib/rdf4j-http-client-2.4.6.jar
./lib/rdf4j-rio-nquads-2.4.6.jar
./lib/rdf4j-model-2.4.6.jar
./lib/rdf4j-repository-dataset-2.4.6.jar
./lib/rdf4j-repository-sparql-2.4.6.jar
./lib/rdf4j-rio-rdfxml-2.4.6.jar
./lib/rdf4j-repository-sail-2.4.6.jar
./lib/rdf4j-repository-event-2.4.6.jar
./lib/rdf4j-sail-federation-2.4.6.jar
./lib/rdf4j-queryparser-serql-2.4.6.jar
./lib/rdf4j-queryresultio-binary-2.4.6.jar

Do you know if I can just change these libs or do I have to install a completely new version of graphdb that was built on top of new RDF4J libs?
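
To check a jar listing like the one above against the 2.5.2 release mentioned earlier, the version can be read from the file names and compared numerically. This is a small sketch only: `Rdf4jJarVersion` is a hypothetical helper, and it assumes plain dotted numeric versions (a suffix such as `-SNAPSHOT` is not handled).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts and compares versions in RDF4J jar file names. */
final class Rdf4jJarVersion {
    // Matches e.g. "rdf4j-sail-nativerdf-2.4.6.jar" and captures "2.4.6".
    private static final Pattern JAR =
            Pattern.compile("rdf4j-[a-z0-9-]+?-(\\d+(?:\\.\\d+)*)\\.jar$");

    private Rdf4jJarVersion() {}

    /** Returns the version part of an RDF4J jar path, or null if it does not match. */
    static String versionOf(String jarPath) {
        Matcher m = JAR.matcher(jarPath);
        return m.find() ? m.group(1) : null;
    }

    /** Compares dotted numeric versions component by component. */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            // Missing components count as 0, so "2.5" equals "2.5.0".
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }
}
```

A negative result from `compare(versionOf(path), "2.5.2")` would indicate that the bundled library predates the release in which the fix was reported.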

cyberborean commented 5 years ago

I guess graphdb would have to be recompiled against the newer libraries, so you would need a new graphdb version. But it's better to ask the graphdb developers.

twallmey commented 5 years ago

Switched to an updated version of graphdb. From my point of view, performance is slightly better than before.