mondo-project / mondo-hawk

Heterogeneous model indexing solution, based on NoSQL stores.
Eclipse Public License 2.0

Integration of Hawk with NeoEMF #66

Closed by angel539 6 years ago

angel539 commented 6 years ago

Hi @bluezio,

I am Ángel Mora from the MISO group. Have you integrated Hawk with NeoEMF? If so, do you have any sample code I can use to instantiate the index programmatically?

All the best,

Ángel

agarciadom commented 6 years ago

Hi Ángel,

We haven't integrated Hawk with NeoEMF, unfortunately. We do have support for Neo4j 2.x, if that's what you are interested in. Could you explain your need for indexing into NeoEMF?

Kind regards, Antonio

angel539 commented 6 years ago

Yes, I am using NeoEMF as the model persistence layer for this project (http://angel539.github.io/extremo/). We built an extensible mechanism to query the models we hold in memory. It is not a query language; it is based on a wizard.

Here you have details about the implementation: https://www.sciencedirect.com/science/article/pii/S1477842417301690

I don't know why, but when I evaluate the performance of my queries, some that should be quadratic show exponential running times. I am trying to improve the response times of my tool.

I used NeoEMF, but with a MapDB backend. Have you had better experiences using Neo4j?

    Map<String, Object> options = MapDbOptionsBuilder.newBuilder()
        .directWriteCacheMany()
        .autocommit()
        .cacheIsSet()
        .cacheSizes()
        .asMap();

All the best,

Ángel

agarciadom commented 6 years ago

I would probably recommend running this through a profiler (YourKit is pretty good, and it's free for open-source projects) and seeing where the hotspots are. We've used profilers quite a bit to significantly speed up Hawk in the past.

In general, we are very happy with Neo4j. It's consistently our best-performing backend. We keep OrientDB because of its EPL-friendly licensing, and Greycat for its temporal querying capabilities (we can revisit old versions of the model).

To be honest, I think it'd be best to solve your persistence/performance issues at the root, rather than trying to add another layer to patch things up. It would also feel a bit odd to us to index a graph database into a graph database :-D.
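As a complement to profiling, one quick way to check whether a query really grows quadratically or exponentially is to time it at doubling input sizes and look at the slope between consecutive points on a log-log scale. What follows is only a minimal sketch under stated assumptions: `GrowthCheck`, `localSlopes` and `pairwiseWork` are hypothetical names (not part of Hawk or NeoEMF), and `pairwiseWork` is a stand-in that you would replace with the real query. Polynomial cost gives a roughly constant slope (about 2 for quadratic), whereas exponential cost gives slopes that keep increasing as the model grows.

```java
final class GrowthCheck {

    /**
     * Slope between each pair of consecutive measurements on a log-log
     * scale: log(t2 / t1) / log(n2 / n1). Assumes strictly increasing
     * sizes and strictly positive times.
     */
    static double[] localSlopes(int[] sizes, double[] times) {
        double[] slopes = new double[sizes.length - 1];
        for (int i = 0; i < slopes.length; i++) {
            double dTime = Math.log(times[i + 1] / times[i]);
            double dSize = Math.log((double) sizes[i + 1] / sizes[i]);
            slopes[i] = dTime / dSize;
        }
        return slopes;
    }

    /** Hypothetical stand-in for a query: visits every pair of n elements. */
    static long pairwiseWork(int n) {
        long matches = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if ((i ^ j) % 7 == 0) {
                    matches++;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        int[] sizes = {1_000, 2_000, 4_000, 8_000};
        double[] times = new double[sizes.length];
        long sink = 0; // keeps the JIT from discarding the work

        for (int i = 0; i < sizes.length; i++) {
            long start = System.nanoTime();
            sink += pairwiseWork(sizes[i]);
            times[i] = System.nanoTime() - start;
        }

        for (double slope : localSlopes(sizes, times)) {
            System.out.printf("local slope: %.2f%n", slope);
        }
        System.out.println("(checksum " + sink + ")");
    }
}
```

Single wall-clock timings like these are noisy (JIT warm-up, garbage collection), so the slopes are only indicative; repeating each measurement several times, or using a benchmarking harness, would give a more trustworthy picture before digging in with a profiler.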

angel539 commented 6 years ago

What do you mean by "at the root"? Reviewing my code (lol, sick), instead of using indexes, you mean?

Ok, great. I will try the NeoEMF backend for Neo4j...

The NeoEMF team actually obtained worse performance with Neo4j than with MapDB, which is why I focused on MapDB at the beginning.

https://github.com/atlanmod/NeoEMF/wiki/Experimental-Results

agarciadom commented 6 years ago

I wasn't criticizing your code, it's just that those hotspots can be in unexpected places :-).

Yes, I remember reading about the results. One thing, though, is that they are using Neo4j 1.x (IIRC) instead of 2.x (like Hawk) or the latest 3.x.

angel539 commented 6 years ago

No no, I know that :) And what do you think? Would it be better to use NeoEMF as the model persistence layer, or XMI + Hawk? We do not need to cover very large cases, because what I am building is a modelling assistant: you should be able to import medium-size models, but the goal is not to deal with scalability issues.

agarciadom commented 6 years ago

I can't say anything about NeoEMF, because I'm not one of the developers :-). Maybe it's something you could ask them about - perhaps it's something they can optimize away?

Regarding Hawk, we can import medium and large XMIs just fine, and if an XMI gets too large to fit in memory at once you always have the option of fragmenting the model (say, with EMF-Splitter), or we could look into an incremental XMI importer for Hawk. It's been on our TODO list for a while :-).

You could try indexing your models with Hawk using the Eclipse UI and writing your query in EOL, and seeing how it performs. I'm not sure if your query is naturally exponential-cost or not, since I haven't seen it.

angel539 commented 6 years ago

Ok, great. Thanks for your advice, Antonio :)!

Cheers from Madrid,

Ángel