neo4j-contrib / neo4j-elasticsearch

Neo4j ElasticSearch Integration
Apache License 2.0

Data not written to elasticsearch #43

Open apowers313 opened 6 years ago

apowers313 commented 6 years ago

Assuming #42 isn't the source of my problems, I'm not seeing any data being written to elasticsearch and I don't get any errors that would indicate why.

I have 6538642 nodes of type BibliographicResource which look like:

// MATCH (n:BibliographicResource) WHERE id(n) = toInteger(rand() * 6000000) RETURN n
{
  "iri": "gbr:3512409",
  "year": "2007",
  "label": "bibliographic resource 3512409 [br/3512409]",
  "title": "The Isometric Torque at Which Knee-Extensor Muscle Reoxygenation Stops",
  "record_type": "article"
}

And my configuration looks like:

elasticsearch.host_name=http://localhost:9200
elasticsearch.index_spec=br:BibliographicResource(title,iri)

I can see that the plugin loads in the log file:

2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]   plugins:
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]     .DS_Store: 2017-09-02T13:40:51-0700 - 6.00 kB
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]     neo4j-elasticsearch-3.2.3.jar: 2017-09-02T13:40:24-0700 - 4.91 MB
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]   - Total: 2017-09-02T13:40:51-0700 - 4.92 MB
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]   schema:
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]     index:
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]       lucene:
[ ... ]

I attempt to populate the data:

MATCH (br:BibliographicResource) SET br.title = br.title, br.iri = br.iri

Set 13077247 properties, completed after 225693 ms.

But the elasticsearch index is never created / no data is populated in elasticsearch:

$ curl 'localhost:9200/_cat/indices?v'
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana qOP15da-RaK5bfIWpjmLLA   1   1          1            0      3.2kb          3.2kb

There aren't any errors in neo4j's debug.log or elasticsearch's elasticsearch.log. Any ideas of how to fix and / or debug this problem?
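One way to make the check above repeatable while debugging is to parse the `_cat/indices?v` output and look for the expected index. The sketch below is illustrative only: it assumes the index would be named `br` (the prefix in `elasticsearch.index_spec`), that every listed index is open so no column is blank, and the helper name `docs_per_index` is made up for this example.

```python
from typing import Dict


def docs_per_index(cat_output: str) -> Dict[str, int]:
    """Map index name -> docs.count from `curl 'localhost:9200/_cat/indices?v'` output.

    Assumes every listed index is open, so each row has a value in every column.
    """
    lines = [ln for ln in cat_output.splitlines() if ln.strip()]
    if not lines:
        return {}
    header = lines[0].split()
    index_col = header.index("index")
    count_col = header.index("docs.count")
    counts: Dict[str, int] = {}
    for ln in lines[1:]:
        cols = ln.split()
        counts[cols[index_col]] = int(cols[count_col])
    return counts


# Output copied from the report above.
SAMPLE = """\
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana qOP15da-RaK5bfIWpjmLLA   1   1          1            0      3.2kb          3.2kb
"""

if __name__ == "__main__":
    counts = docs_per_index(SAMPLE)
    # "br" is the assumed index name taken from the index_spec prefix.
    print("br present:", "br" in counts)
```

With the output pasted above, `br` is absent, which matches the observation that the index was never created.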

SeguinBe commented 6 years ago

I have the same issue with 3.1.

I actually had to modify the data for elasticsearch to be populated. A SET a.b = a.b query is not enough for elasticsearch to be notified.

jexp commented 6 years ago

Yes, neo4j doesn't write properties that haven't changed.

We could add some means (e.g. a procedure) to actually trigger initial indexing.

jexp commented 6 years ago

By the way, updating all 13M entries at once might also overload the plugin.

Can you try updating a subset, say 100k nodes, with a timestamp?
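A minimal sketch of that suggestion, written as a client-side loop. Everything named here is an assumption for illustration: the marker property `es_touched_at`, the helper names, and the `run_query` callable, which stands in for whatever driver call executes a Cypher statement and returns the `touched` count. Whether writing a property that is not listed in `index_spec` is enough to trigger the plugin is not confirmed in this thread.

```python
from typing import Callable, Dict

# Hypothetical property name; it only exists to force a real write on each node.
MARKER = "es_touched_at"


def touch_statement(label: str, marker: str = MARKER) -> str:
    """Cypher that stamps up to $limit not-yet-stamped nodes with a timestamp."""
    return (
        f"MATCH (n:{label}) WHERE n.{marker} IS NULL "
        "WITH n LIMIT $limit "
        f"SET n.{marker} = timestamp() "
        "RETURN count(n) AS touched"
    )


def touch_in_batches(
    run_query: Callable[[str, Dict[str, int]], int],
    label: str,
    batch_size: int = 100_000,
    max_batches: int = 1,
) -> int:
    """Run the statement one batch per call; stop early once a batch comes back short."""
    statement = touch_statement(label)
    total = 0
    for _ in range(max_batches):
        touched = run_query(statement, {"limit": batch_size})
        total += touched
        if touched < batch_size:
            break  # fewer nodes than the limit: nothing left to stamp
    return total
```

With the default `max_batches=1` this touches a single subset of 100k nodes, which is enough to see whether documents start to appear in elasticsearch before committing to the full 6.5M.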

bradeac commented 5 years ago

Are there any updates on this? Unchanged existing data isn't sent to elasticsearch.

leviwilson commented 5 years ago

@jexp would that many records still overload the plugin if one were to use apoc.periodic.commit with smaller batch sizes (since it's executeAsync anyway)?
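For reference, the kind of call being asked about might look like the sketch below, which only builds the statement and parameters and does not run them. It assumes APOC is installed and relies on `apoc.periodic.commit(statement, params)` re-running the inner statement until it returns 0. The marker property `es_touched_at` and the function name are hypothetical, and the `$limit` parameter syntax may need to be written as `{limit}` on older Neo4j versions. Whether this avoids overloading the plugin is exactly the open question.

```python
from typing import Any, Dict, Tuple


def periodic_commit_call(
    label: str,
    marker: str = "es_touched_at",  # hypothetical marker property
    batch_size: int = 10_000,
) -> Tuple[str, Dict[str, Any]]:
    """Build an apoc.periodic.commit call that stamps nodes in batches.

    The inner statement must contain a LIMIT and return a count so that
    apoc.periodic.commit knows when to stop (it stops at 0).
    """
    inner = (
        f"MATCH (n:{label}) WHERE n.{marker} IS NULL "
        "WITH n LIMIT $limit "
        f"SET n.{marker} = timestamp() "
        "RETURN count(*)"
    )
    outer = "CALL apoc.periodic.commit($statement, $params)"
    return outer, {"statement": inner, "params": {"limit": batch_size}}


if __name__ == "__main__":
    cypher, params = periodic_commit_call("BibliographicResource")
    print(cypher)
    print(params["statement"])
```

Each batch commits in its own transaction, so the plugin would see many small transactions instead of one with 13M property changes.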

ewc340 commented 3 years ago

Hi @jexp, is there any update on whether a procedure will be implemented to trigger initial indexing? The problem still seems to be happening.