strapdata / elassandra

Elassandra = Elasticsearch + Apache Cassandra
http://www.elassandra.io
Apache License 2.0
1.72k stars 199 forks source link

Partitioned indexes is not working after mapping is changed #219

Closed evgeniagusakova closed 5 years ago

evgeniagusakova commented 6 years ago

Hi Team!

I'm not really sure is behavior described below is expected. I checked ES docs and it looks like it does not work in a proper way.

Could you please check and advise ?

Elassandra version:

# docker images | grep elassandra
strapdata/elassandra   5.5.0.8             ab487365d6e5        8 months ago        514MB

JVM version (java -version):

openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-1~bpo8+1-b11)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)

OS version (uname -a if on a Unix-like system):

# uname  -a
Linux e0318757669f 4.4.0-1062-aws #71-Ubuntu SMP Fri Jun 15 10:07:39 UTC 2018 x86_64 GNU/Linux

(docker-based installation, host is Ubuntu 16.04)

Description of the problem including expected versus actual behavior:

When I changed index mapping, partitioned indexed does not work anymore until main index is re-created.

Steps to reproduce: Details: Hostname: cassandra for Elassandre node.

To reproduce issue I use docker but I faced with it on 3 node non-docker Elassandra cluster. Affected versions: 5.5.0.4, 5.5.0.8 (other not tested)

I have key space stories_development with table depot:

CREATE TABLE stories_development.depot (
    uuid text PRIMARY KEY,
    analyst_rating boolean,
    author_followers_count int,
    author_id int,
    author_image_url text,
...
CREATE CUSTOM INDEX elastic_depot_idx ON stories_development.depot () USING 'org.elassandra.index.ExtendedElasticSecondaryIndex';
  1. Create index in ES:
curl \
    -X 'PUT' \
    http://cassandra:9200/stories_development  \
    -H "Content-Type: application/json" \
    -d \
'
{
    "settings": {
         "keyspace": "stories_development"
    },
    "mappings": {
      "depot": {
        "properties": {
          "x_frame_options": {
            "cql_collection": "singleton",
            "type": "keyword"
          },
          "uuid": {
            "cql_primary_key_order": 0,
            "cql_partition_key": true,
            "cql_collection": "singleton",
            "type": "keyword"
          },
          "type": {
            "cql_collection": "singleton",
            "type": "keyword"
          },
...

Reply:

{
  "shards_acknowledged": true,
  "acknowledged": true
}
  1. Create partitioned index:

    curl \
    -X 'PUT' \
    http://cassandra:9200/stories_development_2018_08_12/ \
    -H "Content-Type: application/json" \
    -d \
    '
    {
    "settings": {
        "keyspace": "stories_development",
        "partition_function_class": "org.elassandra.index.StringPartitionFunction",
        "partition_function": "byymd stories_development_%1$tY_%1$tm_%1$td publish_date"
    },
    "mappings": {
      "depot": {
        "properties": {
          "x_frame_options": {
            "cql_collection": "singleton",
            "type": "keyword"
          },
          "uuid": {
            "cql_primary_key_order": 0,
            "cql_partition_key": true,
            "cql_collection": "singleton",
            "type": "keyword"
          },
    ...

    Reply:

    {
    "shards_acknowledged": true,
    "acknowledged": true
    }
  2. Check how data is indexed:

    curl cassandra:9200/_cat/indices
    green open stories_development            ewiUqjqnRsKGiyiMfeK6fA 1 0 0 0 103b 103b
    green open stories_development_2018_08_12 i5csCA9eSjaj8px2LEPDnQ 1 0  0 0  103b  103b
  3. Insert some data into depot table

    
    insert into stories_development.depot (uuid, publish_date, related_groups) values ('1','2018-08-12',[{group_name: 'ss'}, {group_name: 'aa'}]);
    ...
    insert into stories_development.depot (uuid, publish_date, related_groups) values ('<OTHER INCREMENTAL IDS>','2018-08-12',[{group_name: 'ss'}, {group_name: 'aa'}]);

5. Recheck indexes: see partitioned index is populated 

curl cassandra:9200/_cat/indices green open stories_development ewiUqjqnRsKGiyiMfeK6fA 1 0 0 0 103b 103b green open stories_development_2018_08_12 i5csCA9eSjaj8px2LEPDnQ 1 0 17 0 8.2kb 8.2kb


6. Update main (non-partitioned) index mapping 
Added  "related_groups" section only

curl \ -X 'DELETE' \ http://cassandra:9200/stories_development | jq . { "acknowledged": true }

curl \ -X 'PUT' \ http://cassandra:9200/stories_development \ -H "Content-Type: application/json" \ -d \ ' { "settings": { "keyspace": "stories_development" }, "mappings": { "depot": { "properties": { "x_frame_options": { "cql_collection": "singleton", "type": "keyword" }, "related_groups": { "properties": { "group_type": { "type": "text" }, "group_name": { "type": "text" }, "group_level": { "type": "integer" } }, "cql_udt_name": "depot_group" }, "uuid": { "cql_primary_key_order": 0, "cql_partition_key": true, "cql_collection": "singleton", "type": "keyword" }, ...

{ "shards_acknowledged": true, "acknowledged": true }

For now old partitioned index is broken as expected:
Insert data:

cqlsh> insert into stories_development.depot (uuid, publish_date, related_groups) values ('38','2018-08-12',[{group_name: 'ss'}, {group_name: 'aa'}]);

Check it was not indexed:

curl cassandra:9200/_cat/indices green open stories_development sAftwlgqTPqXQSlJ-UhK2g 1 0 0 0 103b 103b green open stories_development_2018_08_12 i5csCA9eSjaj8px2LEPDnQ 1 0 17 0 8.2kb 8.2kb

As you can see above, index is not changed.

6. Create new partitioned index with other name (with b- prefix in my case), based on new mapping
(add related_groups section )

curl \ -X 'PUT' \ http://cassandra:9200/b_stories_development_2018_08_12/ \ -H "Content-Type: application/json" \ -d \ ' { "settings": { "keyspace": "stories_development", "partition_function_class": "org.elassandra.index.StringPartitionFunction", "partition_function": "b_byymd b_storiesdevelopment%1$tY%1$tm%1$td publish_date" }, "mappings": { "depot": { "properties": { "x_frame_options": { "cql_collection": "singleton", "type": "keyword" }, "related_groups": { "properties": { "group_type": { "type": "text" }, "group_name": { "type": "text" }, "group_level": { "type": "integer" } }, "cql_udt_name": "depot_group" }, "uuid": { "cql_primary_key_order": 0, "cql_partition_key": true, "cql_collection": "singleton", "type": "keyword" }, ...

{ "shards_acknowledged": true, "acknowledged": true }

curl cassandra:9200/_cat/indices green open stories_development sAftwlgqTPqXQSlJ-UhK2g 1 0 0 0 103b 103b green open b_stories_development_2018_08_12 qq9jjAedQ2iK6e03hyqusw 1 0 0 0 103b 103b green open stories_development_2018_08_12 i5csCA9eSjaj8px2LEPDnQ 1 0 17 0 8.2kb 8.2kb

New Index is empty. (as expected)

7.  Reindex 

nodetool rebuild_index stories_development depot elastic_depot_idx && nodetool compactionstats pending tasks: 0


Expected result:  new index is populated. 

Actual result: new index is still empty

curl cassandra:9200/_cat/indices

green open stories_development sAftwlgqTPqXQSlJ-UhK2g 1 0 0 0 103b 103b green open b_stories_development_2018_08_12 qq9jjAedQ2iK6e03hyqusw 1 0 0 0 103b 103b green open stories_development_2018_08_12 i5csCA9eSjaj8px2LEPDnQ 1 0 17 0 8.2kb 8.2kb


Workaround (**WITH DOWNTIME**):
1. Close old partitioned index (indexes)

curl -XPOST \ "cassandra:9200/storiesdevelopment*/_close" {"acknowledged":true}

curl cassandra:9200/_cat/indices

green open stories_development sAftwlgqTPqXQSlJ-UhK2g 1 0 0 0 103b 103b green open b_stories_development_2018_08_12 qq9jjAedQ2iK6e03hyqusw 1 0 0 0 103b 103b close stories_development_2018_08_12 i5csCA9eSjaj8px2LEPDnQ

2. Re-create main index

curl \ -X 'DELETE' \ http://cassandra:9200/stories_development | jq .

{ "acknowledged": true }

curl \ -X 'PUT' \ http://cassandra:9200/stories_development \ -H "Content-Type: application/json" \ -d \ ' { "settings": { "keyspace": "stories_development" }, "mappings": { "depot": { "properties": { "x_frame_options": { "cql_collection": "singleton", "type": "keyword" }, "related_groups": { "properties": { "group_type": { "type": "text" }, "group_name": { "type": "text" }, "group_level": { "type": "integer" } }, "cql_udt_name": "depot_group" }, "uuid": { "cql_primary_key_order": 0, "cql_partition_key": true, "cql_collection": "singleton", "type": "keyword" }, ...

{ "shards_acknowledged": true, "acknowledged": true }

Data is indexed, new index is populated:

curl cassandra:9200/_cat/indices

green open stories_development c0Il2MDGQRmr4SclZKI2Rg 1 0 0 0 103b 103b green open b_stories_development_2018_08_12 qq9jjAedQ2iK6e03hyqusw 1 0 19 0 5.7kb 5.7kb close stories_development_2018_08_12 i5csCA9eSjaj8px2LEPDnQ

vroyer commented 6 years ago

Hi Evgenia, Thanks for reporting the issue. Would you have the system.logs when the mapping update is done, ideally with the DEBUG level enabled for class org.elassandra.index ? Thanks.

evgeniagusakova commented 6 years ago

Hi @vroyer !

Thank you for reply!

I changed level for org.elassandra.indices class from INFO to DEBUG in logback.xml:

  <logger name="org.elassandra.indices" level="DEBUG" />

and repeat all steps.

When I deleted main index with old mapping and re-create it with new I found the following messages in log file:

2018-08-15 18:31:17,387 DEBUG [elasticsearch[172.18.0.4][clusterService#updateTask][T#1]] CassandraSecondaryIndicesApplier.java:78 applyClusterState Creating secondary indices for table=stories_development.depot with class=org.elassandra.index.ExtendedElasticSecondaryIndex
2018-08-15 18:31:17,420 ERROR [SecondaryIndexManagement:3] ElasticSecondaryIndex.java:2060 initMapping Failed to update mapping index=[stories_development.depot]
java.lang.NullPointerException: null
    at org.elassandra.index.ElasticSecondaryIndex$ImmutableMappingInfo.<init>(ElasticSecondaryIndex.java:1212)
    at org.elassandra.index.ElasticSecondaryIndex.initMapping(ElasticSecondaryIndex.java:2056)
    at org.elassandra.index.ElasticSecondaryIndex.lambda$getInitializationTask$5(ElasticSecondaryIndex.java:2114)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
    at java.lang.Thread.run(Thread.java:748)
2018-08-15 18:31:17,421 WARN  [SecondaryIndexManagement:3] ElasticSecondaryIndex.java:2034 isIndexing No Elasticsearch index ready
2018-08-15 18:31:17,483 WARN  [elasticsearch[172.18.0.4][clusterService#updateTask][T#1]] ClusterService.java:1561 lambda$publishAndApplyChanges$3 failed to notify ClusterStateListener
java.lang.NullPointerException: null
    at org.elassandra.index.ElasticSecondaryIndex.clusterChanged(ElasticSecondaryIndex.java:2071)
    at org.elasticsearch.cluster.service.ClusterService.lambda$publishAndApplyChanges$3(ClusterService.java:1559)
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
    at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
    at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:1556)
    at org.elasticsearch.cluster.service.BaseClusterService.runTasks(BaseClusterService.java:583)
    at org.elasticsearch.cluster.service.BaseClusterService$ClusterServiceTaskBatcher.run(BaseClusterService.java:263)
    at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)
    at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)
    at org.elasticsearch.cluster.service.BaseClusterService$ClusterServiceTaskBatcher$UpdateTask.run(BaseClusterService.java:266)
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

When I tried to insert into depot table I got the warning afret mapping updating I got the following:

2018-08-15 18:37:02,887 WARN  [MutationStage-2] ElasticSecondaryIndex.java:2034 isIndexing No Elasticsearch index ready

When I created partitioned index with b_ part with new mapping I got the message:

2018-08-15 18:38:42,861 DEBUG [elasticsearch[172.18.0.4][clusterService#updateTask][T#1]] CassandraSecondaryIndicesApplier.java:78 applyClusterState Creating secondary indices for table=stories_development.depot with class=org.elassandra.index.ExtendedElasticSecondaryIndex
2018-08-15 18:38:42,953 WARN  [elasticsearch[172.18.0.4][clusterService#updateTask][T#1]] ClusterService.java:1561 lambda$publishAndApplyChanges$3 failed to notify ClusterStateListener
java.lang.NullPointerException: null
    at org.elassandra.index.ElasticSecondaryIndex.clusterChanged(ElasticSecondaryIndex.java:2071)
    at org.elasticsearch.cluster.service.ClusterService.lambda$publishAndApplyChanges$3(ClusterService.java:1559)
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
    at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
    at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:1556)
    at org.elasticsearch.cluster.service.BaseClusterService.runTasks(BaseClusterService.java:583)
    at org.elasticsearch.cluster.service.BaseClusterService$ClusterServiceTaskBatcher.run(BaseClusterService.java:263)
    at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)
    at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)
    at org.elasticsearch.cluster.service.BaseClusterService$ClusterServiceTaskBatcher$UpdateTask.run(BaseClusterService.java:266)
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

When I'm trying to rebuild index I got the following:

2018-08-15 18:41:54,909 INFO  [RMI TCP Connection(2)-127.0.0.1] ColumnFamilyStore.java:829 rebuildSecondaryIndex User Requested secondary index re-build for stories_development/depot with 1 threads, indexes: elastic_depot_idx
2018-08-15 18:41:54,910 WARN  [RMI TCP Connection(2)-127.0.0.1] ElasticSecondaryIndex.java:2034 isIndexing No Elasticsearch index ready
2018-08-15 18:41:54,910 INFO  [RMI TCP Connection(2)-127.0.0.1] SecondaryIndexManager.java:261 rebuildIndexesBlocking No defined indexes with the supplied names: elastic_depot_idx

When I closed storiesdevelopment* indexses I got the same warning:

.... <skipped some output>......

WARN  [elasticsearch[172.18.0.4][clusterService#updateTask][T#1]] ClusterService.java:1561 lambda$publishAndApplyChanges$3 failed to notify ClusterStateListener
java.lang.NullPointerException: null
    at org.elassandra.index.ElasticSecondaryIndex.clusterChanged(ElasticSecondaryIndex.java:2071)
.... <skipped some output>......

Re-created main index

.... <skipped some output>......

2018-08-15 18:46:42,006 DEBUG [elasticsearch[172.18.0.4][clusterService#updateTask][T#1]] CassandraSecondaryIndicesApplier.java:78 applyClusterState Creating secondary indices for table=stories_development.depot with class=org.elassandra.index.ExtendedElasticSecondaryIndex
.... <skipped some output>......
2018-08-15 18:46:42,017 INFO  [SecondaryIndexManagement:4] SecondaryIndexManager.java:375 buildIndexesBlocking Submitting index build of elastic_depot_idx for data in BigTableReader(path='/var/lib/cassandra/data/stories_development/depot-0a0d60e09d7a11e88569bb51378f83a7/mc-13-big-Data.db'),BigTableReader(path='/var/lib/cassandra/data/stories_development/depot-0a0d60e09d7a11e88569bb51378f83a7/mc-12-big-Data.db')
2018-08-15 18:46:42,111 INFO  [SecondaryIndexManagement:4] ElasticSecondaryIndex.java:2153 lambda$getBlockingFlushTask$6 Elasticsearch index=[b_stories_development_2018_08_12] flushed, duration=30ms
2018-08-15 18:46:42,112 INFO  [SecondaryIndexManagement:4] SecondaryIndexManager.java:395 buildIndexesBlocking Index build of elastic_depot_idx complete

After recreating run nodetool rebuild_index and seems everything is working properly:

2018-08-15 18:46:42,017 INFO  [SecondaryIndexManagement:4] SecondaryIndexManager.java:375 buildIndexesBlocking Submitting index build of elastic_depot_idx for data in BigTableReader(path='/var/lib/cassandra/data/stories_development/depot-0a0d60e09d7a11e88569bb51378f83a7/mc-13-big-Data.db'),BigTableReader(path='/var/lib/cassandra/data/stories_development/depot-0a0d60e09d7a11e88569bb51378f83a7/mc-12-big-Data.db')
2018-08-15 18:46:42,111 INFO  [SecondaryIndexManagement:4] ElasticSecondaryIndex.java:2153 lambda$getBlockingFlushTask$6 Elasticsearch index=[b_stories_development_2018_08_12] flushed, duration=30ms
2018-08-15 18:46:42,112 INFO  [SecondaryIndexManagement:4] SecondaryIndexManager.java:395 buildIndexesBlocking Index build of elastic_depot_idx complete
2018-08-15 18:49:10,219 INFO  [RMI TCP Connection(4)-127.0.0.1] ColumnFamilyStore.java:829 rebuildSecondaryIndex User Requested secondary index re-build for stories_development/depot with 1 threads, indexes: elastic_depot_idx
2018-08-15 18:49:10,220 INFO  [RMI TCP Connection(4)-127.0.0.1] SecondaryIndexManager.java:375 buildIndexesBlocking Submitting index build of elastic_depot_idx for data in BigTableReader(path='/var/lib/cassandra/data/stories_development/depot-0a0d60e09d7a11e88569bb51378f83a7/mc-13-big-Data.db'),BigTableReader(path='/var/lib/cassandra/data/stories_development/depot-0a0d60e09d7a11e88569bb51378f83a7/mc-12-big-Data.db')
2018-08-15 18:49:10,253 INFO  [RMI TCP Connection(4)-127.0.0.1] ElasticSecondaryIndex.java:2153 lambda$getBlockingFlushTask$6 Elasticsearch index=[b_stories_development_2018_08_12] flushed, duration=19ms
2018-08-15 18:49:10,254 INFO  [RMI TCP Connection(4)-127.0.0.1] SecondaryIndexManager.java:395 buildIndexesBlocking Index build of elastic_depot_idx complete
evgeniagusakova commented 6 years ago

Full system.log (was truncated before repeat test)

system.log

evgeniagusakova commented 6 years ago

I've added <logger name="org.elassandra.index" level="DEBUG" /> to the logback.xml and repeated all steps again but there were no additional changes in logs.

vroyer commented 6 years ago

Hi Evgenia, Thank's for your logs and explanation.

The NullPointerException at org.elassandra.index.ElasticSecondaryIndex$ImmutableMappingInfo.(ElasticSecondaryIndex.java:1212) was fixed in 5.5.0.9. Could you please upgrade to a newer version or to the last version 5.5.0.20 and test again ?

Moreover, if you don't want to index twice the same data, you could set "index":"false" in the mapping of the non-partitioned index. Thanks.

vroyer commented 6 years ago

Hi Evgenia, The publish_date field is well defined in your partitioned index ?

evgeniagusakova commented 6 years ago

Hi @vroyer !

Thank you for reply!

I tried to upgrade elassandra cluster from 5.5.0.4 to 5.5.0.20 and faced with issue: node is not able to start after upgrade.

Test cluster is 2 nodes. Upgrading steps:

  1. Stopped all elassandra services on all nodes in cluster. systemctl stop cassandra

  2. Downloaded and installed new package: dpkg -i elassandra-5.5.0.20.deb. Config files are not changed.

  3. Started elassandra services on first node. Started w/o errors.

    nodetool  status
    Datacenter: dc1
    ===============
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
    UN  192.168.56.101  153.14 KiB  256          100.0%            215bbef2-f649-429b-99b2-d7f892b9e699  rack1
    DN  192.168.56.102  ?          256          100.0%            464f6eed-d4ee-497a-94d7-004d5c23070c  rack1
  4. Started cassandra on second node and got the following error:

    2018-08-18 18:42:00,004 INFO  [main] CassandraDiscovery.java:152 doStart localNode name=192.168.56.102 id=464f6eed-d4ee-497a-94d7-004d5c23070c localAddress=/192.168.56.102 publish_host=192.168.56.102:9300
    2018-08-18 18:42:00,014 ERROR [main] ElassandraDaemon.java:588 main Exception
    java.lang.NullPointerException: null
    at org.elassandra.discovery.CassandraDiscovery.updateClusterGroupsFromGossiper(CassandraDiscovery.java:285)
    at org.elassandra.discovery.CassandraDiscovery.doStart(CassandraDiscovery.java:170)
    at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:69)
    at org.elasticsearch.node.Node.start(Node.java:757)
    at org.apache.cassandra.service.ElassandraDaemon.activate(ElassandraDaemon.java:232)
    at org.apache.cassandra.service.ElassandraDaemon.main(ElassandraDaemon.java:551)
    2018-08-18 18:42:00,021 INFO  [StorageServiceShutdownHook] HintsService.java:220 pauseDispatch Paused hints dispatch
    2018-08-18 18:42:00,029 INFO  [StorageServiceShutdownHook] Server.java:176 close Stop listening for CQL clients
    2018-08-18 18:42:00,030 INFO  [StorageServiceShutdownHook] Gossiper.java:1540 stop Announcing shutdown
    2018-08-18 18:42:00,033 INFO  [StorageServiceShutdownHook] StorageService.java:2303 handleStateNormal Node /192.168.56.102 state jump to shutdown
    2018-08-18 18:42:02,053 INFO  [StorageServiceShutdownHook] MessagingService.java:984 shutdown Waiting for messaging service to quiesce
    2018-08-18 18:42:02,055 INFO  [ACCEPT-/192.168.56.102] MessagingService.java:1338 run MessagingService has terminated the accept() thread
    2018-08-18 18:42:02,266 INFO  [StorageServiceShutdownHook] HintsService.java:220 pauseDispatch Paused hints dispatch

Here are my configs cassandra.yaml node-1-cassandra.yaml.txt node-2-cassandra.yaml.txt

Also issue is 100% reproducible (I reproduced it on cluster w/o data and also reproduced with upgrade on elassandra-6.2.3.3.deb)

From my point of view it looks like separate issue, should I create one more issue?

The publish_date field is well defined in your partitioned index ?

This question is not clear for me.

publish_date field is defined in depot table:

CREATE TABLE stories_development.depot (
    uuid text PRIMARY KEY,
    analyst_rating boolean,
    author_followers_count int,
    author_id int,
    author_image_url text,
    author_name text,
    author_screen_name text,
    category int,
    computed_score int,
    domain_cached_logo_url text,
    domain_host text,
    domain_id int,
    domain_name text,
    domain_paywall boolean,
    domain_registration_required boolean,
    duplicates_count int,
    expanded_url text,
    groups list<text>,
    id int,
    images list<text>,
    lang text,
    name text,
    origin_id text,
    original_info_item_id int,
    publish_date timestamp,
    read_count int,
    related_groups list<frozen<group>>,
    scrape_description text,
    scrape_title text,
    topic_ids list<int>,
    type text,
    x_frame_options text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 2000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
CREATE CUSTOM INDEX elastic_depot_idx ON stories_development.depot () USING 'org.elassandra.index.ExtendedElasticSecondaryIndex';
curl \
 cassandra:9200/b_stories_development_2018_08_12/_mapping \
 | jq
<skipped some output>.

          "publish_date": {
            "type": "date",
            "cql_collection": "singleton"
          },

<skipped some output>

Please let me know if you need additional logs.

vroyer commented 6 years ago

Hi Evgenia, Please use the release candidate 5.5.0.21 already available on docker hub, it should fix the NPE on CassandraDiscovery.updateClusterGroupsFromGossiper. Thanks.

evgeniagusakova commented 6 years ago

@vroyer Thank you for your reply and sorry for delay with answer.

I can't use docker because it is already configured cluster with data. Docker I used for tests but for now I have same issue on cluster under load.

But I can try to build my own deb package if you'll provide me with steps you are using to built it

Also, same issue with elassandra-6.2.3.3.deb - I'm not able to update my cluster to the last version of Elassandra.

vroyer commented 6 years ago

The fix is in the last debian package 6.2.3.4 available at elassandra-6.2.3.4.deb. Thanks in advance for your feedback. Vincent.