elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Can't free up space because there's not enough space? ;) #9260

Closed darkpixel closed 9 years ago

darkpixel commented 9 years ago

One of my nodes is 'low' on space, down to ~16 GB free.

So I ran curator to delete older log indices, and Elasticsearch logged the following:

[2015-01-12 11:53:50,250][INFO ][cluster.metadata         ] [tetrad] [logstash-2014.12.13] deleting index
[2015-01-12 11:53:50,251][DEBUG][action.admin.indices.delete] [tetrad] [logstash-2014.12.13] failed to delete index
java.lang.IllegalStateException: Free bytes [4518450191893] cannot be less than 0 or greater than total bytes [4509977353216]
        at org.elasticsearch.cluster.DiskUsage.<init>(DiskUsage.java:36)
        at org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDecider.canRemain(DiskThresholdDecider.java:439)
        at org.elasticsearch.cluster.routing.allocation.decider.AllocationDeciders.canRemain(AllocationDeciders.java:105)
        at org.elasticsearch.cluster.routing.allocation.AllocationService.moveShards(AllocationService.java:257)
        at org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:223)
        at org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:160)
        at org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:146)
        at org.elasticsearch.cluster.metadata.MetaDataDeleteIndexService$2.execute(MetaDataDeleteIndexService.java:130)
        at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:329)
        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
[2015-01-12 11:53:56,767][INFO ][cluster.routing.allocation.decider] [tetrad] low disk watermark [15%] exceeded on [Pc_MAIWOQVe6qKNtKVIYpw][zefram] free: 16.3gb[14.5%], replicas will not be assigned to this node
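The exception comes from a sanity check in the `DiskUsage` constructor (see the first stack frame, `DiskUsage.java:36`): reported free bytes must lie between 0 and the reported total. A minimal Python sketch of that check, purely to illustrate why the numbers from the log above abort the delete (the real check is Java, inside Elasticsearch):

```python
def disk_usage(total_bytes, free_bytes):
    """Sketch of the sanity check in org.elasticsearch.cluster.DiskUsage:
    free space must be between 0 and the reported total, otherwise the
    cluster-state update (here, the index deletion) fails."""
    if free_bytes < 0 or free_bytes > total_bytes:
        raise ValueError(
            "Free bytes [%d] cannot be less than 0 or greater than "
            "total bytes [%d]" % (free_bytes, total_bytes)
        )
    return {"total": total_bytes, "free": free_bytes}

# The exact values from the log: free exceeds total, so the check throws.
try:
    disk_usage(total_bytes=4509977353216, free_bytes=4518450191893)
except ValueError as e:
    print(e)
```

Note the catch-22 the issue title describes: the check runs during the reroute triggered by deleting an index, so the very operation that would free space is rejected because the disk stats look invalid.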

The actual error from curator is:

root@tetrad:~# /root/.virtualenvs/curator/bin/curator --host localhost delete --older-than 10;
2015-01-12 11:53:50,238 INFO      Job starting...
2015-01-12 11:53:50,241 INFO      Deleting indices...
Traceback (most recent call last):
  File "/root/.virtualenvs/curator/bin/curator", line 9, in <module>
    load_entry_point('elasticsearch-curator==2.1.1', 'console_scripts', 'curator')()
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/curator/curator_script.py", line 364, in main
    arguments.func(client, **argdict)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/curator/curator.py", line 1025, in delete
    _op_loop(client, matching_indices, op=delete_index, dry_run=dry_run, **kwargs)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/curator/curator.py", line 767, in _op_loop
    skipped = op(client, item, **kwargs)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/curator/curator.py", line 610, in delete_index
    client.indices.delete(index=index_name)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 68, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/elasticsearch/client/indices.py", line 188, in delete
    params=params)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 301, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 82, in perform_request
    self._raise_error(response.status, raw_data)
  File "/root/.virtualenvs/curator/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 102, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.TransportError: TransportError(500, u'IllegalStateException[Free bytes [4518450191893] cannot be less than 0 or greater than total bytes [4509977353216]]')
root@tetrad:~# 
darkpixel commented 9 years ago

One thing I forgot to mention: the nodes use ZFS for their storage. Could the error about free bytes exceeding total bytes be related to ZFS compression?

darkpixel commented 9 years ago

The workaround:

dakrone commented 9 years ago

Related to #9249; the workaround there will work for this as well until the underlying bug is fixed.
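The exact workaround text from #9249 isn't reproduced here, but given that the stack trace fails inside `DiskThresholdDecider.canRemain`, a plausible stopgap on ES 1.x is to temporarily disable the disk-threshold allocation decider via the cluster settings API, then re-enable it once the fix is deployed. A sketch, assuming a node listening on `localhost:9200`:

```shell
# Sketch of a stopgap: turn off disk-based allocation decisions so the
# reroute triggered by the index deletion no longer constructs DiskUsage.
# Remember to flip this back to true after cleaning up.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.disk.threshold_enabled": false
  }
}'
```

Using a `transient` setting means the override disappears on full cluster restart, which is usually what you want for a temporary workaround.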

darkpixel commented 9 years ago

Thanks for the pointer!