leo-project / leofs

The LeoFS Storage System
https://leo-project.net/leofs/
Apache License 2.0

Can't delete large file #1185

Closed: tnatanael closed this issue 5 years ago

tnatanael commented 5 years ago

Hi guys, I created a simple cluster with 2 storage nodes. After uploading a 1 GB file and running the cluster for a week, I am not able to delete this file: the delete operation reports success but the file persists...

What I tried: recover-node, recover-disk, recover-consistency

I worry that when I put the cluster into the production environment, with so many files, this would be a very annoying bug, so please help me.
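
For reference, here is roughly what those recovery commands look like (a sketch; the node name is illustrative, and the exact syntax is in the LeoFS administrator docs):

## re-replicate the objects stored on a node
$ leofs-adm recover-node storage_0@127.0.0.1
## recover the objects on a node's disk
$ leofs-adm recover-disk storage_0@127.0.0.1
## fix inconsistent objects on a node
$ leofs-adm recover-consistency storage_0@127.0.0.1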

mocchira commented 5 years ago

@tnatanael What you have to do to delete files physically is run "compact-start", as described here: http://leo-project.net/leofs/docs/admin/system_operations/data/#how-to-operate-data-compaction

Please check the doc above for more details.
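
For reference, a minimal compaction sketch (the node name is illustrative):

## start compaction on all object containers of the node
$ leofs-adm compact-start storage_0@127.0.0.1 all
OK
## check progress until it finishes
$ leofs-adm compact-status storage_0@127.0.0.1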

tnatanael commented 5 years ago

I tried leofs-adm compact-start; it says OK, but the file persists, even after waiting for the process to finish.

yosukehara commented 5 years ago

Let us know your LeoFS error log and the state of the large object.

tnatanael commented 5 years ago

How can I discover the object name? Is it the filename of the original file?

tnatanael commented 5 years ago

[screenshot: error message from the log]

tnatanael commented 5 years ago

When I run compact, and also when I try to delete the file using the S3 API, this error message pops up in the log (see the screenshot above).

yosukehara commented 5 years ago

> Is it the filename of the original file?

Exactly: leofs-adm whereis <file-path>
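
For example, with the bucket and file from this thread:

$ leofs-adm whereis teste-thiago/1000mb_1

On a healthy cluster, this prints which storage node holds each replica of the object.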

tnatanael commented 5 years ago

I tried with only the filename: leofs-adm whereis 1000mb_1

And with bucket + filename: leofs-adm whereis teste-thiago/1000mb_1

Both options say [ERROR] Could not get ring

tnatanael commented 5 years ago

I am 100% sure that the file was corrupted due to disk failures, but it still needs to be cleared, either by a manual delete or automatically by the cluster in some way.

yosukehara commented 5 years ago

I understand now that your LeoFS RING (routing table) is broken, so let me know the current state of the system. Can you share the result of leofs-adm status and your operation history up to today?

tnatanael commented 5 years ago

Sure!!! [screenshot: leofs-adm status output]

It is a test cluster; I am simulating a disk failure we experienced in production.

How can I get the operation history?

yosukehara commented 5 years ago

> How can I get the operation history?

$ history | grep leofs-adm

tnatanael commented 5 years ago

[screenshot: shell history output]

Do you want a TeamViewer session?

yosukehara commented 5 years ago

[screenshot quoted from the previous comment]

Now I clearly understand that LeoManager's RING is broken. I'm going to consider how to recover the system.

tnatanael commented 5 years ago

OK... just to note: the cluster is still working, and I am uploading and removing new files right now... only this file is undeletable...

yosukehara commented 5 years ago

To @mocchira: your opinion would be much appreciated.

tnatanael commented 5 years ago

Hi guys! Can this ticket be labelled as a bug instead of a question?

yosukehara commented 5 years ago

Please do the following only if you accept that we may NOT be able to restore your system completely. I have worked out how to recover your LeoManager's RING as below:

Procedure:

  1. Stop all the LeoStorage, LeoGateway, and LeoManager nodes
  2. Back up all files of the LeoManager nodes (archive each LeoManager's directory)
  3. After backing up the LeoManager nodes, remove the Mnesia files under the mnesia directory of both LeoManager nodes
  4. Start the LeoManager nodes
  5. Start the LeoStorage nodes
  6. Execute leofs-adm start
  7. Confirm the state of the system with leofs-adm status
  8. Start the LeoGateway node

If the procedure succeeds, you can then execute the data-compaction command.

yosukehara commented 5 years ago

I'd like to share an example of the procedure for recovering LeoManager's RING below.

[Example] How To Recover LeoManager's RING

Before recovery:

$ leofs-adm status
 [System Configuration]
-----------------------------------+----------
 Item                              | Value
-----------------------------------+----------
 Basic/Consistency level
-----------------------------------+----------
                    system version | 1.5.0
                        cluster Id | leofs_1
                             DC Id | dc_1
                    Total replicas | 2
          number of successes of R | 1
          number of successes of W | 1
          number of successes of D | 1
 number of rack-awareness replicas | 0
                         ring size | 2^128
-----------------------------------+----------
 Multi DC replication settings
-----------------------------------+----------
 [mdcr] max number of joinable DCs | 2
 [mdcr] total replicas per a DC    | 1
 [mdcr] number of successes of R   | 1
 [mdcr] number of successes of W   | 1
 [mdcr] number of successes of D   | 1
-----------------------------------+----------
 Manager RING hash
-----------------------------------+----------
                 current ring-hash |
                previous ring-hash |
-----------------------------------+----------

 [State of Node(s)]
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
 type  |           node           |    state     | rack id |  current ring  |   prev ring    |          updated at
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
  S    | storage_0@127.0.0.1      | running      |         | d5d667a6       | d5d667a6       | 2019-05-23 10:10:33 +0900
  S    | storage_1@127.0.0.1      | running      |         | d5d667a6       | d5d667a6       | 2019-05-23 10:10:33 +0900
  S    | storage_2@127.0.0.1      | running      |         | d5d667a6       | d5d667a6       | 2019-05-23 10:10:33 +0900
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------

The Procedure of Recovering LeoManager's RING

1. Stop all the nodes

$ ./package/leo_manager_0/bin/leo_manager stop
ok
$ ./package/leo_manager_1/bin/leo_manager stop
ok
$ ./package/leo_gateway_0/bin/leo_gateway stop
ok
$ ./package/leo_storage_0/bin/leo_storage stop
ok
$ ./package/leo_storage_1/bin/leo_storage stop
ok
$ ./package/leo_storage_2/bin/leo_storage stop
ok

2. Archive LeoManager's directories

$ tar czf leo_manager_0_backup.tar.gz ./package/leo_manager_0/
$ tar czf leo_manager_1_backup.tar.gz ./package/leo_manager_1/

$ ls -la | grep backup.tar.gz
-rw-r--r--   1 yosukehara  staff  15435040 May 23 10:12 leo_manager_0_backup.tar.gz
-rw-r--r--   1 yosukehara  staff  15429047 May 23 10:12 leo_manager_1_backup.tar.gz

3. Remove the LeoManager nodes' Mnesia data

## manager_0:
$ rm -rf ./package/leo_manager_0/work/mnesia/*
## manager_1:
$ rm -rf ./package/leo_manager_1/work/mnesia/*

4. Restart all the nodes except the LeoGateway node(s)

$ ./package/leo_manager_0/bin/leo_manager start
$ ./package/leo_manager_1/bin/leo_manager start
$ ./package/leo_storage_0/bin/leo_storage start
$ ./package/leo_storage_1/bin/leo_storage start
$ ./package/leo_storage_2/bin/leo_storage start
$ leofs-adm status
 [System Configuration]
-----------------------------------+----------
 Item                              | Value
-----------------------------------+----------
 Basic/Consistency level
-----------------------------------+----------
                    system version | 1.5.0
                        cluster Id | leofs_1
                             DC Id | dc_1
                    Total replicas | 2
          number of successes of R | 1
          number of successes of W | 1
          number of successes of D | 1
 number of rack-awareness replicas | 0
                         ring size | 2^128
-----------------------------------+----------
 Multi DC replication settings
-----------------------------------+----------
 [mdcr] max number of joinable DCs | 2
 [mdcr] total replicas per a DC    | 1
 [mdcr] number of successes of R   | 1
 [mdcr] number of successes of W   | 1
 [mdcr] number of successes of D   | 1
-----------------------------------+----------
 Manager RING hash
-----------------------------------+----------
                 current ring-hash |
                previous ring-hash |
-----------------------------------+----------

 [State of Node(s)]
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
 type  |           node           |    state     | rack id |  current ring  |   prev ring    |          updated at
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
  S    | storage_0@127.0.0.1      | attached     |         |                |                | 2019-05-23 10:14:00 +0900
  S    | storage_1@127.0.0.1      | attached     |         |                |                | 2019-05-23 10:14:03 +0900
  S    | storage_2@127.0.0.1      | attached     |         |                |                | 2019-05-23 10:14:05 +0900
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------

After restarting all the nodes, execute the leofs-adm start command:

$ leofs-adm start
Generating RING...
Generated RING
OK  33% - storage_0@127.0.0.1
OK  67% - storage_2@127.0.0.1
OK 100% - storage_1@127.0.0.1
OK

5. Restart the LeoGateway node(s)

$ ./package/leo_gateway_0/bin/leo_gateway start

6. Confirm the RING hashes of all the nodes:

$ ./leofs-adm status
 [System Configuration]
-----------------------------------+----------
 Item                              | Value
-----------------------------------+----------
 Basic/Consistency level
-----------------------------------+----------
                    system version | 1.5.0
                        cluster Id | leofs_1
                             DC Id | dc_1
                    Total replicas | 2
          number of successes of R | 1
          number of successes of W | 1
          number of successes of D | 1
 number of rack-awareness replicas | 0
                         ring size | 2^128
-----------------------------------+----------
 Multi DC replication settings
-----------------------------------+----------
 [mdcr] max number of joinable DCs | 2
 [mdcr] total replicas per a DC    | 1
 [mdcr] number of successes of R   | 1
 [mdcr] number of successes of W   | 1
 [mdcr] number of successes of D   | 1
-----------------------------------+----------
 Manager RING hash
-----------------------------------+----------
                 current ring-hash | d5d667a6
                previous ring-hash | d5d667a6
-----------------------------------+----------

 [State of Node(s)]
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
 type  |           node           |    state     | rack id |  current ring  |   prev ring    |          updated at
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------
  S    | storage_0@127.0.0.1      | running      |         | d5d667a6       | d5d667a6       | 2019-05-23 10:14:16 +0900
  S    | storage_1@127.0.0.1      | running      |         | d5d667a6       | d5d667a6       | 2019-05-23 10:14:16 +0900
  S    | storage_2@127.0.0.1      | running      |         | d5d667a6       | d5d667a6       | 2019-05-23 10:14:16 +0900
  G    | gateway_0@127.0.0.1      | running      |         | d5d667a6       | d5d667a6       | 2019-05-23 10:14:32 +0900
-------+--------------------------+--------------+---------+----------------+----------------+----------------------------

The important thing is that the values of current ring and prev ring before recovery and after recovery are the same: current ring d5d667a6, prev ring d5d667a6.
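
A quick way to compare the two hashes (a sketch; it simply greps the status output shown above):

$ leofs-adm status | grep ring-hash
                 current ring-hash | d5d667a6
                previous ring-hash | d5d667a6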

tnatanael commented 5 years ago

I'll try this tomorrow and report back, but I am wondering why this happens. Is it expected behaviour? Thanks for now!

tnatanael commented 5 years ago

Sorry for the delay... it worked. After that procedure I was able to delete the file... Thanks!

akezakky555 commented 5 years ago

@yosukehara I tried to follow your instructions but found a problem. After I restarted all the LeoFS services, all users and buckets had disappeared.

After removing the mnesia folder: [screenshot: leofs02]

So I restored the mnesia folder on the leo_manager nodes and everything came back, but the RING is broken again. [screenshot: leofs01]

Can you suggest how to fix this problem?

Thanks
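
Note: removing the Mnesia directories also wipes the user and bucket metadata that LeoManager keeps there, which would explain the disappearance. One possible precaution, not confirmed by the maintainers in this thread, is to record users and buckets before the wipe so they can be re-registered afterwards (a sketch; the placeholders are illustrative):

## before removing Mnesia, record the current users and buckets
$ leofs-adm get-users
$ leofs-adm get-buckets

## after the RING is rebuilt, re-register them as needed
$ leofs-adm create-user <user-id>
$ leofs-adm add-bucket <bucket-name> <access-key-id>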