filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/

Sector stuck in FinalizeSector state with DisallowRemoteFinalize=true and all Seal/Store storages on all workers. #9946

Open rwxr-xr-x opened 1 year ago

rwxr-xr-x commented 1 year ago

Checklist

Lotus component

Lotus Version

$ lotus version

Daemon:  1.19.0+mainnet+git.64059ca87+api1.5.0
Local: lotus version 1.19.0+mainnet+git.64059ca87

Describe the Bug

Sectors get stuck in the FinalizeSector state with DisallowRemoteFinalize=true, even when the miner and all workers have local access to the Seal and Store storages.

I know there were some similar issues before (it looks like they were closed but not fully fixed):

Before 1.17 everything worked flawlessly with remote finalize. We first started experiencing this issue on 1.17+ versions, and we are still facing it on 1.19.0. We have even allocated a separate setup for more detailed tests, because this bottleneck is critical for our setups.

I know about the DisallowRemoteFinalize warning that workers must have access to both the seal and the long-term store storages. Before 1.17 we shared long-term storage only with the PC2 and C2 workers, and they finalized sectors as expected. But now, even when long-term storage is shared with the miner and ALL workers, including WDP, WNP and all sealing workers, sectors get stuck in the FinalizeSector state.

To unstick these sectors we need to edit the miner's config, set DisallowRemoteFinalize=false and restart the miner. Right after that, all sectors are moved to long-term storage successfully, but they are moved by the miner (a single server with a single link).

Miner's config part:

[Sealing]
  FinalizeEarly = false
[Storage]
  ParallelFetchLimit = 10
  AllowSectorDownload = false
  AllowAddPiece = false
  AllowPreCommit1 = false
  AllowPreCommit2 = false
  AllowCommit = false
  AllowUnseal = false
  AllowReplicaUpdate = false
  AllowProveReplicaUpdate2 = false
  AllowRegenSectorKey = false
  LocalWorkerName = ""
  Assigner = "spread"
  DisallowRemoteFinalize = true
  ResourceFiltering = "hardware"

Logging Information

No informative errors or warnings in logs.

Repo Steps

I removed all workers and left ONLY 1 worker of each type. All of them have local access to the Seal and Store storages.

Below is the state during the test, at the moment the sector got stuck:

$ lotus-miner sectors status --log 239447

SectorID:       239447
Status:         FinalizeSector
CIDcommD:       baga6ea4seaqao7s73y24kcutaosvacpdjgfe5pw76ooefnyqw4ynr3d2y6x2mpq
CIDcommR:       bagboea4b5abcby2v54vf3h2mxa3icvnm5fykqteyoryku2rsxe5waj52vi3qyzyc
Ticket:         33acca53fee5f19145ad7fac85949faaa940ebd022224ea3c175f1522f52d338
TicketH:        2460080
Seed:           546b011fd20b912c05bcccbba4eb798549af00aaa04e5f8733c4d7e77b55170a
SeedH:          2461520
Precommit:      bafy2bzacea5dsv3ihtagncszijogtizqaciffdgpqwmyp7t3ugbvkbylkcgsm
Commit:         bafy2bzacedvikfdh4kmxwobez5pt3eha5b4o62xlji3yp6htlhzeb5kfluoqu
Deals:          [0]
Retries:        0
--------
Event Log:
0.      2022-12-27 10:09:20 +0000 UTC:  [event;sealing.SectorStartCC]   {"User":{"ID":239447,"SectorType":8}}
1.      2022-12-27 10:10:21 +0000 UTC:  [event;sealing.SectorPacked]    {"User":{"FillerPieces":[{"Size":34359738368,"PieceCID":{"/":"baga6ea4seaqao7s73y24kcutaosvacpdjgfe5pw76ooefnyqw4ynr3d2y6x2mpq"}}]}}
2.      2022-12-27 10:10:21 +0000 UTC:  [event;sealing.SectorTicket]    {"User":{"TicketValue":"M6zKU/7l8ZFFrX+shZSfqqlA69AiIk6jwXXxUi9S0zg=","TicketEpoch":2460080}}
3.      2022-12-27 13:12:12 +0000 UTC:  [event;sealing.SectorPreCommit1]        {"User":{"PreCommit1Out":"eyJfbG90dXNfU2VhbFJhbmRvbW5lc3MiOiJNNnpLVS83bDhaRkZyWCtzaFpTZnFxbEE2OUFpSWs2andYWHhVaTlTMHpnPSIsImNvbW1fZCI6WzcsMTI2LDk1LDIyMiw1MywxOTcsMTAsMTQ3LDMsMTY1LDgwLDksMjI3LDczLDEzOCw3OCwxOTAsMjIzLDI0MywxNTYsNjYsMTgzLDE2LDE4Myw0OCwyMTYsMjM2LDEyMiwxOTksMTc1LDE2Niw2Ml0sImNvbmZpZyI6eyJpZCI6InRyZWUtZCIsInBhdGgiOiIvRklMRUNPSU4vU0VBTC9OVU1BMC9jYWNoZS9zLXQwMTY4MTgwOC0yMzk0NDciLCJyb3dzX3RvX2Rpc2NhcmQiOjcsInNpemUiOjIxNDc0ODM2NDd9LCJsYWJlbHMiOnsiU3RhY2tlZERyZzMyR2lCVjEiOnsiX2giOm51bGwsImxhYmVscyI6W3siaWQiOiJsYXllci0xIiwicGF0aCI6Ii9GSUxFQ09JTi9TRUFML05VTUEwL2NhY2hlL3MtdDAxNjgxODA4LTIzOTQ0NyIsInJvd3NfdG9fZGlzY2FyZCI6Nywic2l6ZSI6MTA3Mzc0MTgyNH0seyJpZCI6ImxheWVyLTIiLCJwYXRoIjoiL0ZJTEVDT0lOL1NFQUwvTlVNQTAvY2FjaGUvcy10MDE2ODE4MDgtMjM5NDQ3Iiwicm93c190b19kaXNjYXJkIjo3LCJzaXplIjoxMDczNzQxODI0fSx7ImlkIjoibGF5ZXItMyIsInBhdGgiOiIvRklMRUNPSU4vU0VBTC9OVU1BMC9jYWNoZS9zLXQwMTY4MTgwOC0yMzk0NDciLCJyb3dzX3RvX2Rpc2NhcmQiOjcsInNpemUiOjEwNzM3NDE4MjR9LHsiaWQiOiJsYXllci00IiwicGF0aCI6Ii9GSUxFQ09JTi9TRUFML05VTUEwL2NhY2hlL3MtdDAxNjgxODA4LTIzOTQ0NyIsInJvd3NfdG9fZGlzY2FyZCI6Nywic2l6ZSI6MTA3Mzc0MTgyNH0seyJpZCI6ImxheWVyLTUiLCJwYXRoIjoiL0ZJTEVDT0lOL1NFQUwvTlVNQTAvY2FjaGUvcy10MDE2ODE4MDgtMjM5NDQ3Iiwicm93c190b19kaXNjYXJkIjo3LCJzaXplIjoxMDczNzQxODI0fSx7ImlkIjoibGF5ZXItNiIsInBhdGgiOiIvRklMRUNPSU4vU0VBTC9OVU1BMC9jYWNoZS9zLXQwMTY4MTgwOC0yMzk0NDciLCJyb3dzX3RvX2Rpc2NhcmQiOjcsInNpemUiOjEwNzM3NDE4MjR9LHsiaWQiOiJsYXllci03IiwicGF0aCI6Ii9GSUxFQ09JTi9TRUFML05VTUEwL2NhY2hlL3MtdDAxNjgxODA4LTIzOTQ0NyIsInJvd3NfdG9fZGlzY2FyZCI6Nywic2l6ZSI6MTA3Mzc0MTgyNH0seyJpZCI6ImxheWVyLTgiLCJwYXRoIjoiL0ZJTEVDT0lOL1NFQUwvTlVNQTAvY2FjaGUvcy10MDE2ODE4MDgtMjM5NDQ3Iiwicm93c190b19kaXNjYXJkIjo3LCJzaXplIjoxMDczNzQxODI0fSx7ImlkIjoibGF5ZXItOSIsInBhdGgiOiIvRklMRUNPSU4vU0VBTC9OVU1BMC9jYWNoZS9zLXQwMTY4MTgwOC0yMzk0NDciLCJyb3dzX3RvX2Rpc2NhcmQiOjcsInNpemUiOjEwNzM3NDE4MjR9LHsiaWQiOiJsYXllci0xMCIsInBhdGgiOiIvRklMRUNPSU4vU0VBTC9OVU1BMC9jYWNoZS9zLXQwMTY4MTgwOC0yMz
k0NDciLCJyb3dzX3RvX2Rpc2NhcmQiOjcsInNpemUiOjEwNzM3NDE4MjR9LHsiaWQiOiJsYXllci0xMSIsInBhdGgiOiIvRklMRUNPSU4vU0VBTC9OVU1BMC9jYWNoZS9zLXQwMTY4MTgwOC0yMzk0NDciLCJyb3dzX3RvX2Rpc2NhcmQiOjcsInNpemUiOjEwNzM3NDE4MjR9XX19LCJyZWdpc3RlcmVkX3Byb29mIjoiU3RhY2tlZERyZzMyR2lCVjFfMSJ9"}}
4.      2022-12-27 13:22:38 +0000 UTC:  [event;sealing.SectorPreCommit2]        {"User":{"Sealed":{"/":"bagboea4b5abcby2v54vf3h2mxa3icvnm5fykqteyoryku2rsxe5waj52vi3qyzyc"},"Unsealed":{"/":"baga6ea4seaqao7s73y24kcutaosvacpdjgfe5pw76ooefnyqw4ynr3d2y6x2mpq"}}}
5.      2022-12-27 13:23:49 +0000 UTC:  [event;sealing.SectorPreCommitted]      {"User":{"Message":{"/":"bafy2bzacea5dsv3ihtagncszijogtizqaciffdgpqwmyp7t3ugbvkbylkcgsm"},"PreCommitDeposit":"95652330715309551","PreCommitInfo":{"SealProof":8,"SectorNumber":239447,"SealedCID":{"/":"bagboea4b5abcby2v54vf3h2mxa3icvnm5fykqteyoryku2rsxe5waj52vi3qyzyc"},"SealRandEpoch":2460080,"DealIDs":[],"Expiration":4010805,"UnsealedCid":null}}}
6.      2022-12-27 13:28:00 +0000 UTC:  [event;sealing.SectorPreCommitLanded]   {"User":{"TipSet":[{"/":"bafy2bzacec4pvkvnaukwkmwkqhdo3yq2pywsvy4zt2ir62qfkooda4qje2osk"},{"/":"bafy2bzacebmbplsq2lthtrbd2uxf577p2vyznbhsaktft2k2kxchhzamhg3vg"}]}}
7.      2022-12-27 14:43:00 +0000 UTC:  [event;sealing.SectorSeedReady] {"User":{"SeedValue":"VGsBH9ILkSwFvMy7pOt5hUmvAKqgTl+HM8TX53tVFwo=","SeedEpoch":2461520}}
8.      2022-12-27 14:57:22 +0000 UTC:  [event;sealing.SectorCommitted] {"User":{"Proof":"kfLojlhHMxFmu4g9QFu7Uf7YNiKTbEc44UNBa6q7OGJw53tNgblnCyotgcXQMeWMhR601q/gZi3gLqMMF+I6Hgr3d0Dv+J6SKYB6v0OUg1X4tgfis0ZO3w8BMJaXuGeGASxyHXwxoccP1b/uLwHD4wLLYjadL8KcL0HLVhKf5a6+PivhgSTkHvLJcSxjHodah2dKeYGlAiYa8bTCqR160WNb2iCq+9NxPcgRX2XeVC6BL4LS3NrJ9reslEfslfvahBFZ6f+w/Uv1OMdYQV2JyGIScRiO2bUz6Wx4uR8UZsLiJxv+DHtE0Cojfdc9OiR/laKrJsC9ws6wOwGghk4n6LPmu8+wFTj3SdTPl1NGVoIfaGgmr3SgjZjRMnkJCyGtFmrQKAf3foyjLuVY24BIJW474Od42B+qRV/Rxq3v1oRjnnZ1vvXh/l8xy0+EoqIplVvKaJSWxZSI/ic6jK+8BZuy7qmKMQPVATCTmq/6nF3fjqjpbvmMN3u7ruYy65dRtzgMbnBwTWn/zjnZC/rqIOoUjWSjEHtj55T+wgkTxjq0Aqfh9k/Ig4Dq2w3Jrii9kfBupCCUWGhKQgOmmwTTEXZoMJQvrN+TRIkaX6iYoW2LyGvk9HH/ONdugtpekr/DBIZWzL7/VZAAmYW0tUi7ZaOYCkgIZOq3tk1HXBqOt1INmUbdnfgAnKCFXRck/4kUqjuN4dSGhmV0iT30Gr5kKlEk4uYBCrZQF1404WJL5U1XU9NhZQCYS3uNXS67cZa7tuA18Cc5qO6CSeLls2gtnwI/FuH+EDhbSdUeNd+hMcD/lUDNPb5UHXl7A+LJxUMquMHRSG+MK5p9Kb1lR1XLAjgsit6Hai0lwfDRpq6pjTkdZArFUNLmCJeqzqkmA/BAF3X/gOK7Ell1TfL+nrrbrlHQP0vns9auo17ykyBC06/ZhLcvHo7Vz5d6yNL6aMX6gkxA16SjG5GjObRu7vwk+W/LGA4r3feaRwWJV0rZyDpUKhiBViM1xZUJsGEXf0ESmUlbd9OJRY+22M+myv07OtDZzztej1PvPsbW5Kec0DXDePjuwElaYBf9Ri6VROn3gJNqftFJU6HXIVgyLhi85gpMmLsHZhk/bwDlckY4BwJ1qPv44ldYLmFjPYvA12D2BMz2Hcm3X6Ot4JGuvdaz4S5nHjtJCskczmVuYybpaNWg26jaamW4lWzmNUAtspjfsEA/z745xicZmebYzK3HuE1FHNY5aeEnwm/Hd3KliDq//3rYjSiG5NbwozTHI5kFsqcf6UmfiPLaW7HGq+fochF4mxZH5aEx1wnSNttNf6kIdQUEq6bfi858r52Ty80grdtwUn+Ox/fEm3SEiawv55+a/CvCnM9hCN9s+/M3s90RxP6hSG3EnVmMcSw0wv2/FNbIpZl3QcUDPHzsK7w+DgfGoqfa9nRXLtJYg4iS+8msWST0GLbCwMngqYtu9YRAlFMR7xL+eKFmm5gWlWHwQf1dCPlCO2QbM5GV0h03oZg3GxZLRDoEyj+DRQiiFlYdtGQ92OR7x93ScGSTPhLuCMJAdfAM8UbSJv46Uxk8gLx4YPEJ5HgafyCKjypUCLs3quDEbOfScxecenRG6RGcf7FDvdGahYBO/8pQ1E7oPDK7747+qUhM3CjdUZSM2xJjFsrCDCeTtb7XiKSc9bMYuhdBQ5jJ1MMJPdh6eoo8irgh1gJPzZ0dnJWlPs3dktq4rc6DSFqaVbmwbo1TOI64kwZO4iD6MZi8iovE3VKBny5ZVJOekEUPe2FmYO+HSMOmo4RSzT3zae1LGvEOQ705LEy0HONl57ADnen9Yq4vz8xVg5O4/uOjtDMJNv5aRYENln0F/F0t2JFDzm6R4912Gob0NK/ypvcdh3K4IvR4ziTRVy3N11C4Vy
hPSulhuR+4C8kpyMMLRgcQMN7zPNh66RyhjWATS9W2/3MmJO6fLbB73yBp5FIP+wwjXpl4Mr5Slhtgef5XiIBvjE6vKkf6IvDB5cz8RQyg+ooI97qxqkayv0rN+yZV0R0jecYmNslqiYaUUjuQjMTwtpYvtkWTxxOch+MWJNyMRFwohqDMSxXsk1ztcLfl/nIRQk9Z/UIKtPKDxL/AIABORKcYZXpOBnXFSig9j3mwvE6mk1pvZoW2ZQasXEkNTonLVO0ldTZ1F9ODTAZ8Dp1YtoWdHS4kGAuAd+qMH8D/qtBRbKUxBIAIivAKI4aejee0Hmp9wDhFgAhVgsYrRxa+IZ4/nQ2qsuHwZ72q9z5O54pgc1NXDEfEeV/gUi0Q1K5c94msHE7Cg6jbzWTMHfwGeXtAJjJLSRxhIfpBjaKWOV6R19XKfEMlQRMisA/xsHDjY4FRTgC2gnWRHvGYHzEDib/kx2V7dK6vfcFCP+IjYjLD0uhpKn5dYBSxLD41naCN+A4fOxqGAHySd1xyoGL3z8ucpbJ7SirMW1zLVT2UR2uR6npyuyN/Cb6Rymlv9rTLVh9bzQ4SilUAOoTKdbIK3+FLZVzofRj5Vfbho23Uc2TFhkYXFPnSvS2lwK+pxVWMMF3Xupnh"}}
9.      2022-12-27 14:57:32 +0000 UTC:  [event;sealing.SectorCommitSubmitted]   {"User":{"Message":{"/":"bafy2bzacedvikfdh4kmxwobez5pt3eha5b4o62xlji3yp6htlhzeb5kfluoqu"}}}
10.     2022-12-27 15:04:26 +0000 UTC:  [event;sealing.SectorProving]   {"User":{}}

$ lotus-miner sectors list --fast --seal-time --events

ID      State           OnChain  Active  SealTime   Events  Deals
...
239447  FinalizeSector  YES      NO      4h55m6s    11      CC
...

$ lotus-miner sealing sched-diag

{
  "CallToWork": {},
  "EarlyRet": null,
  "ReturnedWork": null,
  "SchedInfo": {
    "OpenWindows": [
      "f0b8e24e-1082-4824-841c-590674588ed4",
      ...
      "fcc3b1bb-000f-4101-bc6b-905ede4a9cda"
    ],
    "Requests": [
      {
        "Priority": 0,
        "SchedId": "b3ddbeb7-d56e-4736-b825-8b7ff1955654",
        "Sector": {
          "Miner": 1681808,
          "Number": 239447
        },
        "TaskType": "seal/v0/fetch"
      }
    ]
  },
  "Waiting": null
}
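
The scheduler dump above can be reduced to a list of queued requests. This is a minimal sketch that assumes the JSON shape shown in this report (`SchedInfo.Requests` with `Sector.Number` and `TaskType`). The sample is a trimmed copy of that output and `pending_requests` is a hypothetical helper name.

```python
import json
from collections import Counter

# Trimmed sample shaped like the sched-diag output in this report.
SCHED_DIAG = """
{
  "CallToWork": {},
  "SchedInfo": {
    "OpenWindows": ["f0b8e24e-1082-4824-841c-590674588ed4"],
    "Requests": [
      {
        "Priority": 0,
        "SchedId": "b3ddbeb7-d56e-4736-b825-8b7ff1955654",
        "Sector": {"Miner": 1681808, "Number": 239447},
        "TaskType": "seal/v0/fetch"
      }
    ]
  },
  "Waiting": null
}
"""


def pending_requests(diag_text: str) -> list[tuple[int, str]]:
    """Return (sector number, task type) for every request still queued."""
    diag = json.loads(diag_text)
    # "Requests" may be null when nothing is queued, hence the `or []`.
    requests = diag.get("SchedInfo", {}).get("Requests") or []
    return [(r["Sector"]["Number"], r["TaskType"]) for r in requests]


pending = pending_requests(SCHED_DIAG)
by_task = Counter(task for _, task in pending)
for sector, task in pending:
    print(f"sector {sector} is waiting on {task}")
```

For the sample above, the only queued request is the fetch task for sector 239447, which is never assigned to any of the open windows.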

$ lotus-miner storage find 239447

In 5ba861fb-321b-42aa-8e67-5d7967709648 (Sealed, Cache)
        Sealing: true; Storage: false
        Local (/FILECOIN/SEAL/20/NUMA0)
        URL: http://10.10.211.10:2345/remote/sealed/s-t01681808-239447
        URL: http://10.10.211.20:3333/remote/sealed/s-t01681808-239447
        URL: http://10.10.211.20:3456/remote/sealed/s-t01681808-239447
        URL: http://10.10.211.11:4501/remote/sealed/s-t01681808-239447
        ...
        ...
        ...
        URL: http://10.10.211.11:5601/remote/sealed/s-t01681808-239447
        URL: http://10.10.211.10:8888/remote/sealed/s-t01681808-239447
        URL: http://10.10.211.10:9999/remote/sealed/s-t01681808-239447
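
The output above shows the sector only in a path marked `Sealing: true; Storage: false`. Below is a minimal sketch of extracting that fact from the text, assuming the line layout shown in this report. The sample is a trimmed copy of it and `parse_storage_find` is a hypothetical helper name.

```python
import re

# Trimmed copy of the storage find output above (one URL kept).
SAMPLE = """\
In 5ba861fb-321b-42aa-8e67-5d7967709648 (Sealed, Cache)
        Sealing: true; Storage: false
        Local (/FILECOIN/SEAL/20/NUMA0)
        URL: http://10.10.211.10:2345/remote/sealed/s-t01681808-239447
"""

HEADER = re.compile(r"^In (\S+) \(([^)]*)\)")
FLAGS = re.compile(r"Sealing: (true|false); Storage: (true|false)")


def parse_storage_find(text: str) -> list[dict]:
    """Collect storage ID, file types and Sealing/Storage flags per location."""
    locations = []
    for line in text.splitlines():
        if m := HEADER.match(line):
            locations.append({
                "id": m.group(1),
                "files": [f.strip() for f in m.group(2).split(",")],
                "sealing": None,
                "storage": None,
            })
        elif locations and (m := FLAGS.search(line)):
            locations[-1]["sealing"] = m.group(1) == "true"
            locations[-1]["storage"] = m.group(2) == "true"
    return locations


locations = parse_storage_find(SAMPLE)
# True only if at least one location holding the sector is a Store path.
in_long_term = any(loc["storage"] for loc in locations)
print("sector present in a Store path:", in_long_term)
```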

AP Worker Info:

Worker version:  1.6.0
CLI version: lotus-worker version 1.19.0+mainnet+git.64059ca87

Session: fcc3b1bb-000f-4101-bc6b-905ede4a9cda
Enabled: true
Hostname: ap-10-10-211-20-3333
CPUs: 15; GPUs: []
RAM: 33.74 GiB/1.957 TiB; Swap: 0 B/0 B
Task types: FIN GET FRU C1 PR1 AP DC

70ad46e2-0906-4906-8de0-fd37ab1ff404:
        Weight: 10; Use: Store
        Local: /LONGTERM-100/POOL-00
5ba861fb-321b-42aa-8e67-5d7967709648:
        Weight: 10; Use: Seal 
        Local: /FILECOIN/SEAL/NUMA0

PC1 Worker Info:

Worker version:  1.6.0
CLI version: lotus-worker version 1.19.0+mainnet+git.64059ca87

Session: 3ba7aede-0751-44a6-bbe3-bfc25b55555a
Enabled: true
Hostname: pc1-10-10-211-20-3456
CPUs: 60; GPUs: []
RAM: 33.74 GiB/1.957 TiB; Swap: 0 B/0 B
Task types: FIN GET FRU C1 PC1 PR1

70ad46e2-0906-4906-8de0-fd37ab1ff404:
        Weight: 10; Use: Store
        Local: /LONGTERM-100/POOL-00
5ba861fb-321b-42aa-8e67-5d7967709648:
        Weight: 10; Use: Seal 
        Local: /FILECOIN/SEAL/NUMA0

PC2 Worker Info:

Worker version:  1.6.0
CLI version: lotus-worker version 1.18.0+mainnet+git.bd10bdf99

Session: ca2dbf3f-a72f-424b-a26d-c45a5cb9b53d
Enabled: true
Hostname: pc2-10-10-211-11-4501-0
CPUs: 64; GPUs: [NVIDIA A40]
RAM: 51.22 GiB/1.957 TiB; Swap: 0 B/0 B
Task types: FIN GET FRU C1 PC2 PR1

70ad46e2-0906-4906-8de0-fd37ab1ff404:
        Weight: 10; Use: Store
        Local: /LONGTERM-100/POOL-00
5ba861fb-321b-42aa-8e67-5d7967709648:
        Weight: 10; Use: Seal 
        Local: /FILECOIN/SEAL/NUMA0

C2 Worker Info:

Worker version:  1.6.0
CLI version: lotus-worker version 1.18.0+mainnet+git.bd10bdf99

Session: 13e0d49d-9862-4f2a-95ab-5566bd44c062
Enabled: true
Hostname: c2-10-10-211-11-5601-3
CPUs: 63; GPUs: [NVIDIA A40]
RAM: 51.21 GiB/1.957 TiB; Swap: 0 B/0 B
Task types: FIN GET FRU C1 C2 PR1

70ad46e2-0906-4906-8de0-fd37ab1ff404:
        Weight: 10; Use: Store
        Local: /LONGTERM-100/POOL-00
5ba861fb-321b-42aa-8e67-5d7967709648:
        Weight: 10; Use: Seal 
        Local: /FILECOIN/SEAL/NUMA0

WDP Worker Info:

Worker version:  1.6.0
CLI version: lotus-worker version 1.19.0+mainnet+git.64059ca87

Session: aae29a59-8526-4c74-912c-9bdd79402f9e
Enabled: true
Hostname: wdpost-10-10-211-10-9999-0
CPUs: 64; GPUs: [NVIDIA A40]
RAM: 237.4 GiB/1.957 TiB; Swap: 0 B/0 B
Task types: WDP 

70ad46e2-0906-4906-8de0-fd37ab1ff404:
        Weight: 10; Use: Store
        Local: /LONGTERM-100/POOL-00
5ba861fb-321b-42aa-8e67-5d7967709648:
        Weight: 10; Use: Seal 
        Local: /FILECOIN/SEAL/NUMA0

WNP Worker Info:

Worker version:  1.6.0
CLI version: lotus-worker version 1.19.0+mainnet+git.64059ca87

Session: e28f929d-1204-41b8-9518-79cc4aefc879
Enabled: true
Hostname: winpost-10-10-211-10-8888-0
CPUs: 64; GPUs: [NVIDIA A40]
RAM: 231.7 GiB/1.957 TiB; Swap: 0 B/0 B
Task types: WNP

70ad46e2-0906-4906-8de0-fd37ab1ff404:
        Weight: 10; Use: Store
        Local: /LONGTERM-100/POOL-00
5ba861fb-321b-42aa-8e67-5d7967709648:
        Weight: 10; Use: Seal 
        Local: /FILECOIN/SEAL/NUMA0

All workers have local access to the same locations:

Local: /LONGTERM-100/POOL-00 - Store
Local: /FILECOIN/SEAL/NUMA0 - Seal
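
The claim that every worker sees the same storage paths can be checked mechanically from the worker info blocks. This is a minimal sketch that assumes the path listing layout shown in this report. Only two of the hostnames are included, and `parse_paths` is a hypothetical helper name.

```python
import re

# Path listing as printed in the worker info blocks of this report.
PATHS = """\
70ad46e2-0906-4906-8de0-fd37ab1ff404:
        Weight: 10; Use: Store
        Local: /LONGTERM-100/POOL-00
5ba861fb-321b-42aa-8e67-5d7967709648:
        Weight: 10; Use: Seal
        Local: /FILECOIN/SEAL/NUMA0
"""

# Two workers from the report; the others would be added the same way.
WORKER_INFO = {
    "ap-10-10-211-20-3333": PATHS,
    "wdpost-10-10-211-10-9999-0": PATHS,
}

STORAGE_ID = re.compile(r"^([0-9a-f-]{36}):\s*$")
USE = re.compile(r"Use:\s*(.+?)\s*$")
LOCAL = re.compile(r"Local:\s*(\S+)")


def parse_paths(text: str) -> dict:
    """Map each storage ID to its 'use' and 'local' path."""
    paths = {}
    current = None
    for line in text.splitlines():
        if m := STORAGE_ID.match(line):
            current = m.group(1)
            paths[current] = {}
        elif current and (m := USE.search(line)):
            paths[current]["use"] = m.group(1)
        elif current and (m := LOCAL.search(line)):
            paths[current]["local"] = m.group(1)
    return paths


parsed = {host: parse_paths(text) for host, text in WORKER_INFO.items()}
reference = next(iter(parsed.values()))
# Hosts whose storage view differs from the first worker's view.
mismatched = [host for host, paths in parsed.items() if paths != reference]
print("workers with a different storage view:", mismatched)
```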

Also tried:

- miner restarts
- worker restarts
- lotus-miner sectors update-state to different states, to trigger any action on the sector
- disabling all workers and leaving ONLY one worker, followed by a miner restart
- disabling all workers and running a new one with all kinds of tasks enabled (plain lotus-worker run)

sedasdas commented 1 year ago

I have the same problem.

sedasdas commented 1 year ago

During FIN, this is the worker log: (image)

rwxr-xr-x commented 1 year ago

I ran a lot of tests with the miner and only ONE worker. Finalize always gets stuck. Before the 1.19 update everything worked fine.

sedasdas commented 1 year ago

How can this be fixed?

sedasdas commented 1 year ago

This is the miner log: (image)

sedasdas commented 1 year ago

There are no errors in the log.

hyunmoon commented 1 year ago

Finalizing directly from lotus-worker to long-term storage fails with a 500 code. It worked when I was onboarding by snapping, but it does not work when I'm onboarding by sealing.

  1. 2023-02-19 00:49:50 +0900 KST: [event;sealing.SectorFinalizeFailed] {"User":{}} finalize sector: moving sector to storage: storage call error 0: failed to acquire sector {xxxx xxxx} from remote (tried [{c3501e46-6ade-414a-81a3-3b0c569f803f [http://xxxx:1111/remote/cache/s-t0xxxx-xxxx] [http://xxxx:1111/remote] 10 true false false [] []}]): 1 error occurred:

    [name: xxxxx]: failed to acquire sector {xxxx xxxx} from remote (tried [{c3501e46-6ade-414a-81a3-3b0c569f803f [http://xxxx:1111/remote/cache/s-t0xxxx-xxxx] [http://xxxx:1111/remote] 10 true false false [] []}]): 1 error occurred:

hyunmoon commented 1 year ago

@magik6k Do you have any guess on what could have broken RemoteFinalize since v1.17.0?

hyunmoon commented 1 year ago

I restarted lotus-miner after changing DisallowRemoteFinalize to false, and I can see sectors going directly from the workers to the long-term storage. I also checked that the sectors get fully finalized to the 'Proving' state, so it is definitely working.

hyunmoon commented 1 year ago

My guess is that the meaning has changed since v1.17.0, from 'Disallow remote finalize from lotus-miner' to 'Disallow remote finalize from lotus-worker'. Setting it to false then means 'Allow finalize on lotus-worker', so it makes sense.

rwxr-xr-x commented 1 year ago

> I restarted lotus-miner after changing DisallowRemoteFinalize to false, and I can see sectors going directly from the workers to the long-term storage. I also checked that the sectors get fully finalized to the 'Proving' state, so it is definitely working.

Why did you conclude that they are going directly from the workers?

hyunmoon commented 1 year ago

I checked the traffic on the long-term storage server using iftop. Data was flowing directly from the workers.