SiaFoundation / renterd

A renter for Sia
https://sia.tech/software/renterd
MIT License

Divide by zero panic when running migrations #505

Closed mike76-dev closed 1 year ago

mike76-dev commented 1 year ago

Describe the bug

When checking my node in the morning, I found that it had crashed with a panic.


goroutine 84504 [running]:
go.sia.tech/renterd/worker.(*slabDownload).downloadSpeed(0x1400049a850?)
    .../renterd/worker/download.go:1001 +0x128
go.sia.tech/renterd/worker.(*slabDownload).downloadShards(0x140001fc690, {0x1016f5758?, 0x14004829110?}, 0x100ff934e?)
    .../renterd/worker/download.go:979 +0x434
go.sia.tech/renterd/worker.(*downloadManager).downloadSlab(0x1008d3498?, {0x1016f5758?, 0x140048290b0?}, {0x5e, 0xf7, 0xab, 0xa3, 0xe9, 0x6d, 0xaa, ...}, ...)
    .../renterd/worker/download.go:510 +0x134
go.sia.tech/renterd/worker.(*downloadManager).DownloadSlab.func1()
    .../renterd/worker/download.go:359 +0x64
created by go.sia.tech/renterd/worker.(*downloadManager).DownloadSlab
    .../renterd/worker/download.go:358 +0x328

Expected behaviour

No panic should happen.

Additional context

Looks like it started around 4am (see the logs attached) and continued through 6:30am when renterd crashed.

General Information

Are you running a fork of renterd? Yes, but that part of the code is untouched.
Are you on the mainnet or on the testnet? Mainnet.

Renterd Config

Please provide us the following information:

Autopilot Config

{
    "contracts": {
        "set": "autopilot",
        "amount": 20,
        "allowance": "1000000000000000000000000000",
        "period": 1008,
        "renewWindow": 302,
        "download": 100000000000,
        "upload": 10000000000,
        "storage": 10000000000
    },
    "hosts": {
        "allowRedundantIPs": false,
        "maxDowntimeHours": 1440,
        "scoreOverrides": null
    },
    "wallet": {
        "defragThreshold": 1000
    }
}

Bus Config

{
    "minShards": 3,
    "totalShards": 9
}

{
    "hostBlockHeightLeeway": 6,
    "maxContractPrice": "1000000000000000000000000",
    "maxDownloadPrice": "1500000000000000000000000000",
    "maxRPCPrice": "1000000000000000000000",
    "maxStoragePrice": "115740740741",
    "maxUploadPrice": "500000000000000000000000000",
    "minAccountExpiry": 86400000000000,
    "minMaxCollateral": "10000000000000000000000000",
    "minMaxEphemeralAccountBalance": "1000000000000000000000000",
    "minPriceTableValidity": 300000000000
}

Contract Set Contracts

        "id": "fcid:3517f29ced696358ea479f44857f40ddc4b2c9f497965b71bb78ba2637a79b3f",
        "id": "fcid:8b005e711e9a3ce1d8e12e276fd4388c5018cc62ac6a42c39b31d7d6f4235e3a",
        "id": "fcid:509e6cb25e64219d7241a57ab3a6bd3cdb99c675276487929401bbc1a4f4bccf",
        "id": "fcid:73436b22d032e646e1c685c23b359848ed1d383b026c9e99ab1d03a6de6246cc",
        "id": "fcid:fff0a7b60c1d8b3ca942e7aa4b809bf78c06fb47906c76073be5c84d4fdff80a",
        "id": "fcid:a5bf7c9031b458d452b78d5f81b2c0fcb66e396da6e16ed1993ae387e81ef10e",
        "id": "fcid:1434a949ff0b9667204822b683f9a63d4d70a901760c4ac7ddc46ae2c0e3afe2",
        "id": "fcid:1b0be7c60d50256ef5b3a658d8ff94b99f8e9b2f2d0e334bae0351f399549a9d",
        "id": "fcid:8f220d70dbd53ff677859dd238fe4fd3356f74f1a719f2b834ae51e0d8c7e30b",
        "id": "fcid:62e8d0c73762140bf42cc99df21a3bf9daf7d98c81ee4b4cfd00581550499b97",
        "id": "fcid:924b6b2ae1bdc47efb59716a5f0619162ca683f85464ab9a42a74fbf997925ac",
        "id": "fcid:6cd320d32c4dc36efa0296c28d5353bdcf1e245b179ee892320713a972d8ca71",
        "id": "fcid:b349dc567e27be9de2eff63b2785df85c5abbcd1c6c356ac60b78dfde7555abb",
        "id": "fcid:2297f2f5e0f86b755b56d51a561afc05bec58c44d95db99e9d7e2f69da20351c",
        "id": "fcid:3b932c6005cdd6e7796bcbeb95751191ab0cf5f9ed7ff8c0ddf812b39058cfed",
        "id": "fcid:534f52908a2711e830fbc97cb103e5d770643556179e36af30166fa84bf4f717",
        "id": "fcid:d3737b6383e2a872a7763c32322889ca5e49e006bdbd5ad2393178d799b27f25",

Renterd Logs

If applicable, upload your renterd.log file or part of the file you think is relevant to the issue.

renterd.log

mike76-dev commented 1 year ago

I found out that my Internet connection may have been down, which could well have been the cause. I think the following is related: the connection went down again a few minutes ago, the autopilot dropped the contracts because it couldn't fetch the revisions from the hosts, and the logs started being spammed in an endless loop:

2023-07-28T09:28:28+02:00   ERROR   autopilot.migrator  autopilot/migrator.go:129   worker: failed to migrate slab 3/58, health: -0.16666666666666666, err: couldn't migrate slabs: not enough hosts to repair unhealthy shard to minimum redundancy, 2<3
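
The numbers in this log line are consistent with each other under a simple assumption: if health is the count of shards above the minimum, relative to the redundancy margin, then 2 available shards with the 3-of-9 redundancy settings from the Bus Config gives (2 - 3) / (9 - 3) = -1/6, which matches the logged -0.1666... The issue does not show how renterd actually computes health, so the sketch below is only an illustration of that assumed formula; `slabHealth` is a hypothetical name.

```go
package main

import "fmt"

// slabHealth is a hypothetical helper, NOT renterd's implementation.
// Assumed formula: (available - minShards) / (totalShards - minShards).
// A negative result means fewer than minShards shards are available.
func slabHealth(available, minShards, totalShards int) float64 {
	margin := totalShards - minShards
	if margin <= 0 {
		// No redundancy margin configured; returning 0 here is an
		// arbitrary choice made only to avoid dividing by zero.
		return 0
	}
	return float64(available-minShards) / float64(margin)
}

func main() {
	// Values taken from the log line ("2<3") and the Bus Config
	// ("minShards": 3, "totalShards": 9).
	fmt.Println(slabHealth(2, 3, 9))
}
```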