filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/

lotus-miner sectors expired --remove-expired [...] shows success even if sector data isn't accessible - isn't being removed #8623

Closed f8-ptrk closed 2 weeks ago

f8-ptrk commented 2 years ago

Lotus Version

Daemon:  1.14.2+mainnet+git.6347daf84+api1.4.0
Local: lotus-miner version 1.14.2+mainnet+git.6347daf84

Describe the Bug

when sector data isn't accessible and one tries to delete it with lotus-miner sectors expired --remove-expired [...], the command shows OK even though the data is not being removed.

the sector gets deleted from the miner's DB but stays on disk

Logging Information

nothing seen in regard to that. let me know if there is something specific to look out for. we have had sectors expiring daily for a few weeks now, so we can run tests on this issue as needed.

Repo Steps

see attached file remoev-exired-ok-bug.txt

Reiers commented 2 years ago

Hi @f8-ptrk

This is a rare case - why doesn't the miner have access to the long-term storage during operations?

Please provide the miner logs and the sector logs for the ones that are "deleted/removed".

Thanks!

f8-ptrk commented 2 years ago

running the command again doesn't show the sector as expired anymore (as it has been removed from the miner's DB)

filecoin/mainnet/storage/obelix2/tank/sectors/sealed/s-t062353-625"}
{"level":"info","ts":"2022-05-10T00:49:17.831Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/sealed/s-t062353-627"}
{"level":"info","ts":"2022-05-10T00:49:17.835Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/sealed/s-t062353-628"}
{"level":"info","ts":"2022-05-10T00:49:17.840Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/sealed/s-t062353-629"}
{"level":"info","ts":"2022-05-10T00:49:17.840Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-598"}
{"level":"info","ts":"2022-05-10T00:49:17.844Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/sealed/s-t062353-630"}
{"level":"info","ts":"2022-05-10T00:49:17.849Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/sealed/s-t062353-633"}
{"level":"info","ts":"2022-05-10T00:49:17.854Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/sealed/s-t062353-635"}
{"level":"info","ts":"2022-05-10T00:49:17.859Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/sealed/s-t062353-637"}
{"level":"info","ts":"2022-05-10T00:49:17.864Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/sealed/s-t062353-639"}
{"level":"info","ts":"2022-05-10T00:49:17.894Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-535"}
{"level":"error","ts":"2022-05-10T00:49:17.895Z","logger":"stores","caller":"stores/local.go:642","msg":"removing sector ({62353 535}) from /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-535: fstatat /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-535/sc-02-data-tree-r-last-1.dat: permission denied"}
{"level":"info","ts":"2022-05-10T00:49:17.992Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-610"}
{"level":"info","ts":"2022-05-10T00:49:18.019Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-600"}
{"level":"info","ts":"2022-05-10T00:49:18.174Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-601"}
{"level":"info","ts":"2022-05-10T00:49:18.428Z","logger":"net/identify","caller":"identify/id.go:372","msg":"failed negotiate identify protocol with peer","peer":"12D3KooWEDiYJuxpugxUxzPB8Dibw9kGc6PDArbvYrijefYQCb8c","error":"stream reset"}
{"level":"info","ts":"2022-05-10T00:49:18.501Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-602"}
{"level":"info","ts":"2022-05-10T00:49:18.589Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-603"}
{"level":"info","ts":"2022-05-10T00:49:18.683Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-604"}
{"level":"info","ts":"2022-05-10T00:49:18.930Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-605"}
{"level":"info","ts":"2022-05-10T00:49:18.961Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/filecoin/mainnet/storage/obelix2/tank/sectors/cache/s-t062353-606"}
{"level":"info","ts":"2022-05-10T00:49:18.994Z","logger":"stores","caller":"stores/local.go:639","msg":"remove /opt/

if that helps - we know that access is not possible, the logs confirm that. still the command shows OK in its output
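
For anyone triaging a similar case, a small sketch for pulling the failed removals out of a miner log like the one above. It assumes one JSON object per line with the `level`, `logger`, and `msg` fields seen in the excerpt; `storeErrors` is a hypothetical helper name, not part of Lotus.

```go
package main

import (
	"bufio"
	"encoding/json"
	"strings"
)

// logEntry holds only the fields needed from each JSON log line.
type logEntry struct {
	Level  string `json:"level"`
	Logger string `json:"logger"`
	Msg    string `json:"msg"`
}

// storeErrors returns the msg field of every error-level line emitted
// by the "stores" logger. Lines that are not valid JSON (for example
// lines truncated by copy and paste) are skipped.
func storeErrors(logs string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(logs))
	// Allow lines up to 1 MiB; the default scanner limit is 64 KiB.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		var e logEntry
		if err := json.Unmarshal(sc.Bytes(), &e); err != nil {
			continue
		}
		if e.Level == "error" && e.Logger == "stores" {
			out = append(out, e.Msg)
		}
	}
	return out
}
```

Run against the excerpt above, this should return the single "removing sector ({62353 535}) ..." message with the permission denied error, which is the one removal the command still reported as OK.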

f8-ptrk commented 2 years ago

see the attached file for what exactly the miner commands show - yes, the sector disappears from the expired command's output and doesn't get listed again for removal.

i know it's a rare case. we combined 2 storage locations that got utilized by different miners and they were using different UIDs - thus the access problems

f8-ptrk commented 2 years ago

in the end the command's output is wrong. that's all. nothing critical

Reiers commented 2 years ago

Got it! Thanks for flagging (and saving me from a repro). I think this is enough information to work with. I will assign it to the right team for analysis.

f8-ptrk commented 2 years ago

i can repro that for the next 7 weeks if needed, as we have a "dying" miner under control right now that has around 32-48 sectors expiring every day

just lemme know