filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/

Task scheduler ignores multiple AP workers when doing Snap Deals #8919

Open rwxr-xr-x opened 2 years ago

rwxr-xr-x commented 2 years ago

Checklist

Lotus component

Lotus Version

Daemon:  1.16.0+mainnet+git.01254ab32+api1.5.0
Local: lotus version 1.16.0+mainnet+git.01254ab32

Daemon:  1.16.0+mainnet+git.01254ab32+api1.5.0
Local: lotus-miner version 1.16.0+mainnet+git.01254ab3

Describe the Bug

Even if Lotus has more than one AP worker, it ignores the others and runs all AP tasks on a single worker (when doing Snap Deals).

It looks like the StorageDealStaged state doesn't trigger the task scheduler. Only after one AP task has finished does the scheduler start looking for another deal that is ready for AP, and it then assigns that deal to the same worker (which became free a moment ago).

Issue is quite similar to https://github.com/filecoin-project/lotus/issues/8913

Logging Information

no logs

magik6k commented 2 years ago

Do you have more than one sector worth of deals waiting to be added? If the sealing pipeline has multiple deals which can fit into one sector, it will pack them into one sector - but that needs to happen sequentially. (this avoids creating lots of mostly empty sectors when you get a burst of small deals)

The other thing which limits how many AddPiece operations will run in parallel is the MaxWaitDealsSectors config - what value do you have it set to?
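To make the two limits above concrete, here is a minimal toy model in Go. It is not Lotus code: the `packer` and `sector` types, the `add` method, the 32 GiB sector size, and the rule that the fullest open sector is closed once the limit is hit are all assumptions made for illustration. The `maxOpen` field only plays the role of `MaxWaitDealsSectors`.

```go
package main

import "fmt"

// sector is a hypothetical stand-in for a sector that is waiting for deals.
type sector struct {
	id   int
	used uint64
}

// packer is a toy model of deal packing, not the Lotus sealing pipeline.
// maxOpen plays the role of MaxWaitDealsSectors (0 means no limit).
type packer struct {
	sectorSize uint64
	maxOpen    int
	open       []*sector
	sealed     []int // IDs of sectors that were closed to make room
	nextID     int
}

// add places one deal and returns the ID of the sector it landed in,
// or -1 if the deal is larger than a sector.
func (p *packer) add(deal uint64) int {
	if deal > p.sectorSize {
		return -1
	}
	// Prefer an already open sector with enough free space.
	for _, s := range p.open {
		if s.used+deal <= p.sectorSize {
			s.used += deal
			return s.id
		}
	}
	// No open sector fits. If the limit is reached, close the fullest one
	// (assumed policy) before opening a new sector.
	if p.maxOpen > 0 && len(p.open) >= p.maxOpen {
		fullest := 0
		for i, s := range p.open {
			if s.used > p.open[fullest].used {
				fullest = i
			}
		}
		p.sealed = append(p.sealed, p.open[fullest].id)
		p.open = append(p.open[:fullest], p.open[fullest+1:]...)
	}
	s := &sector{id: p.nextID, used: deal}
	p.nextID++
	p.open = append(p.open, s)
	return s.id
}

func main() {
	const gib = 1 << 30

	big := &packer{sectorSize: 32 * gib, maxOpen: 1}
	for i := 0; i < 3; i++ {
		fmt.Println("28 GiB deal -> sector", big.add(28*gib))
	}

	small := &packer{sectorSize: 32 * gib, maxOpen: 1}
	for i := 0; i < 3; i++ {
		fmt.Println("8 GiB deal -> sector", small.add(8*gib))
	}
}
```

In this model, three 28 GiB deals end up in sectors 0, 1 and 2, because no two of them fit into one 32 GiB sector, while three 8 GiB deals all share sector 0. With `maxOpen` set to 1, only one sector is ever open for new pieces at a time, which is the kind of serialization the questions above are probing for.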

rwxr-xr-x commented 2 years ago

All deals are 28+ GiB. I'm importing offline *.car files. The miner ask is 25+ GiB. I'm assuming one file per Snap Deal. Deals start as expected, and I see the deal Publish right after the car file is fully imported. There are no other waits.

Some options:

MaxWaitDealsSectors=1
MaxDealsPerPublishMsg=1
MaxSealingSectors=0
MaxSealingSectorsForDeals=0
MaxUpgradingSectors=0
MaxDealsPerPublishMsg = 1
WaitDealsDelay = "0h0m10s"
PublishMsgPeriod = "0h0m10s"

Currently no CC production. Only Snap Deals...

rwxr-xr-x commented 2 years ago

My Snap Deals pipeline is basically a CC pipeline with the PC1 workers converted to RU and the PC2/C2 workers converted to PR2.

When the servers ran CC production with PC1/PC2/C2 workers, AP worked fine: multiple AP tasks were spread across the workers (pledging triggers AP fine). After converting the same servers to a Snap Deals pipeline, AP gets stuck.