Closed malventano closed 2 years ago
Updated to 1.5.1 and cleared all settings, starting clean.
cache_hit: false
on a large portion of plots.chia plots check
of previously troublesome ranges takes a long time to start challenges (with its process pegged at 100.0 during the delay of several minutes per 1k plots in the selected range to check), but does eventually begin, and completes without error.Can you try to delete ~/.chia/mainnet/cache/plot_manager.dat
, and restart the harvester? This will repopulate the plot location cache, but note that on the initial discovery of all your plots, this may take a while to complete. All following startups will use the cache and should not experience the same load.
I've tried that multiple times while troubleshooting here, including during the update to 1.5.1 where I cleared the ~/.chia/mainnet
folder (only imported blockchain and blockchain_wallet db's and re-added plot_directories to config). The subsequent startup does take a bit longer, and it recreates the cache file as expected, but all other symptoms remain identical.
Regardless of time farming, or even repeated chia plots check
operations, affected plots (most of them) still show as cache_hit: False
in logs and chia_harvester
continues running at 100-110 load while farming (never below 100.0).
Re: running a chia plots check
, even if I -g
it to filter down to a single plot, after it finishes loading plots, if that plot showed as a cache hit: False
, then the check operation will hang for several minutes before issuing challenges. If I filter only to plots that return cache_hit: True
then challenges will begin immediately after the plots are loaded.
我也有类似问题,自从使用了1.5.0新版本和部分1.5 FORK,收割机的CPU都变成了%100,以前小于 %20
这不是同一台机器。这是不同的机器。
可以看到Chia和2个Fork 占用很高。这三个均为1.5.0的版本,旧的版本占用很低。
我的每台机器plot数量4000+
我的系统每次启动起来是自动删除 C:\Users\Administrator.*文件夹。不存在提到的缓存问题。
我的系统每次启动起来是自动删除 C:\Users\Administrator.*文件夹。不存在提到的缓存问题。
It could still be a caching-related issue since it would create a new cache on the next startup (and the cache is then used while the harvester runs). Either way, we won't know unless we can figure out a way to tell what those pegged harvester threads are doing.
为什么新版本占用cpu过高?
Same problem, harvester CPU 100%, Debian, GUI. This happens not only when the number of plots is over 4k. I have several hard drives with rafts and the problem appears when attaching certain rafts to the harvester. Since I mixed plotting using "Chia Proof of space v1.0", "Bladebit also an early version" and "MadMAX plotter", I have a hunch that this is due to some particular plotter, although I may be wrong.
All of my plots should be BladeBit replots (for pooling), so that shouldn't be the issue.
4078 bladebit plots, same hang when trying to check plots even with -n 5 -- waited several hours, nothing happened. broken up to directories of ~100 plots if relevant. 1.5.1.dev
@malventano Can you please provide some logs with DEBUG
log level?
Also:
still see cache_hit: false on a large portion of plots.
Not sure if i understand how you mean this. Do you see this on each refresh event for the same plots while the harvester is running or is it only after start and for random plots or whats going on there exactly?
@malventano Can you please provide some logs with
DEBUG
log level?
I poked around DEBUG level logs but nothing obvious stood out. Which activities would you like DEBUG logs for? Harvester startup with/without plot_manager.dat present? While attempting a plots check?
Not sure if i understand how you mean this. Do you see this on each refresh event for the same plots while the harvester is running or is it only after start and for random plots or whats going on there exactly?
I see these events in log during harvester start and when attempting to do a plots check. Example:
2022-09-14T17:55:08.360 chia.plotting.cache : INFO Loaded 28925466 bytes of cached data
2022-09-14T17:55:08.669 chia.plotting.check_plots : INFO event: started, loaded 0 plots, 24286 remaining
2022-09-14T17:55:08.772 chia.plotting.manager : INFO Only loading plots that contain "/mnt/e" in the file or directory name
2022-09-14T17:55:08.784 chia.plotting.check_plots : INFO event: batch_processed, loaded 0 plots, 23986 remaining
2022-09-14T17:55:08.784 chia.plotting.manager : INFO Only loading plots that contain "/mnt/e" in the file or directory name
(...)
2022-09-14T17:55:09.104 chia.plotting.check_plots : INFO event: batch_processed, loaded 0 plots, 13486 remaining
2022-09-14T17:55:09.104 chia.plotting.manager : INFO Only loading plots that contain "/mnt/e" in the file or directory name
2022-09-14T17:55:09.201 chia.plotting.manager : INFO Found plot /mnt/e01/plot-k32-2021-08-20-06-17-008935afe5dc515551913458dc6690be94f7b2976db134928a558722
52213646.plot of size 32, cache_hit: False
2022-09-14T17:55:09.220 chia.plotting.manager : INFO Found plot /mnt/e01/plot-k32-2021-08-26-19-39-592674758f32cef423d4d689d6130601ba9b4dda876addc57b6aa8d3
8851c611.plot of size 32, cache_hit: False
2022-09-14T17:55:09.225 chia.plotting.manager : INFO Found plot /mnt/e01/plot-k32-2021-08-26-20-51-f2ffde52daa51eaa90925aec44ad80280b4e6072549788b778d6eda3
5fda08bb.plot of size 32, cache_hit: False
2022-09-14T17:55:09.230 chia.plotting.manager : INFO Found plot /mnt/e01/plot-k32-2021-08-20-07-10-f0e01eb38bbbace2205dbf92e6d3bc02b06d9038c1009219363ffc65
4c472eca.plot of size 32, cache_hit: False
2022-09-14T17:55:09.247 chia.plotting.manager : INFO Found plot /mnt/e01/plot-k32-2021-08-26-18-28-2fdd05f2542dad6ab1e431e9f75634ac626356b397c3ed967f6920b2
30c419a6.plot of size 32, cache_hit: False
2022-09-14T17:55:09.277 chia.plotting.manager : INFO Found plot /mnt/e01/plot-k32-2021-08-26-10-24-97b3a3db56174d60a62083d39f4be3450f1e6ecfeeb131e51b4cc4a5
576c2078.plot of size 32, cache_hit: False
...as compared with a check against a range that was cached properly:
2022-09-14T17:52:46.974 chia.plotting.cache : INFO Loaded 3597591420 bytes of cached data
2022-09-14T17:52:58.113 chia.plotting.check_plots : INFO event: started, loaded 0 plots, 24286 remaining
2022-09-14T17:52:58.222 chia.plotting.manager : INFO Only loading plots that contain "/mnt/d27" in the file or directory name
2022-09-14T17:52:58.233 chia.plotting.check_plots : INFO event: batch_processed, loaded 0 plots, 23986 remaining
2022-09-14T17:52:58.234 chia.plotting.manager : INFO Only loading plots that contain "/mnt/d27" in the file or directory name
(...)
2022-09-14T17:52:58.344 chia.plotting.check_plots : INFO event: batch_processed, loaded 0 plots, 19786 remaining
2022-09-14T17:52:58.344 chia.plotting.manager : INFO Only loading plots that contain "/mnt/d27" in the file or directory name
2022-09-14T17:52:58.349 chia.plotting.manager : INFO Found plot /mnt/d27/plot-k32-2021-08-08-05-46-ef9240da867fd01e48d817f6e1ddf0517b703754863a302f67422d82
42edcfea.plot of size 32, cache_hit: True
2022-09-14T17:52:58.349 chia.plotting.manager : INFO Found plot /mnt/d27/plot-k32-2021-08-08-01-16-78a37f51777b07f9d5f0d395cc0c987e79ce0581d6b14fdf5e4ffd4f
6abf9d02.plot of size 32, cache_hit: True
2022-09-14T17:52:58.350 chia.plotting.manager : INFO Found plot /mnt/d27/plot-k32-2021-08-08-05-56-8c22aa8c3118e76d1a6e4bba3cecb961b6872601eb40b1e0bd4f938d
ead70072.plot of size 32, cache_hit: True
2022-09-14T17:52:58.350 chia.plotting.manager : INFO Found plot /mnt/d27/plot-k32-2021-08-08-06-16-0cdab955ad3f372d27dd261f47c38f33e6a473e63c4b12c164b367d1
99bc4d59.plot of size 32, cache_hit: True
2022-09-14T17:52:58.350 chia.plotting.manager : INFO Found plot /mnt/d27/plot-k32-2021-08-08-06-26-647c3e28d82478748889b0484455f004cc04f97d52ed776156e4197c
61c17688.plot of size 32, cache_hit: True
2022-09-14T17:52:58.350 chia.plotting.manager : INFO Found plot /mnt/d27/plot-k32-2021-08-08-06-26-978e931b10552fba6285214ae7de3142e9f43d1551e27cbd1b78d171
145df96d.plot of size 32, cache_hit: True
They do not repeat during refresh events, but I did note these other errors appearing in INFO level logs at regular intervals - could be related:
2022-09-14T09:16:30.118 harvester chia.plotting.cache : ERROR Failed to save cache: Value 5794656522 does not fit into uint32, Traceback (most recent call last):
2022-09-14T09:18:32.225 harvester chia.harvester.harvester: INFO _plot_refresh_callback: event done, loaded 0, removed 0, processed 24286, remaining 0, duration: 1.02 seconds, total plots: 24286
2022-09-14T10:58:45.556 harvester chia.plotting.cache : ERROR Failed to save cache: Value 5794656522 does not fit into uint32, Traceback (most recent call last):
2022-09-14T11:00:47.660 harvester chia.harvester.harvester: INFO _plot_refresh_callback: event done, loaded 0, removed 0, processed 24286, remaining 0, duration: 0.88 seconds, total plots: 24286
2022-09-14T12:42:38.391 harvester chia.plotting.cache : ERROR Failed to save cache: Value 5794656522 does not fit into uint32, Traceback (most recent call last):
2022-09-14T12:44:40.528 harvester chia.harvester.harvester: INFO _plot_refresh_callback: event done, loaded 0, removed 0, processed 24286, remaining 0, duration: 0.99 seconds, total plots: 24286
2022-09-14T14:25:30.126 harvester chia.plotting.cache : ERROR Failed to save cache: Value 5794656522 does not fit into uint32, Traceback (most recent call last):
2022-09-14T14:27:32.312 harvester chia.harvester.harvester: INFO _plot_refresh_callback: event done, loaded 0, removed 0, processed 24286, remaining 0, duration: 0.98 seconds, total plots: 24286
2022-09-14T09:16:30.118 harvester chia.plotting.cache : ERROR Failed to save cache: Value 5794656522 does not fit into uint32, Traceback (most recent call last):
Can you also post the traceback which comes after this line ^
Can you also post the traceback which comes after this line ^
Oh, sorry, I was tailing into a grep for harvester for that. Here's the most recent one in my log:
2022-09-14T17:55:46.871 harvester chia.plotting.cache : ERROR Failed to save cache: Value 5794656522 does not fit into uint32, Traceback (most recent call last):
File "/root/chia-blockchain/chia/plotting/cache.py", line 121, in save
serialized: bytes = bytes(disk_cache)
File "/root/chia-blockchain/chia/util/streamable.py", line 616, in __bytes__
self.stream(f)
File "/root/chia-blockchain/chia/util/streamable.py", line 602, in stream
field.stream_function(getattr(self, field.name), f)
File "/root/chia-blockchain/chia/util/streamable.py", line 434, in stream_bytes
write_uint32(f, uint32(len(item)))
File "/root/chia-blockchain/chia/util/struct_stream.py", line 67, in __init__
raise ValueError(f"Value {self} does not fit into {type(self).__name__}")
ValueError: Value 5794656522 does not fit into uint32
Thanks, this helps 👍
Thanks, this helps 👍
Of course. Just remember I'm not sure if that error is related to the CPU use issue, but here's hoping! :)
Can you post the full debug.log either here or send it to me or on keybase http://keybase.io/dustinface?
Okay so.. turned out that the reason for all this are plots created via bladebit RAM plotter where the DiskProver
serializes into 524.659 bytes which:
uint32
-> Value 5794656522 does not fit into uint32
while we seralize the length of the bytes.The reason why the DiskProver
serializes into such a huge blob is that those plots seem to have 65.536 C2
entries.
Table pointers from a plot in question with table_begin_pointers[10] - table_begin_pointers[9]
-> 262.144:
table_begin_pointers = {std::vector<unsigned long long>} size=11
[0] = {unsigned long long} 0
[1] = {unsigned long long} 262144
[2] = {unsigned long long} 14839185408
[3] = {unsigned long long} 28822208512
[4] = {unsigned long long} 42911924224
[5] = {unsigned long long} 57272958976
[6] = {unsigned long long} 72367734784
[7] = {unsigned long long} 89824165888
[8] = {unsigned long long} 107538284544
[9] = {unsigned long long} 107540119552
[10] = {unsigned long long} 107540381696
Table pointers from a normally working plot with table_begin_pointers[10] - table_begin_pointers[9]
-> 176:
table_begin_pointers = {std::vector<unsigned long long>} size=11
[0] = {unsigned long long} 0
[1] = {unsigned long long} 252
[2] = {unsigned long long} 14839436976
[3] = {unsigned long long} 28822365051
[4] = {unsigned long long} 42911861451
[5] = {unsigned long long} 57273202401
[6] = {unsigned long long} 72368924901
[7] = {unsigned long long} 89827257426
[8] = {unsigned long long} 107543532882
[9] = {unsigned long long} 107545250830
[10] = {unsigned long long} 107545251006
Im going to talk with @harold-b about this and will post an update once we figured this out.
thank you! cant wait to test!
What happened?
Noted that for the last few releases, chia_harvester was pegging a thread continuously while farming.
Info:
Elaboration:
cache_hit: True
tocache_hit: False
.chia plots check
on any of the drives/plots withcache_hit: False
results in an indefinite hang of that check before it issues a single challenge.cache_hit: False
causes high harvester CPU usage and inability to check those plots, they are still successfully farming.Possible causes:
Version
1.5.0
What platform are you using?
Linux
What ui mode are you using?
CLI
Relevant log output
No response