Open Fmstrat opened 3 years ago
Well, that proved easy. The PR I submitted is working for me
Lines in the log doesn't strike me as a particularly interesting measure of completion. Is there a useful relationship between that and time through the plot? Even presuming that plots are progressing at a consistent rate from one to the next.
@altendky The timing does seem mostly linear and in testing fairly accurate. I'm sure this is why the Chia team chose this method in the official chia-blockchain-gui
Interesting. I just did a quick glance and it seems like my jobs tend to spend nearly half their time in phase 1 which is only 800 of 2600 or so lines. It seems like this would depend on the thread count specified. But, perhaps it is enough to be a useful addition.
My jobs generally spend half their time in phase 1 too.
About this approach in general: what happens when the drive is full or disappears and it just keeps writing "retrying" lines? Won't it eventually incorrectly approach 100%? I would be quite dismissive of this idea except for the fact that the official gui is apparently doing it....
I think it is reasonable to get a percentage this way, using the phases:
in job.py:
class Phase:
...
def get_percentage(self):
if self.major == 1:
return (self.minor - 1) / 7 * 0.45
elif self.major == 2:
return (self.minor - 1) / 7 * 0.15 + 0.45
elif self.major == 3:
return (self.minor - 1) / 6 * 0.35 + 0.60
elif self.major == 4:
return 0.95
return 0
example log file from my computer, many lines are skipped for brevity:
2021-05-16T02:53:01.697 chia.plotting.create_plots : [32mINFO [0m Creating 1 plots of size 32, pool public key: aa6a801dfb2df9306dde50f1fad5d71e05d2b095c2b7642a20658fdb9ceca379c3059b9022144aa31a49f108139ad207 farmer public key: af0b5de5e0810acb0a83d5caadf9626ae1c72a7491b1e1a7870a250f7fa0bc5e4ff7a2c7dd8edd58c7d80283468b5616[0m
2021-05-16T02:53:01.706 chia.plotting.create_plots : [32mINFO [0m Memo: aa6a801dfb2df9306dde50f1fad5d71e05d2b095c2b7642a20658fdb9ceca379c3059b9022144aa31a49f108139ad207af0b5de5e0810acb0a83d5caadf9626ae1c72a7491b1e1a7870a250f7fa0bc5e4ff7a2c7dd8edd58c7d80283468b56165c325f765816a6b1f17e95614a6c547d1decefeb2a113d27c0ae7c7420a0e016[0m
2021-05-16T02:53:01.706 chia.plotting.create_plots : [32mINFO [0m Starting plot 1/1[0m
Starting plotting progress into temporary dirs: E:\Chia\Plots\Temp and E:\Chia\Plots\Temp
ID: 2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1
Plot size is: 32
Buffer size is: 4608MiB
Using 128 buckets
Using 2 threads of stripe size 65536
Starting phase 1/4: Forward Propagation into tmp files... Sun May 16 02:53:01 2021
Computing table 1
F1 complete, time: 287.472 seconds. CPU (68.05%) Sun May 16 02:57:49 2021
Computing table 2
...
Forward propagation table time: 1905.169 seconds. CPU (119.920%) Sun May 16 03:29:35 2021
Computing table 3
...
Forward propagation table time: 2567.561 seconds. CPU (99.850%) Sun May 16 04:12:22 2021
Computing table 4
...
Forward propagation table time: 3078.870 seconds. CPU (98.130%) Sun May 16 05:03:41 2021
Computing table 5
...
Forward propagation table time: 2923.353 seconds. CPU (100.560%) Sun May 16 05:52:25 2021
Computing table 6
...
Forward propagation table time: 2492.062 seconds. CPU (111.940%) Sun May 16 06:33:57 2021
Computing table 7
...
Forward propagation table time: 2043.967 seconds. CPU (118.820%) Sun May 16 07:08:01 2021
Time for phase 1 = 15299.114 seconds. CPU (106.040%) Sun May 16 07:08:01 2021
Starting phase 2/4: Backpropagation into tmp files... Sun May 16 07:08:01 2021
Backpropagating on table 7
scanned table 7
scanned time = 321.256 seconds. CPU (9.330%) Sun May 16 07:13:23 2021
sorting table 7
Backpropagating on table 6
scanned table 6
scanned time = 208.608 seconds. CPU (62.060%) Sun May 16 07:30:40 2021
sorting table 6
sort time = 793.295 seconds. CPU (63.270%) Sun May 16 07:43:54 2021
Backpropagating on table 5
scanned table 5
scanned time = 243.433 seconds. CPU (51.740%) Sun May 16 07:47:57 2021
sorting table 5
sort time = 779.925 seconds. CPU (62.280%) Sun May 16 08:00:57 2021
Backpropagating on table 4
scanned table 4
scanned time = 256.417 seconds. CPU (48.990%) Sun May 16 08:05:14 2021
sorting table 4
sort time = 763.123 seconds. CPU (62.100%) Sun May 16 08:17:57 2021
Backpropagating on table 3
scanned table 3
scanned time = 253.902 seconds. CPU (49.090%) Sun May 16 08:22:12 2021
sorting table 3
sort time = 773.777 seconds. CPU (61.430%) Sun May 16 08:35:06 2021
Backpropagating on table 2
scanned table 2
scanned time = 235.054 seconds. CPU (52.960%) Sun May 16 08:39:01 2021
sorting table 2
sort time = 750.129 seconds. CPU (62.420%) Sun May 16 08:51:31 2021
table 1 new size: 3425200882
Time for phase 2 = 6209.960 seconds. CPU (52.060%) Sun May 16 08:51:32 2021
Wrote: 268
Starting phase 3/4: Compression from tmp files into "E:\\Chia\\Plots\\Temp\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp" ... Sun May 16 08:51:32 2021
Compressing tables 1 and 2
...
Total compress table time: 2007.046 seconds. CPU (58.210%) Sun May 16 09:24:59 2021
Compressing tables 2 and 3
...
Total compress table time: 2178.968 seconds. CPU (62.550%) Sun May 16 10:01:18 2021
Compressing tables 3 and 4
...
Total compress table time: 2147.227 seconds. CPU (63.660%) Sun May 16 10:37:05 2021
Compressing tables 4 and 5
...
Total compress table time: 2174.614 seconds. CPU (63.850%) Sun May 16 11:13:20 2021
Compressing tables 5 and 6
Total compress table time: 2240.648 seconds. CPU (64.680%) Sun May 16 11:50:41 2021
Compressing tables 6 and 7
Total compress table time: 2357.186 seconds. CPU (49.820%) Sun May 16 12:29:58 2021
Time for phase 3 = 13106.198 seconds. CPU (60.360%) Sun May 16 12:29:59 2021
Starting phase 4/4: Write Checkpoint tables into "E:\\Chia\\Plots\\Temp\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp" ... Sun May 16 12:29:59 2021
Starting to write C1 and C3 tables
...
Finished writing C1 and C3 tables
Writing C2 table
Finished writing C2 table
Final table pointers:
P1: 0x10c
P2: 0x374851cd2
P3: 0x6b5fa750b
P4: 0x9fdca084e
P5: 0xd55d00a36
P6: 0x10d99a2b4b
P7: 0x14ea3a8cda
C1: 0x190a3aefda
C2: 0x190a5526c6
C3: 0x190a552776
Time for phase 4 = 1260.113 seconds. CPU (48.890%) Sun May 16 12:50:59 2021
Approximate working space used (without final file): 269.355 GiB
Final File size: 101.361 GiB
Total time = 35875.386 seconds. CPU (78.000%) Sun May 16 12:50:59 2021
Copied final file from "E:\\Chia\\Plots\\Temp\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp" to "G:\\dst\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp"
Copy time = 950.196 seconds. CPU (8.080%) Sun May 16 13:06:50 2021
Removed temp2 file "E:\\Chia\\Plots\\Temp\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp"? 1
Renamed final file from "G:\\dst\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp" to "G:\\dst\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot"
2021-05-16T13:06:50.491 chia.plotting.create_plots : [32mINFO [0m Summary:[0m
2021-05-16T13:06:50.491 chia.plotting.create_plots : [32mINFO [0m Created a total of 1 new plots[0m
2021-05-16T13:06:50.491 chia.plotting.create_plots : [32mINFO [0m plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot[0m
We can see that each minor phase in the same major phase will take approximately the same time, with only one exception, phase 1 table 1 takes only a fraction of time of other tables.
phase | name | time | percent | characteristic |
---|---|---|---|---|
1 | Forward propagation | 15299 | 43% | cpu bound |
2 | Backpropagating | 6209 | 17% | cpu bound (mostly) |
3 | Compression | 13106 | 37% | cpu bound |
4 | Write Checkpoint tables | 1260 | 4% | io bound ( reading from disk ) |
total | 35875 | 101% |
Note:
Progress is a hard thing. The only way to get it correct here, even in a steady state situation on a single computer, is to do detailed log timing analysis and use that as the reference. Remember, you have to account for the threads in stage one. I run a single thread on one system and ran 4 on another for awhile. Either that is a silly thing to do or it has a relevant effect on this problem.
I can't say I understand how this brings significant value, especially given the associated complexity. What is it that people will do with this information that they can't do similarly with the major:minor phase status? Is it mostly an interest in more fine grained status that updates more often? We could have major:minor:micro phases based on the buckets or whatever is the next layer in each phase.
Don't get me wrong, I have a basic interest in percentage progress feedback as well. I just don't understand how it helps beyond mental satisfaction or how we can make it not a big lie and also not a major investment of time.
I guess we can only make the percentage as close to linear relationship with time as we can get. Of course it is imposible. I guess showing a percentage is much easier to understand for people who don't know what phases mean and just want to know the "progess". I think just use a simple estimation is good enough for now.
I added it in my own fork of plotman, so people who use it don't have to learn what phases mean to learn progress of jobs: | # | plot id | k | tmp | dst | wall | phase | tmp | pid | stat | mem | user | sys | io |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ad4d9060 | 32 | E:\Chia\Plots\Temp | G:\dst | 2:47 | 1:3 13% | 125G | 15300 | RUN | 5.1G | 1:14 | 0:02 | - | |
1 | 65902766 | 32 | E:\Chia\Plots\Temp | G:\dst | 5:33 | 1:6 32% | 177G | 9832 | RUN | 5.1G | 3:30 | 0:06 | - |
The Chia GUI app estimates percentages based on the number of lines in the log files. This has been detailed here: https://www.reddit.com/r/chia/comments/mx3iy1/gui_plotting_and_percent_done/
For a
u=128
plot for instance, it totals out at 2626 lines in the log file.With that in mind, you can calculate out the percent progress in bash:
Since it's python, the code to mirror this could be pulled from the GUI: https://github.com/Chia-Network/chia-blockchain-gui/blob/main/src/modules/plotQueue.ts#L11
Specifically (with
log
being the text of the log):Could we get this same calculation into
interactive
andstatus
modes?