avishniakov closed this pull request 1 month ago
@htahir1 , FYI - this is the GCP fix we need
The recent updates enhance ZenML's logging capabilities by shifting from file-based to folder-based log handling. Key changes include the introduction of functions and properties to manage log folders, refactoring of existing methods, and new tests to verify logging behavior. This ensures more efficient log retrieval and storage within ZenML pipelines.
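The folder-based approach can be sketched roughly like this (a minimal illustration; the function names are hypothetical, not ZenML's actual API): every flush writes a new, immutable file into the step's log folder, and fetching reassembles the full log by merging the files in timestamp order.

```python
import os
import time


def save_log_chunk(logs_folder: str, buffer: list) -> None:
    """Persist the buffered lines as a new, immutable file in the folder."""
    os.makedirs(logs_folder, exist_ok=True)
    # Nanosecond timestamps keep chunk files sortable in write order.
    file_name = f"{time.time_ns()}.log"
    with open(os.path.join(logs_folder, file_name), "w") as f:
        f.write("".join(buffer))


def merge_log_files(logs_folder: str) -> str:
    """Reassemble the full log by concatenating chunks in name order."""
    parts = []
    for file_name in sorted(os.listdir(logs_folder)):
        with open(os.path.join(logs_folder, file_name)) as f:
            parts.append(f.read())
    return "".join(parts)
```

Because each chunk is written exactly once and never appended to, this pattern works on stores with immutable objects such as GCS.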
| File(s) | Summary |
|---|---|
| `src/zenml/logging/step_logging.py` | Added `TemporaryDirectory` import, refactored log handling to use folders, introduced new functions and constants. |
| `src/zenml/orchestrators/step_launcher.py` | Renamed `prepare_logs_uri` to `prepare_logs_folder_uri` in the `launch` method. |
| `src/zenml/zen_server/routers/steps_endpoints.py` | Refactored `get_step_logs` to use `fetch_logs` for log retrieval. |
| `tests/integration/functional/steps/test_logging.py` | Added new functions and tests for verifying log merging and deletion behavior. |
| `tests/integration/functional/zen_stores/test_zen_store.py` | Replaced artifact store functions with `fetch_logs` and `prepare_logs_folder_uri` in tests. |
```mermaid
sequenceDiagram
    participant User
    participant StepLogsStorage
    participant ArtifactStore
    participant StepLauncher
    User->>StepLauncher: launch()
    StepLauncher->>StepLogsStorage: prepare_logs_folder_uri()
    StepLogsStorage->>ArtifactStore: create_log_folder()
    StepLauncher->>StepLogsStorage: write()
    StepLogsStorage->>ArtifactStore: save_logs()
    User->>StepLogsStorage: fetch_logs()
    StepLogsStorage->>ArtifactStore: retrieve_logs()
    ArtifactStore->>User: return logs
```
In folders now our logs reside,
Where once in files they used to hide.
With merging, fetching, all anew,
ZenML logs, more clear to view.
A rabbit's joy in code so neat,
Efficiency, a coder's treat!
@avishniakov Thanks!! This is an interesting fix. Can you maybe try to run a pipeline with some sort of tqdm progress bar (any keras example with .fit() will do)... this will also tell us if this branch fixes that problem?
Also, for a standard pipeline, can you make a table of how performance degrades over log size and perhaps we can leave it here for future reference?
The Keras logs look like this. Some TQDM events are lost, but I wouldn't worry much about that. The runtime is 23.67s in pipeline mode versus 16.10s when just calling the function directly; given that the pipeline was using the GCP artifact store, that's quite good, IMO.
Caching disabled explicitly for run_mnist.
Step run_mnist has started.
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Layer (type)                   ┃ Output Shape         ┃   Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ conv2d (Conv2D)                │ (None, 26, 26, 32)   │       320 │
├────────────────────────────────┼──────────────────────┼───────────┤
│ max_pooling2d (MaxPooling2D)   │ (None, 13, 13, 32)   │         0 │
├────────────────────────────────┼──────────────────────┼───────────┤
│ conv2d_1 (Conv2D)              │ (None, 11, 11, 64)   │    18,496 │
├────────────────────────────────┼──────────────────────┼───────────┤
│ max_pooling2d_1 (MaxPooling2D) │ (None, 5, 5, 64)     │         0 │
├────────────────────────────────┼──────────────────────┼───────────┤
│ flatten (Flatten)              │ (None, 1600)         │         0 │
├────────────────────────────────┼──────────────────────┼───────────┤
│ dropout (Dropout)              │ (None, 1600)         │         0 │
├────────────────────────────────┼──────────────────────┼───────────┤
│ dense (Dense)                  │ (None, 10)           │    16,010 │
└────────────────────────────────┴──────────────────────┴───────────┘
Total params: 34,826 (136.04 KB)
Trainable params: 34,826 (136.04 KB)
Non-trainable params: 0 (0.00 B)
Epoch 1/2
1/422 ββββββββββββββββββββ 5:24 772ms/step - accuracy: 0.0859 - loss: 2.3276
2/422 ββββββββββββββββββββ 23s 56ms/step - accuracy: 0.0859 - loss: 2.3292
4/422 ββββββββββββββββββββ 16s 39ms/step - accuracy: 0.1076 - loss: 2.3122
6/422 ββββββββββββββββββββ 15s 38ms/step - accuracy: 0.1235 - loss: 2.2968
9/422 ββββββββββββββββββββ 13s 32ms/step - accuracy: 0.1429 - loss: 2.2777
11/422 ββββββββββββββββββββ 12s 31ms/step - accuracy: 0.1557 - loss: 2.2646
13/422 ββββββββββββββββββββ 12s 32ms/step - accuracy: 0.1674 - loss: 2.2512
15/422 ββββββββββββββββββββ 12s 32ms/step - accuracy: 0.1774 - loss: 2.2381
19/422 ββββββββββββββββββββ 11s 28ms/step - accuracy: 0.2000 - loss: 2.2089
23/422 ββββββββββββββββββββ 10s 25ms/step - accuracy: 0.2256 - loss: 2.1744
27/422 ββββββββββββββββββββ 9s 24ms/step - accuracy: 0.2504 - loss: 2.1360
31/422 ββββββββββββββββββββ 8s 23ms/step - accuracy: 0.2746 - loss: 2.0941
35/422 ββββββββββββββββββββ 8s 22ms/step - accuracy: 0.2970 - loss: 2.0502
39/422 ββββββββββββββββββββ 7s 21ms/step - accuracy: 0.3179 - loss: 2.0051
43/422 ββββββββββββββββββββ 7s 20ms/step - accuracy: 0.3372 - loss: 1.9606
46/422 ββββββββββββββββββββ 7s 20ms/step - accuracy: 0.3508 - loss: 1.9283
50/422 ββββββββββββββββββββ 7s 20ms/step - accuracy: 0.3677 - loss: 1.8864
54/422 ββββββββββββββββββββ 7s 19ms/step - accuracy: 0.3836 - loss: 1.8460
57/422 ββββββββββββββββββββ 7s 20ms/step - accuracy: 0.3947 - loss: 1.8172
60/422 ββββββββββββββββββββ 7s 20ms/step - accuracy: 0.4054 - loss: 1.7894
63/422 ββββββββββββββββββββ 7s 20ms/step - accuracy: 0.4156 - loss: 1.7625
67/422 ββββββββββββββββββββ 6s 19ms/step - accuracy: 0.4287 - loss: 1.7278
71/422 ββββββββββββββββββββ 6s 19ms/step - accuracy: 0.4409 - loss: 1.6947
75/422 ββββββββββββββββββββ 6s 19ms/step - accuracy: 0.4525 - loss: 1.6631
79/422 ββββββββββββββββββββ 6s 18ms/step - accuracy: 0.4634 - loss: 1.6330
83/422 ββββββββββββββββββββ 6s 18ms/step - accuracy: 0.4738 - loss: 1.6042
86/422 ββββββββββββββββββββ 6s 18ms/step - accuracy: 0.4812 - loss: 1.5834
90/422 ββββββββββββββββββββ 5s 18ms/step - accuracy: 0.4907 - loss: 1.5567
94/422 ββββββββββββββββββββ 5s 18ms/step - accuracy: 0.4998 - loss: 1.5312
98/422 ββββββββββββββββββββ 5s 18ms/step - accuracy: 0.5084 - loss: 1.5067
102/422 ββββββββββββββββββββ 5s 18ms/step - accuracy: 0.5166 - loss: 1.4831
106/422 ββββββββββββββββββββ 5s 17ms/step - accuracy: 0.5244 - loss: 1.4606
110/422 ββββββββββββββββββββ 5s 17ms/step - accuracy: 0.5319 - loss: 1.4389
114/422 ββββββββββββββββββββ 5s 17ms/step - accuracy: 0.5391 - loss: 1.4182
118/422 ββββββββββββββββββββ 5s 17ms/step - accuracy: 0.5460 - loss: 1.3982
122/422 ββββββββββββββββββββ 5s 17ms/step - accuracy: 0.5526 - loss: 1.3790
126/422 ββββββββββββββββββββ 4s 17ms/step - accuracy: 0.5589 - loss: 1.3605
130/422 ββββββββββββββββββββ 4s 17ms/step - accuracy: 0.5651 - loss: 1.3427
134/422 ββββββββββββββββββββ 4s 17ms/step - accuracy: 0.5710 - loss: 1.3254
138/422 ββββββββββββββββββββ 4s 17ms/step - accuracy: 0.5766 - loss: 1.3087
142/422 ββββββββββββββββββββ 4s 17ms/step - accuracy: 0.5821 - loss: 1.2926
146/422 ββββββββββββββββββββ 4s 16ms/step - accuracy: 0.5874 - loss: 1.2769
150/422 ββββββββββββββββββββ 4s 16ms/step - accuracy: 0.5926 - loss: 1.2618
154/422 ββββββββββββββββββββ 4s 16ms/step - accuracy: 0.5975 - loss: 1.2471
158/422 ββββββββββββββββββββ 4s 16ms/step - accuracy: 0.6023 - loss: 1.2329
162/422 ββββββββββββββββββββ 4s 16ms/step - accuracy: 0.6070 - loss: 1.2191
166/422 ββββββββββββββββββββ 4s 16ms/step - accuracy: 0.6115 - loss: 1.2057
170/422 ββββββββββββββββββββ 4s 16ms/step - accuracy: 0.6159 - loss: 1.1928
174/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6201 - loss: 1.1802
178/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6242 - loss: 1.1680
182/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6282 - loss: 1.1560
186/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6321 - loss: 1.1445
190/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6359 - loss: 1.1332
194/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6395 - loss: 1.1222
198/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6431 - loss: 1.1116
201/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6457 - loss: 1.1038
204/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6483 - loss: 1.0961
207/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6508 - loss: 1.0886
211/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6541 - loss: 1.0789
215/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6573 - loss: 1.0693
219/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6604 - loss: 1.0600
223/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6635 - loss: 1.0509
227/422 ββββββββββββββββββββ 3s 16ms/step - accuracy: 0.6664 - loss: 1.0419
231/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.6693 - loss: 1.0332
235/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.6722 - loss: 1.0247
239/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.6749 - loss: 1.0164
243/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.6776 - loss: 1.0083
247/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.6803 - loss: 1.0003
251/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.6829 - loss: 0.9926
255/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.6854 - loss: 0.9849
259/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.6879 - loss: 0.9774
263/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.6903 - loss: 0.9701
267/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.6927 - loss: 0.9629
271/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.6950 - loss: 0.9559
275/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.6973 - loss: 0.9490
278/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.6990 - loss: 0.9439
282/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.7012 - loss: 0.9373
286/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.7033 - loss: 0.9307
290/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.7054 - loss: 0.9243
294/422 ββββββββββββββββββββ 1s 15ms/step - accuracy: 0.7075 - loss: 0.9181
298/422 ββββββββββββββββββββ 1s 15ms/step - accuracy: 0.7095 - loss: 0.9119
302/422 ββββββββββββββββββββ 1s 15ms/step - accuracy: 0.7115 - loss: 0.9058
306/422 ββββββββββββββββββββ 1s 15ms/step - accuracy: 0.7135 - loss: 0.8999
307/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7140 - loss: 0.8984
310/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7154 - loss: 0.8941
314/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7173 - loss: 0.8884
318/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7191 - loss: 0.8828
322/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7209 - loss: 0.8772
326/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7227 - loss: 0.8718
330/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7245 - loss: 0.8665
334/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7262 - loss: 0.8612
338/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7279 - loss: 0.8561
342/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7296 - loss: 0.8510
346/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7312 - loss: 0.8460
350/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7328 - loss: 0.8411
354/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7344 - loss: 0.8363
358/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.7359 - loss: 0.8315
362/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7375 - loss: 0.8269
365/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7386 - loss: 0.8234
368/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7397 - loss: 0.8200
372/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7412 - loss: 0.8155
376/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7426 - loss: 0.8111
380/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7440 - loss: 0.8067
384/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7454 - loss: 0.8024
388/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7468 - loss: 0.7982
392/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7482 - loss: 0.7940
395/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7492 - loss: 0.7909
398/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7502 - loss: 0.7879
402/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7515 - loss: 0.7839
405/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7525 - loss: 0.7809
409/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7538 - loss: 0.7770
413/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7550 - loss: 0.7731
417/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7563 - loss: 0.7693
421/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.7575 - loss: 0.7655
422/422 ββββββββββββββββββββ 8s 16ms/step - accuracy: 0.7581 - loss: 0.7637 - val_accuracy: 0.9757 - val_loss: 0.0904
Epoch 2/2
1/422 ββββββββββββββββββββ 11s 27ms/step - accuracy: 0.9766 - loss: 0.0878
5/422 ββββββββββββββββββββ 6s 15ms/step - accuracy: 0.9614 - loss: 0.1319
9/422 ββββββββββββββββββββ 6s 15ms/step - accuracy: 0.9581 - loss: 0.1412
13/422 ββββββββββββββββββββ 6s 15ms/step - accuracy: 0.9563 - loss: 0.1433
17/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9559 - loss: 0.1425
21/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9559 - loss: 0.1417
25/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9560 - loss: 0.1400
29/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9564 - loss: 0.1383
33/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9566 - loss: 0.1372
37/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9567 - loss: 0.1363
41/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9569 - loss: 0.1355
45/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9571 - loss: 0.1349
49/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9573 - loss: 0.1347
53/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9573 - loss: 0.1346
56/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9574 - loss: 0.1344
60/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9575 - loss: 0.1342
64/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9576 - loss: 0.1340
68/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9576 - loss: 0.1339
71/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9577 - loss: 0.1339
75/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9577 - loss: 0.1338
77/422 ββββββββββββββββββββ 5s 15ms/step - accuracy: 0.9578 - loss: 0.1338
79/422 ββββββββββββββββββββ 5s 16ms/step - accuracy: 0.9578 - loss: 0.1337
83/422 ββββββββββββββββββββ 5s 16ms/step - accuracy: 0.9578 - loss: 0.1337
87/422 ββββββββββββββββββββ 5s 16ms/step - accuracy: 0.9579 - loss: 0.1336
91/422 ββββββββββββββββββββ 5s 16ms/step - accuracy: 0.9579 - loss: 0.1334
94/422 ββββββββββββββββββββ 5s 16ms/step - accuracy: 0.9580 - loss: 0.1333
98/422 ββββββββββββββββββββ 5s 16ms/step - accuracy: 0.9580 - loss: 0.1333
102/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9581 - loss: 0.1332
106/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9581 - loss: 0.1331
110/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9582 - loss: 0.1330
114/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9582 - loss: 0.1328
118/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9583 - loss: 0.1327
122/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9583 - loss: 0.1326
126/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9584 - loss: 0.1324
130/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9585 - loss: 0.1323
134/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9585 - loss: 0.1322
138/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9586 - loss: 0.1320
142/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9587 - loss: 0.1319
146/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9587 - loss: 0.1317
149/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9588 - loss: 0.1316
152/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9588 - loss: 0.1315
156/422 ββββββββββββββββββββ 4s 15ms/step - accuracy: 0.9588 - loss: 0.1314
160/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9589 - loss: 0.1313
164/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9589 - loss: 0.1312
168/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9590 - loss: 0.1311
172/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9590 - loss: 0.1310
176/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9591 - loss: 0.1308
180/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9591 - loss: 0.1307
184/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9591 - loss: 0.1306
188/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9592 - loss: 0.1305
192/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9592 - loss: 0.1305
196/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9592 - loss: 0.1304
200/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9593 - loss: 0.1303
204/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9593 - loss: 0.1302
206/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9593 - loss: 0.1302
210/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9593 - loss: 0.1301
214/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9594 - loss: 0.1300
218/422 ββββββββββββββββββββ 3s 15ms/step - accuracy: 0.9594 - loss: 0.1299
222/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.9594 - loss: 0.1298
226/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.9594 - loss: 0.1298
230/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.9595 - loss: 0.1297
234/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.9595 - loss: 0.1296
238/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.9595 - loss: 0.1295
242/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.9595 - loss: 0.1294
246/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.9596 - loss: 0.1293
250/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.9596 - loss: 0.1293
254/422 ββββββββββββββββββββ 2s 15ms/step - accuracy: 0.9596 - loss: 0.1292
255/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9596 - loss: 0.1292
259/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9597 - loss: 0.1291
263/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9597 - loss: 0.1290
267/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9597 - loss: 0.1289
271/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9597 - loss: 0.1288
275/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9598 - loss: 0.1287
279/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9598 - loss: 0.1287
283/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9598 - loss: 0.1286
287/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9599 - loss: 0.1285
289/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9599 - loss: 0.1284
291/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9599 - loss: 0.1284
294/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9599 - loss: 0.1283
296/422 ββββββββββββββββββββ 2s 16ms/step - accuracy: 0.9599 - loss: 0.1283
299/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9600 - loss: 0.1282
303/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9600 - loss: 0.1281
307/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9600 - loss: 0.1280
311/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9601 - loss: 0.1279
314/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9601 - loss: 0.1279
317/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9601 - loss: 0.1278
321/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9602 - loss: 0.1277
325/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9602 - loss: 0.1276
329/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9602 - loss: 0.1275
333/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9603 - loss: 0.1274
337/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9603 - loss: 0.1273
341/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9603 - loss: 0.1272
345/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9604 - loss: 0.1271
348/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9604 - loss: 0.1270
352/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9604 - loss: 0.1270
356/422 ββββββββββββββββββββ 1s 16ms/step - accuracy: 0.9605 - loss: 0.1269
360/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.9605 - loss: 0.1268
364/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.9605 - loss: 0.1267
368/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.9606 - loss: 0.1266
372/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.9606 - loss: 0.1265
376/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.9606 - loss: 0.1264
380/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.9607 - loss: 0.1263
384/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.9607 - loss: 0.1262
388/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.9607 - loss: 0.1261
392/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.9608 - loss: 0.1260
395/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.9608 - loss: 0.1259
397/422 ββββββββββββββββββββ 0s 16ms/step - accuracy: 0.9608 - loss: 0.1259
398/422 ββββββββββββββββββββ 0s 17ms/step - accuracy: 0.9608 - loss: 0.1259
399/422 ββββββββββββββββββββ 0s 17ms/step - accuracy: 0.9608 - loss: 0.1258
402/422 ββββββββββββββββββββ 0s 17ms/step - accuracy: 0.9608 - loss: 0.1258
406/422 ββββββββββββββββββββ 0s 17ms/step - accuracy: 0.9609 - loss: 0.1257
410/422 ββββββββββββββββββββ 0s 17ms/step - accuracy: 0.9609 - loss: 0.1256
414/422 ββββββββββββββββββββ 0s 17ms/step - accuracy: 0.9609 - loss: 0.1255
417/422 ββββββββββββββββββββ 0s 17ms/step - accuracy: 0.9610 - loss: 0.1254
421/422 ββββββββββββββββββββ 0s 17ms/step - accuracy: 0.9610 - loss: 0.1254
422/422 ββββββββββββββββββββ 7s 17ms/step - accuracy: 0.9610 - loss: 0.1253 - val_accuracy: 0.9848 - val_loss: 0.0589
Test loss: 0.057743266224861145
Test accuracy: 0.9812999963760376
Step run_mnist has finished in 17.550s.
Step run_mnist completed successfully.
@avishniakov Sounds good. Can we do it for more epochs even just to test it?
@htahir1 @strickvl added some numbers to the PR description, as requested. I can confirm that the TQDM issue is resolved and that overall performance for logging-intensive tasks is way better. For jobs that are not log-heavy and don't use TQDM the improvement is fairly minimal, but GCP logging is fixed from now on.
@avishniakov thats fantastic !! Love it
@coderabbitai review
@schustmi @stefannica I quite heavily revisited the fetching logic, switched to lighter ops like `size`, and added support for negative offsets. Please have a look.
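For illustration, a negative offset can be resolved against the total log size without downloading the whole file. This is just a sketch over a local file (hypothetical signature; the real implementation goes through the artifact store's filesystem API):

```python
import os


def fetch_logs(log_file: str, offset: int = 0, length: int = 1024) -> str:
    """Read `length` bytes starting at `offset`; a negative offset
    counts back from the end of the file, like Python slicing."""
    # A cheap metadata call instead of reading the full file.
    size = os.path.getsize(log_file)
    if offset < 0:
        offset = max(0, size + offset)
    with open(log_file, "rb") as f:
        f.seek(offset)
        return f.read(length).decode("utf-8", errors="replace")
```

For example, `fetch_logs(path, offset=-100, length=100)` returns the last 100 bytes, which is handy for tailing logs in the UI.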
A few tests failed during the install steps; I will run them all again on release.
Describe changes
I fixed how we store log files to work around the limitations of some filesystems where files are immutable, so appending to them does not work properly; GCS is one such filesystem. With the fix, we work with logs as follows:
Performance improvements
One issue was identified with the previous logging flow: the flush wrapper was saving to file on every call, basically dumping every single line. A flush happens on any logger call, for instance (I didn't check for prints). I changed that to respect the buffering logic we already have; on the GCP stack this shows a runtime of 1.959s for a pipeline making just 100 log calls, versus 33.216s with the code from the develop branch.
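The buffering idea, sketched with hypothetical thresholds (the class name and the actual ZenML limits differ): lines accumulate in memory, and a real write to storage only happens when the buffer grows past a line limit or a time interval elapses, so frequent `flush()` calls become cheap no-ops.

```python
import time


class BufferedLogWriter:
    """Accumulate log lines and persist only when a threshold is hit,
    instead of hitting remote storage on every flush() call."""

    def __init__(self, sink, max_lines: int = 50, max_seconds: float = 2.0):
        self.sink = sink            # callable that persists a list of lines
        self.max_lines = max_lines
        self.max_seconds = max_seconds
        self.buffer = []
        self.last_save = time.time()

    def write(self, line: str) -> None:
        self.buffer.append(line)

    def flush(self, force: bool = False) -> None:
        # Cheap no-op unless the buffer is "full enough" or stale.
        due = (
            len(self.buffer) >= self.max_lines
            or time.time() - self.last_save >= self.max_seconds
        )
        if (force or due) and self.buffer:
            self.sink(self.buffer)
            self.buffer = []
            self.last_save = time.time()
```

With a remote artifact store, turning N per-line writes into one batched write per threshold is exactly where the 33s-to-2s difference comes from, since each write is a network round trip.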
If I run

`time python3 run.py --feature-pipeline --training-pipeline --inference-pipeline --no-cache`

locally with the GCP artifact store, it takes 1:02.79 total on `develop` versus 52.673s total on this branch. So the difference is not that dramatic here, but it gets worse the more logging you do, as the synthetic example showed.

Some more numbers
For the pipeline with just one step running with GCP artifact store as follows:
For the Keras MNIST example (also connected to GCP):
For non-log-heavy payload (also with GCP):
Pre-requisites
Please ensure you have done the following: the branch is based on `develop` and the open PR is targeting `develop`. If your branch wasn't based on develop, read the Contribution guide on rebasing a branch to develop.

Types of changes
Summary by CodeRabbit

New Features

Refactor
Renamed `prepare_logs_uri` to `prepare_logs_folder_uri`.

Tests