zenml-io / zenml

ZenML πŸ™: Build portable, production-ready MLOps pipelines. https://zenml.io.
Apache License 2.0

Logging performance improvements and GCP logging fix #2755

Closed Β· avishniakov closed 1 month ago

avishniakov commented 1 month ago

Describe changes

I fixed how we store log files to overcome a limitation of some filesystems (notably GCS) where files are immutable, so appending to them does not work properly. With this fix, logs are written as a folder of immutable files instead of a single appended file, as described below.
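The idea behind the folder-based workaround can be sketched as follows. This is a minimal illustration with hypothetical names, not ZenML's actual implementation: each flush writes a brand-new timestamped file into a per-step log folder (so no append is ever needed, which also works on immutable object stores like GCS), and readers merge the chunk files back into one stream.

```python
import os
import time


def save_log_chunk(logs_folder: str, buffer: list) -> str:
    """Write the buffered log lines to a brand-new file in the folder.

    Appending is avoided entirely, so this also works on object stores
    (e.g. GCS) where existing objects are immutable.
    """
    os.makedirs(logs_folder, exist_ok=True)
    # A nanosecond timestamp keeps the chunk files lexicographically sortable.
    filename = os.path.join(logs_folder, f"{time.time_ns()}.log")
    with open(filename, "w") as f:
        f.write("".join(buffer))
    return filename


def read_logs(logs_folder: str) -> str:
    """Merge all chunk files back into a single log stream, in write order."""
    parts = []
    for name in sorted(os.listdir(logs_folder)):
        with open(os.path.join(logs_folder, name)) as f:
            parts.append(f.read())
    return "".join(parts)
```

In the real code the folder lives in the artifact store rather than on the local disk, but the write-new-file-per-flush pattern is the same.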

Performance improvements

There was one issue with the previous logging flow: the flush wrapper called save-to-file on every flush, essentially dumping the logs line by line. A flush happens on every logger call (I did not check whether prints trigger it too). I changed this to respect the buffer logic we already have; on the GCP stack, a pipeline making just 100 log calls now runs in 1.959s versus 33.216s with the code from the develop branch.

```python
from zenml import step
from zenml.logger import get_logger

logger = get_logger(__name__)


@step
def step_1() -> None:
    for i in range(50):
        logger.info(f"step 1 - {i}")
```

Running time python3 run.py --feature-pipeline --training-pipeline --inference-pipeline --no-cache locally with a GCP artifact store takes 1:02.79 total on develop versus 52.673 total on this branch. The difference is not dramatic here, but it grows with the amount of logging, as the synthetic example above shows.
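The saving described above can be sketched as a size/time-thresholded buffer (hypothetical names; ZenML's real step-logs storage differs in detail): writes accumulate in memory and are persisted only when the buffer is full or stale, instead of on every logger call.

```python
import time


class BufferedLogStorage:
    """Accumulate log lines and persist only when a threshold is hit."""

    def __init__(self, max_lines=50, max_seconds=10.0):
        self.buffer = []
        self.max_lines = max_lines
        self.max_seconds = max_seconds
        self.last_save = time.time()
        self.saves = 0  # counts how many expensive remote writes happened

    def write(self, line):
        self.buffer.append(line)
        # Save only when the buffer is full or stale -- not on every call.
        if (
            len(self.buffer) >= self.max_lines
            or time.time() - self.last_save > self.max_seconds
        ):
            self.save_to_file()

    def save_to_file(self):
        if not self.buffer:
            return
        # In the real implementation this would upload to the artifact store.
        self.saves += 1
        self.buffer.clear()
        self.last_save = time.time()


storage = BufferedLogStorage(max_lines=50)
for i in range(100):
    storage.write(f"step 1 - {i}\n")
storage.save_to_file()  # final flush at step teardown
print(storage.saves)  # 2 remote writes instead of 100
```

With a remote artifact store, each avoided save is an avoided network round trip, which is where the order-of-magnitude speedup comes from.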

Some more numbers

For the pipeline with just one step running with GCP artifact store as follows:

```python
@step(enable_cache=False)
def step_1(log_steps: int) -> None:
    for i in range(log_steps):
        logger.info(f"step 1 - {i}")
```

| Events | This branch (s) | Develop (s, vs this) |
|---|---|---|
| 50 | 1.92 | 42.28 (+2102%) |
| 100 | 1.84 | 68.77 (+3637%) |
| 500 | 3.20 | 359.59 (+11119%) |
| 1000 | 4.59 | 712.43 (+15412%) |
| 2000 | 7.83 | 1397.04 (+17742%) |
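For reference, the "(vs this)" figures are the relative slowdown of develop computed from the two timings. A quick sketch of the calculation (later rows differ by a few tens of percent from this formula, presumably because the raw timings were rounded for the table):

```python
def slowdown_pct(develop_s: float, branch_s: float) -> int:
    """Percent slowdown of develop relative to this branch (truncated)."""
    return int((develop_s / branch_s - 1) * 100)


# 50-event row: develop is ~22x slower than this branch.
print(slowdown_pct(42.28, 1.92))  # 2102
print(slowdown_pct(68.77, 1.84))  # 3637
```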

For the Keras MNIST example (also connected to GCP):

| Epochs | This branch (s) | Develop (s, vs this) | Function directly (s, vs this) |
|---|---|---|---|
| 2 | 23.67 | 607.96 (+2468%) | 16.09 (-32%) |
| 20 | 153.80 | NA | 146.39 (-5%) |

For non-log-heavy payload (also with GCP):

| Example | This branch (s) | Develop (s, vs this) |
|---|---|---|
| Quickstart (GCP artifact store) | 52.67 | 62.79 (+19%) |
| E2E (local artifact store) | 38.68 | 38.06 (-2%) |


avishniakov commented 1 month ago

@htahir1 , FYI - this is the GCP fix we need

coderabbitai[bot] commented 1 month ago

[!IMPORTANT]

Review skipped

Auto reviews are disabled on this repository.


Walkthrough

The recent updates enhance ZenML's logging capabilities by shifting from file-based to folder-based log handling. Key changes include the introduction of functions and properties to manage log folders, refactoring of existing methods, and new tests to verify logging behavior. This ensures more efficient log retrieval and storage within ZenML pipelines.

Changes

| File(s) | Summary |
|---|---|
| `src/zenml/logging/step_logging.py` | Added `TemporaryDirectory` import, refactored log handling to use folders, introduced new functions and constants. |
| `src/zenml/orchestrators/step_launcher.py` | Renamed `prepare_logs_uri` to `prepare_logs_folder_uri` in the `launch` method. |
| `src/zenml/zen_server/routers/steps_endpoints.py` | Refactored `get_step_logs` to use `fetch_logs` for log retrieval. |
| `tests/integration/functional/steps/test_logging.py` | Added new functions and tests verifying log merging and deletion behavior. |
| `tests/integration/functional/zen_stores/test_zen_store.py` | Replaced artifact store functions with `fetch_logs` and `prepare_logs_folder_uri` in tests. |

Sequence Diagram(s) (Beta)

```mermaid
sequenceDiagram
    participant User
    participant StepLogsStorage
    participant ArtifactStore
    participant StepLauncher

    User->>StepLauncher: launch()
    StepLauncher->>StepLogsStorage: prepare_logs_folder_uri()
    StepLogsStorage->>ArtifactStore: create_log_folder()
    StepLauncher->>StepLogsStorage: write()
    StepLogsStorage->>ArtifactStore: save_logs()
    User->>StepLogsStorage: fetch_logs()
    StepLogsStorage->>ArtifactStore: retrieve_logs()
    ArtifactStore->>User: return logs
```

Poem

In folders now our logs reside,
Where once in files they used to hide.
With merging, fetching, all anew,
ZenML logs, more clear to view.
A rabbit's joy in code so neat,
Efficiency, a coder's treat!


htahir1 commented 1 month ago

@avishniakov Thanks!! This is an interesting fix. Can you maybe try to run a pipeline with some sort of tqdm progress bar (any keras example with .fit() will do)... this will also tell us if this branch fixes that problem?

Also, for a standard pipeline, can you make a table of how performance degrades over log size and perhaps we can leave it here for future reference?

avishniakov commented 1 month ago

> @avishniakov Thanks!! This is an interesting fix. Can you maybe try to run a pipeline with some sort of tqdm progress bar (any keras example with .fit() will do)... this will also tell us if this branch fixes that problem?
>
> Also, for a standard pipeline, can you make a table of how performance degrades over log size and perhaps we can leave it here for future reference?

The Keras logs look like this. Some TQDM events are lost, but I would not worry much about that. The run takes 23.67s in pipeline mode versus 16.10s when just calling the function; given that the pipeline was using a GCP artifact store, that is quite good, IMO.

Caching disabled explicitly for run_mnist.

Step run_mnist has started.

x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┑━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
β”‚ conv2d (Conv2D)                      β”‚ (None, 26, 26, 32)          β”‚             320 β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ max_pooling2d (MaxPooling2D)         β”‚ (None, 13, 13, 32)          β”‚               0 β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ conv2d_1 (Conv2D)                    β”‚ (None, 11, 11, 64)          β”‚          18,496 β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ max_pooling2d_1 (MaxPooling2D)       β”‚ (None, 5, 5, 64)            β”‚               0 β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ flatten (Flatten)                    β”‚ (None, 1600)                β”‚               0 β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ dropout (Dropout)                    β”‚ (None, 1600)                β”‚               0 β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ dense (Dense)                        β”‚ (None, 10)                  β”‚          16,010 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

 Total params: 34,826 (136.04 KB)

 Trainable params: 34,826 (136.04 KB)

 Non-trainable params: 0 (0.00 B)

Epoch 1/2

  1/422 ━━━━━━━━━━━━━━━━━━━━ 5:24 772ms/step - accuracy: 0.0859 - loss: 2.3276

[... intermediate TQDM progress updates elided ...]

421/422 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step - accuracy: 0.7575 - loss: 0.7655

422/422 ━━━━━━━━━━━━━━━━━━━━ 8s 16ms/step - accuracy: 0.7581 - loss: 0.7637 - val_accuracy: 0.9757 - val_loss: 0.0904

Epoch 2/2

  1/422 ━━━━━━━━━━━━━━━━━━━━ 11s 27ms/step - accuracy: 0.9766 - loss: 0.0878

[... intermediate TQDM progress updates elided ...]

421/422 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step - accuracy: 0.9610 - loss: 0.1254

422/422 ━━━━━━━━━━━━━━━━━━━━ 7s 17ms/step - accuracy: 0.9610 - loss: 0.1253 - val_accuracy: 0.9848 - val_loss: 0.0589

Test loss: 0.057743266224861145
Test accuracy: 0.9812999963760376
Step run_mnist has finished in 17.550s.

Step run_mnist completed successfully.
htahir1 commented 1 month ago

@avishniakov Sounds good. Can we do it for more epochs even just to test it?

strickvl commented 1 month ago

Linked to https://github.com/zenml-io/zenml/issues/2211

avishniakov commented 1 month ago

@htahir1 @strickvl added some numbers to the PR description, as requested. I can confirm that the TQDM issue is resolved and that overall performance for logging-intensive tasks is far better. For jobs that are not logging-heavy and don't use TQDM the improvement is minimal, but GCP logging is fixed from now on.

htahir1 commented 1 month ago

@avishniakov thats fantastic !! Love it

strickvl commented 1 month ago

@coderabbitai review

coderabbitai[bot] commented 1 month ago
Actions performed

Review triggered.

> Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.
avishniakov commented 1 month ago

@schustmi @stefannica I revisited the fetching logic quite heavily: it now uses lighter operations, such as size lookups, and also supports negative offsets. Please have a look.
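One way negative offsets can be supported cheaply is sketched below (hypothetical function, not ZenML's actual `fetch_logs` signature): a negative offset is resolved against the file size before reading, so tailing the logs only needs a size lookup plus a ranged read, never a full download.

```python
import os


def fetch_logs(path: str, offset: int = 0, length: int = 1024) -> str:
    """Read `length` bytes of a log file starting at `offset`.

    A negative offset counts from the end of the file, e.g.
    fetch_logs(path, offset=-1024) returns roughly the last kilobyte.
    Only a size lookup is needed to resolve the offset -- no full read.
    """
    size = os.path.getsize(path)  # lighter than reading the whole file
    if offset < 0:
        offset = max(0, size + offset)
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length).decode("utf-8", errors="replace")
```

With a remote artifact store the same pattern applies, using the store's size and ranged-read operations instead of local `os` calls.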

avishniakov commented 1 month ago

A few tests failed during the install steps; I will run the full suite for the release anyway.