awslabs / sagemaker-debugger

Amazon SageMaker Debugger provides functionality to save tensors during training of machine learning jobs and analyze those tensors
Apache License 2.0
161 stars, 83 forks

Add support for dataloader profiling with native TF2 training #504

Closed: ndodda-amazon closed this pull request 3 years ago

ndodda-amazon commented 3 years ago

Description of changes:

Follow-up to #500 that adds dataloader profiling support for native TF2 training. It also handles writing the timeline files at the end of the step.
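
For readers unfamiliar with the idea, here is a minimal, library-agnostic sketch of what dataloader profiling with an end-of-step flush can look like. It is not the smdebug implementation: `TimedIterator`, `flush`, `run_with_profiling`, the event fields, and the JSON-lines file layout are all hypothetical names and formats chosen for illustration. The wrapper accepts any Python iterable; in TF2 eager mode a `tf.data.Dataset` is iterable, so it could presumably be wrapped the same way.

```python
import json
import time


class TimedIterator:
    """Hypothetical wrapper: records how long each next() call on a dataloader takes."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.events = []  # pending events, written out at the end of each step

    def __iter__(self):
        return self

    def __next__(self):
        start = time.perf_counter()
        batch = next(self._it)  # StopIteration propagates and ends the loop
        duration = time.perf_counter() - start
        self.events.append({"name": "next_batch", "start": start, "duration": duration})
        return batch

    def flush(self, step, path):
        """Append the pending events for `step` to a JSON-lines file, then clear them."""
        with open(path, "a") as f:
            for event in self.events:
                f.write(json.dumps({"step": step, **event}) + "\n")
        self.events.clear()


def run_with_profiling(dataset, path):
    """Iterate over `dataset`, flushing timeline events at the end of every step."""
    loader = TimedIterator(dataset)
    steps = 0
    for step, batch in enumerate(loader):
        # The training step for `batch` would run here.
        loader.flush(step, path)
        steps += 1
    return steps
```

Flushing once per step keeps the in-memory event list small and means a crash loses at most one step of timing data, at the cost of one file append per step.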

Style and formatting:

I have run pre-commit install to ensure that auto-formatting happens with every commit.

Issue number, if available

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

codecov-commenter commented 3 years ago

Codecov Report

Merging #504 (9d0539f) into master (347727e) will increase coverage by 0.23%. The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #504      +/-   ##
==========================================
+ Coverage   55.48%   55.71%   +0.23%     
==========================================
  Files         116      116              
  Lines       10517    10526       +9     
==========================================
+ Hits         5835     5865      +30     
+ Misses       4682     4661      -21     
Impacted Files                              Coverage Δ
smdebug/tensorflow/keras.py                 0.00% <0.00%> (ø)
smdebug/core/writer.py                      79.27% <0.00%> (-6.31%) ↓
smdebug/core/logger.py                      70.83% <0.00%> (-5.56%) ↓
smdebug/analysis/utils.py                   19.60% <0.00%> (-3.93%) ↓
smdebug/core/tfevent/index_file_writer.py   92.18% <0.00%> (-1.57%) ↓
smdebug/core/index_reader.py                61.63% <0.00%> (+0.43%) ↑
smdebug/xgboost/hook.py                     1.05% <0.00%> (+1.05%) ↑
smdebug/core/locations.py                   80.55% <0.00%> (+1.38%) ↑
smdebug/core/tfrecord/tensor_reader.py      96.96% <0.00%> (+1.51%) ↑
... and 5 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 347727e...9d0539f.