awslabs / sagemaker-debugger

Amazon SageMaker Debugger provides functionality to save tensors during training of machine learning jobs and analyze those tensors
Apache License 2.0

Use file metadata to determine whether profiler config should be reloaded. #464

Open ndodda-amazon opened 3 years ago

ndodda-amazon commented 3 years ago

Description of changes:

(Unable to reopen #463 so I'm creating a new PR).

For each step, we need to determine whether the profiler config JSON has changed and, if so, reload the profiler config. Currently, we load the JSON into memory at every step and compare its contents against what was previously loaded to decide whether a reload is needed. This may cause performance problems at scale, because a JSON object is loaded into memory at each step.

This change replaces that check with an inspection of the last-modified time in the file's metadata. If the last-modified time has changed, the file has changed and we reload the profiler config. This is done without loading the JSON into memory: the tests verify that the config file is not read if it has not been modified.
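
For illustration, the metadata check described above could look roughly like the following. This is a minimal sketch and not the code in this PR: the `ConfigReloader` class, its attribute names, and the `load_count` counter are hypothetical, and it assumes the config lives in a local file whose modification time is available through `os.stat`.

```python
import json
import os


class ConfigReloader:
    """Hypothetical helper for illustration; not the smdebug implementation."""

    def __init__(self, path):
        self.path = path
        self.last_mtime = None  # modification time seen at the last load
        self.config = None      # most recently parsed JSON, if any
        self.load_count = 0     # how many times the file was actually read

    def maybe_reload(self):
        """Reload the JSON only if the file's last-modified time changed.

        Returns True if the file was read, False otherwise.
        """
        try:
            # os.stat reads file metadata only; the contents are not loaded.
            mtime = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            return False

        if mtime == self.last_mtime:
            return False  # unchanged: skip reading and parsing entirely

        with open(self.path) as f:
            self.config = json.load(f)
        self.last_mtime = mtime
        self.load_count += 1
        return True
```

Calling `maybe_reload()` once per step costs one `stat` call when nothing has changed. One caveat of relying on modification time in general: if the file system's timestamp granularity is coarse, two writes within the same tick may be indistinguishable.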

Style and formatting:

I have run pre-commit install to ensure that auto-formatting happens with every commit.

Issue number, if available

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

codecov-io commented 3 years ago

Codecov Report

Merging #464 (224ac0e) into master (433348d) will decrease coverage by 9.02%. The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #464      +/-   ##
==========================================
- Coverage   65.62%   56.60%   -9.03%     
==========================================
  Files         172      113      -59     
  Lines       13260    10277    -2983     
==========================================
- Hits         8702     5817    -2885     
+ Misses       4558     4460      -98     

| Impacted Files | Coverage Δ | |
|---|---|---|
| `smdebug/profiler/profiler_config_parser.py` | 84.66% <100.00%> (+0.20%) | :arrow_up: |
| `smdebug/profiler/utils.py` | 66.06% <100.00%> (-6.17%) | :arrow_down: |
| `smdebug/tensorflow/__init__.py` | 0.00% <0.00%> (-100.00%) | :arrow_down: |
| `smdebug/tensorflow/constants.py` | 0.00% <0.00%> (-100.00%) | :arrow_down: |
| `smdebug/tensorflow/collection.py` | 0.00% <0.00%> (-95.88%) | :arrow_down: |
| `smdebug/tensorflow/session.py` | 0.00% <0.00%> (-91.83%) | :arrow_down: |
| `smdebug/tensorflow/keras.py` | 0.00% <0.00%> (-89.30%) | :arrow_down: |
| `smdebug/tensorflow/tensor_ref.py` | 0.00% <0.00%> (-88.71%) | :arrow_down: |
| `smdebug/tensorflow/utils.py` | 0.00% <0.00%> (-86.26%) | :arrow_down: |
| `smdebug/core/s3_utils.py` | 20.00% <0.00%> (-80.00%) | :arrow_down: |
| ... and 113 more | | |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 433348d...224ac0e.