awslabs/sagemaker-debugger
Amazon SageMaker Debugger provides functionality to save tensors during training of machine learning jobs and analyze those tensors
Apache License 2.0
161 stars, 83 forks
issues
Attempting to fix the flaky test for multiprocessing
#410
leleamol
closed
3 years ago
1
Adding support for pytorch 1.7
#409
leleamol
closed
3 years ago
3
Regex match bug
#408
NihalHarish
closed
3 years ago
1
Introducing the profiling functionality.
#407
leleamol
closed
3 years ago
2
Unblock the debugger CI
#406
NihalHarish
closed
3 years ago
2
Remove logging statement in Keras hook
#405
ndodda-amazon
closed
3 years ago
3
State store crash fix
#404
Vikas-kum
closed
3 years ago
1
Disable TF2 Tests Explicitly For Vanilla Container
#403
NihalHarish
closed
3 years ago
0
Add Profiler Env
#402
NihalHarish
closed
3 years ago
1
Turn off debugger hooks in PyTorch?
#401
austinmw
opened
4 years ago
4
Update Assert To Account For Saved Layers
#400
NihalHarish
closed
4 years ago
0
pin tensorflow dataset in test config
#399
Vikas-kum
closed
4 years ago
0
smdebug causes an OperatorNotAllowedInGraphError inside a function decorated with tf.function
#398
horietakehiro
closed
3 years ago
1
Extend Logs to report time and memory usage
#397
NihalHarish
opened
4 years ago
0
Incrementing the version to 0.9.5
#396
leleamol
closed
4 years ago
1
extend zcc to 2.1.2 (#384)
#395
NihalHarish
opened
4 years ago
0
Revert Pytorch changes for 0.9.4
#394
NihalHarish
closed
4 years ago
0
[WIP]Use get_name in forward hook
#393
NihalHarish
opened
4 years ago
1
FileNotFoundError when using SageMaker Debugger with PyTorch Distributed Training on SageMaker
#392
piyushghai
opened
4 years ago
2
Removed the redundant installation of smdebug and smdebug-rules
#391
leleamol
closed
4 years ago
1
Revert Commits That Disable Pytorch Tests For 1.7
#390
NihalHarish
opened
4 years ago
1
[Do Not Merge] Temporarily adding logic to create sagemaker client with endpoint url
#389
leleamol
opened
4 years ago
3
[WIP]Test Custom Loss Module
#388
NihalHarish
opened
4 years ago
1
Disable ZCC Tests for Pytorch 1.7
#387
NihalHarish
closed
4 years ago
0
Disable ZCC Tests for Pytorch 1.7
#386
NihalHarish
closed
4 years ago
1
0.9.4 extend zcc
#385
NihalHarish
closed
4 years ago
0
Extend Gradtape ZCC to 2.1.2
#384
NihalHarish
closed
4 years ago
1
pytorch tmp (#382)
#383
NihalHarish
closed
4 years ago
0
Temporary Change To The Pytorch ZCC Tests
#382
NihalHarish
closed
4 years ago
1
Modify Asserts to Work with TF 2.1.0 and TF 2.0.0
#381
NihalHarish
closed
4 years ago
0
Modify Asserts to Work with TF 2.1.0 and TF 2.0.0
#380
NihalHarish
closed
4 years ago
1
Cmiyoung debugger doc
#379
mchoi8739
closed
3 years ago
1
Add support for mixed precision training
#378
NihalHarish
closed
4 years ago
1
Save Nested Layers For Rubik
#377
NihalHarish
opened
4 years ago
1
returning list instead of dict keys
#376
Vikas-kum
closed
4 years ago
1
Save Nested Layers
#375
NihalHarish
closed
4 years ago
5
Fixing the nightly build pipelines. Avoid force reinstall of rules package when not necessary
#374
leleamol
closed
4 years ago
2
Pinning the version of tensorflow_datasets package so that it does not require updating TF
#373
leleamol
closed
4 years ago
3
Bugfix: Debugger breaks if should_save_tensor is called before collections are prepared
#372
NihalHarish
closed
4 years ago
2
Update README.md
#371
NihalHarish
closed
4 years ago
1
[draft] remove force reinstall
#370
NihalHarish
closed
4 years ago
2
Unhandled Exception while training with PyTorch on SageMaker
#369
shiftan
opened
4 years ago
0
Fix Flaky Pytorch Multiprocessing Test
#368
NihalHarish
opened
4 years ago
2
Test Concat Layers
#367
NihalHarish
closed
4 years ago
1
Pass Variable Length Argument To Old Function Call
#366
NihalHarish
closed
4 years ago
1
Add selective assert for < tf2.2.0
#365
NihalHarish
closed
4 years ago
0
check file exist before moving
#364
Vikas-kum
closed
4 years ago
1
Error in atexit
#363
Vikas-kum
closed
3 years ago
0
Link to 'SageMaker Debugger's Hook' is broken in the readme
#362
jamesleoni
opened
4 years ago
0
tensorflow_datasets failed to load dataset with data_dir="s3://<sagemaker-bucket>" in sagemaker notebook instance
#361
komushi
opened
4 years ago
1