NVIDIA / DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

[SE(3)-Transformer/All] Minor issue in default parameter for logging.: Change default from "/results" to "./results". #1253

Open RMichae1 opened 1 year ago

RMichae1 commented 1 year ago

Related to SE(3)-Transformer/All

Describe the bug The default log_dir argument is set to "/results" rather than "./results" (relative to the current working directory), so running scripts/train.sh with the provided default arguments fails. On a Linux system this raises PermissionError: [Errno 13] Permission denied: '/results'. The root cause is the default value in runtime/arguments.py: https://github.com/NVIDIA/DeepLearningExamples/blob/35d8759cb8cf52f8c7d33900ef27fd0f16d6cff3/DGLPyTorch/DrugDiscovery/SE3Transformer/se3_transformer/runtime/arguments.py#L36

To Reproduce Steps to reproduce the behavior:

  1. Set up SE(3)-Transformer, either via the provided Dockerfile or by reproducing the environment as specified.
  2. Run 'bash ./scripts/train.sh' with the default arguments.

Expected behavior The script train.sh should not terminate with an error, and its default log directory should be "./results", relative to the path of execution, rather than "/results".
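To illustrate the expected behavior, here is a minimal sketch of a relative default for the log directory. It assumes an argparse-based parser with a `--log_dir` option; the helper names `build_parser` and `prepare_log_dir` are hypothetical and are not taken from runtime/arguments.py.

```python
import argparse
import pathlib


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: only the --log_dir option is sketched here."""
    parser = argparse.ArgumentParser(description="log_dir default sketch")
    parser.add_argument(
        "--log_dir",
        type=pathlib.Path,
        # Relative default: resolves against the current working directory,
        # so no write access to the filesystem root is required.
        default=pathlib.Path("./results"),
        help="Directory where logs are written",
    )
    return parser


def prepare_log_dir(log_dir: pathlib.Path) -> pathlib.Path:
    """Create the log directory if it is missing and return its absolute path."""
    log_dir.mkdir(parents=True, exist_ok=True)
    return log_dir.resolve()


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(prepare_log_dir(args.log_dir))
```

With a default like this, a non-root user running from a writable working directory would get a results folder next to where the script was launched, while an explicit `--log_dir /results` would still work for setups that rely on the absolute path.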


milesial commented 1 year ago

Hi, I assume this only appears when running outside of Docker, right? With the recommended Docker image this shouldn't appear, since the Docker user is root. I'll get this changed anyway if it makes life easier for non-Docker users.

RMichae1 commented 1 year ago

Hi, you're correct! This was run outside of Docker. Thank you for addressing it.

milesial commented 1 year ago

Fix landed in https://github.com/NVIDIA/DeepLearningExamples/commit/2586ee38a7c3ec74cb7d565558e72175334511c7