microsoft / pai

Resource scheduling and cluster management for AI
https://openpai.readthedocs.io
MIT License

Provide guidelines on how to write job `cmd` #1897

Open hao1939 opened 5 years ago

hao1939 commented 5 years ago

Provide some guidelines on how to write the job cmd. We will need to investigate and figure out the best solutions, then enhance and document them.

A user summarized some notes about the job cmd:

  1. When submitting a PAI job, if you want to run a shell script in the command field, use `source filename.sh` rather than `./filename.sh`. Otherwise the script may not see PAI's environment variables, because a script run as `./filename.sh` is a child process that inherits only exported variables.
  2. If you wrote the shell script on Windows, run `dos2unix filename` on the jumpbox machine to convert it; otherwise you may encounter a `$'\r': command not found` error.
  3. When running a Python script that takes a config file path as a flag, if you encounter a "file not found" error, try using a relative path instead of an absolute path.
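
The first two notes can be reproduced locally with a minimal sketch. Everything named here is hypothetical: `DEMO_JOB_VAR` stands in for a variable that the launching shell sets without exporting (it is not an actual PAI variable), and `show_var.sh`, `crlf.sh`, and `lf.sh` are throwaway files. `tr -d '\r'` is used as a portable stand-in for `dos2unix`, which may not be installed everywhere.

```shell
#!/bin/sh
# Work in a scratch directory so the demo leaves nothing behind in the cwd.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# --- Note 1: sourcing vs. executing a script ---
# Hypothetical script that prints a variable it expects from its caller.
printf '%s\n' '#!/bin/sh' 'echo "value=${DEMO_JOB_VAR:-unset}"' > show_var.sh
chmod +x show_var.sh

unset DEMO_JOB_VAR
DEMO_JOB_VAR="hello"   # set in this shell, but NOT exported

# "." is the POSIX spelling of bash's "source": the script runs in this shell.
sourced_out="$(. ./show_var.sh)"
# Executing starts a child process, which only sees exported variables.
executed_out="$(./show_var.sh)"

echo "sourced:  $sourced_out"
echo "executed: $executed_out"

# --- Note 2: Windows (CRLF) line endings ---
# Simulate a script saved on Windows: every line ends in \r\n.
printf 'echo first\r\n\r\necho second\r\n' > crlf.sh

# The line holding only "\r" is treated as a command name and fails;
# the exact error text depends on the shell.
crlf_err="$(sh crlf.sh 2>&1 >/dev/null)"
echo "error from CRLF script: $crlf_err"

# Strip the carriage returns (what dos2unix does), then run again.
tr -d '\r' < crlf.sh > lf.sh
fixed_out="$(sh lf.sh)"
echo "$fixed_out"
```

If the variable were exported (`export DEMO_JOB_VAR`), both invocations would print the same value, so the difference only matters for variables the launching shell leaves unexported.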
xudifsd commented 5 years ago

Do you mean an FAQ doc about cmd?

hao1939 commented 5 years ago

This issue is feedback from a user; we will need to discuss it and break it into detailed items.

In my experience, the most confusing parts are 'escaping' and 'substitution'. We can reduce the complexity at the cost of flexibility: for example, only allow the user to input a literal command, and only allow substitution of the job envs.

We have to make a trade-off between flexibility and simplicity.
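
To illustrate the "literal command, job envs only" idea, here is a rough sketch; it is not how PAI handles `cmd`. The helper `substitute_literal`, the placeholder `${DEMO_TASK_INDEX}`, and the sample command are all hypothetical. The command is kept as a literal string, and only an explicitly named placeholder is replaced, so everything else (such as `$5`) passes through untouched.

```shell
#!/bin/sh
# substitute_literal TEMPLATE PLACEHOLDER VALUE
# Replaces every literal occurrence of PLACEHOLDER in TEMPLATE with VALUE.
# The strings are passed to awk through its environment, so no shell or
# regex interpretation is applied to any of them.
substitute_literal() {
  TEMPLATE="$1" PLACEHOLDER="$2" VALUE="$3" awk 'BEGIN {
    s = ENVIRON["TEMPLATE"]; p = ENVIRON["PLACEHOLDER"]; v = ENVIRON["VALUE"]
    out = ""
    while ((i = index(s, p)) > 0) {        # plain substring search
      out = out substr(s, 1, i - 1) v
      s = substr(s, i + length(p))
    }
    printf "%s", out s
  }'
}

# Single quotes keep the user command fully literal: nothing expands here.
template='python train.py --index ${DEMO_TASK_INDEX} --msg "cost: $5"'

# Only the allow-listed placeholder is substituted; "$5" is left alone.
rendered="$(substitute_literal "$template" '${DEMO_TASK_INDEX}' "3")"
echo "$rendered"
```

The cost of this approach is the flexibility mentioned above: users can no longer rely on arbitrary shell expansion inside the command, only on the placeholders the platform chooses to substitute.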

scarlett2018 commented 5 years ago

Adding @qfyin for reference as well.

scarlett2018 commented 3 years ago

cc @mydmdm - another backlog item for samples, tutorials, and docs that Guowei can help with later.