mila-iqia / milatools

Tools to connect to and interact with the Mila cluster
MIT License

[v0.1.2] `mila code` with `--persist` fails! #100

Closed lebrice closed 6 months ago

lebrice commented 7 months ago

What command did you run?

mila code repos/office_hours --persist

Describe the bug

$ mila code repos/office_hours --persist
(mila) $ lfs quota -u $USER $HOME
Disk quotas for usr normandf (uid 1471600598):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
/home/mila/n/normandf
                95748040       0 104857600       -  908726       0 1048576       -
uid 1471600598 is using default block quota setting
uid 1471600598 is using default file quota setting
[02/13/24 15:34:18] WARNING  2024-02-13 15:34:18,136 - WARNING - Unable to check the disk-quota on the cluster: not enough   commands.py:534
                             values to unpack (expected 9, got 1)                                                                           
sbatch: error: Unable to open file ~/.milatools/batch/batch-1707856458144355897.sh
touch: cannot touch '.milatools/batch/out-1707856458144355897.txt': No such file or directory
tail: cannot open '.milatools/batch/out-1707856458144355897.txt' for reading: No such file or directory
tail: no files remaining
Traceback (most recent call last):
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/milatools/cli/commands.py", line 80, in main
    mila()
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/milatools/cli/commands.py", line 383, in mila
    return function(**args_dict)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/milatools/cli/commands.py", line 572, in code
    data, proc = cnode.ensure_allocation()
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/milatools/cli/remote.py", line 495, in ensure_allocation
    login_node_runner, results = self.extract(
                                 ^^^^^^^^^^^^^
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/milatools/cli/remote.py", line 364, in extract
    promise.join()
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/invoke/runners.py", line 1622, in join
    return self.runner._finish()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/fabrice/miniconda3/lib/python3.11/site-packages/invoke/runners.py", line 518, in _finish
    raise UnexpectedExit(result)
invoke.exceptions.UnexpectedExit: Encountered a bad command exit code!

Command: "cd $SCRATCH && sbatch -J mila-code '~/.milatools/batch/batch-1707856458144355897.sh'; touch .milatools/batch/out-1707856458144355897.txt; tail -n +1 -f .milatools/batch/out-1707856458144355897.txt"

Exit code: 1

Stdout: already printed

Stderr: n/a (PTYs have no stderr)

An error occurred during the execution of the command `code`. Please try updating milatools by running
  pip install milatools --upgrade
in the terminal. If the issue persists, consider filling a bug report at
  https://github.com/mila-iqia/milatools/issues/new?labels=code%2C0.0.18&template=bug_report.md&title=%5Bv0.0.18%5D+Issue+running+the+command+%60mila+code%60
Please provide the error traceback with the report (the red text above).
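
A possible reading of the log above, not confirmed in this report: the batch script path reaches `sbatch` wrapped in single quotes (`'~/.milatools/batch/...'`), and a POSIX shell does not expand `~` inside quotes, so `sbatch` looks for a directory literally named `~`. The `touch` and `tail` calls also use a path relative to the current directory after `cd $SCRATCH`, which may be why they fail as well. The sketch below only illustrates the quoting behaviour; `quote_remote_path` is a hypothetical helper, not part of milatools.

```python
import shlex


def quote_remote_path(path: str) -> str:
    """Quote `path` for a POSIX shell, keeping a leading `~` outside the quotes.

    Hypothetical helper for illustration only; not part of milatools.
    """
    if path == "~":
        return path
    if path.startswith("~/"):
        # Leave the tilde prefix unquoted so the remote shell still expands it,
        # and quote only the remainder of the path.
        return "~/" + shlex.quote(path[2:])
    return shlex.quote(path)


batch_file = "~/.milatools/batch/batch-1707856458144355897.sh"

# shlex.quote treats `~` as an unsafe character and wraps the whole path in
# single quotes, which matches the failing command shown in the traceback.
print(shlex.quote(batch_file))

# The helper keeps the tilde expandable by the shell.
print(quote_remote_path(batch_file))
```

Building the path from `$HOME` inside double quotes would be another way to get the same effect.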

Desktop (please complete the following information):