UOB-AI / UOB-AI.github.io

A repository to host our documentation website.
https://UOB-AI.github.io

Interactive session on Open OnDemand moves from "Starting" to "Completed" within seconds #19

Closed: AbdulKhaliq293 closed this issue 2 months ago

AbdulKhaliq293 commented 1 year ago

[Screenshot: int_openDemand]

asubah commented 1 year ago

@AbdulKhaliq293 Do you see any error messages? If you click on the session ID, it will redirect you to the file explorer. Can you post the contents of the output.log file, please?

AbdulKhaliq293 commented 1 year ago

Here is the content of the output.log file:

============================
==========================================
SLURM_CLUSTER_NAME = linux
SLURM_ARRAY_JOB_ID = 
SLURM_ARRAY_TASK_ID = 
SLURM_ARRAY_TASK_COUNT = 
SLURM_ARRAY_TASK_MAX = 
SLURM_ARRAY_TASK_MIN = 
SLURM_JOB_ACCOUNT = ugstudents
SLURM_JOB_ID = 6936
SLURM_JOB_NAME = sys/dashboard/sys/jupyter
SLURM_JOB_NODELIST = cn[01-02]
SLURM_JOB_USER = nobody
SLURM_JOB_UID = 1220
SLURM_JOB_PARTITION = standard
SLURM_TASK_PID = 266432
SLURM_SUBMIT_DIR = /var/www/ood/apps/sys/dashboard
SLURM_CPUS_ON_NODE = 24
SLURM_NTASKS = 
SLURM_TASK_PID = 266432
==========================================
Script starting...
Waiting for Jupyter Notebook server to open port 11758...
TIMING - Starting wait at: Tue Apr 25 13:32:20 +03 2023
==> Error: git arch=linux-centos8-cascadelake matches multiple packages.
  Matching packages:
    ulmn6ga git@2.36.1%gcc@11.3.0 arch=linux-centos8-cascadelake
    c7x2zma git@2.36.1%gcc@12.1.0 arch=linux-centos8-cascadelake
  Use a more specific spec.
+ jupyter-lab --config=/home/nfs/20188856/ondemand/data/sys/dashboard/batch_connect/sys/jupyter/output/3cca7416-00c5-463d-ac77-c525905512aa/config.py
[W 2023-04-25 13:32:32.989 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[W 2023-04-25 13:32:32.990 LabApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-04-25 13:32:32.990 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-04-25 13:32:32.990 LabApp] 'port_retries' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-04-25 13:32:32.990 LabApp] 'password' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-04-25 13:32:32.990 LabApp] 'base_url' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-04-25 13:32:32.990 LabApp] 'base_url' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-04-25 13:32:32.990 LabApp] 'allow_origin' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-04-25 13:32:32.990 LabApp] 'notebook_dir' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-04-25 13:32:32.990 LabApp] 'disable_check_xsrf' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2023-04-25 13:32:32.991 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[W 2023-04-25 13:32:32.993 LabApp] Config option `kernel_spec_manager_class` not recognized by `LabApp`.
[W 2023-04-25 13:32:32.994 ServerApp] notebook_dir is deprecated, use root_dir
[I 2023-04-25 13:32:32.994 ServerApp] jupyterlab | extension was successfully linked.
[W 2023-04-25 13:32:32.995 NotebookApp] Config option `kernel_spec_manager_class` not recognized by `NotebookApp`.
[W 2023-04-25 13:32:32.996 NotebookApp] Config option `kernel_spec_manager_class` not recognized by `NotebookApp`.
[W 2023-04-25 13:32:32.998 NotebookApp] Config option `kernel_spec_manager_class` not recognized by `NotebookApp`.
[I 2023-04-25 13:32:32.998 ServerApp] nbclassic | extension was successfully linked.
[I 2023-04-25 13:32:33.574 ServerApp] notebook_shim | extension was successfully linked.
[I 2023-04-25 13:32:35.289 ServerApp] [nb_conda_kernels] enabled, 11 kernels found
[W 2023-04-25 13:32:35.348 ServerApp] WARNING: The Jupyter server is listening on all IP addresses and not using encryption. This is not recommended.
[I 2023-04-25 13:32:35.354 ServerApp] notebook_shim | extension was successfully loaded.
[I 2023-04-25 13:32:35.355 LabApp] JupyterLab extension loaded from /data/software/miniconda3/lib/python3.9/site-packages/jupyterlab
[I 2023-04-25 13:32:35.355 LabApp] JupyterLab application directory is /data/software/miniconda3/share/jupyter/lab
[I 2023-04-25 13:32:35.359 ServerApp] jupyterlab | extension was successfully loaded.
[I 2023-04-25 13:32:35.373 ServerApp] nbclassic | extension was successfully loaded.
[I 2023-04-25 13:32:35.374 ServerApp] Serving notebooks from local directory: /home/nfs/201888565
[I 2023-04-25 13:32:35.374 ServerApp] Jupyter Server 1.23.4 is running at:
[I 2023-04-25 13:32:35.374 ServerApp] http://localhost:11758/node/cn01/11758/lab
[I 2023-04-25 13:32:35.374 ServerApp]  or http://127.0.0.1:11758/node/cn01/11758/lab
[I 2023-04-25 13:32:35.374 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[E 2023-04-25 13:32:35.377 ServerApp] Failed to write server-info to /home/nfs/201888565/.local/share/jupyter/runtime/jpserver-266675.json: [Errno 28] No space left on device: '/home/nfs/201888565/.local/share/jupyter/runtime/jpserver-266675.json'

  _   _          _      _
 | | | |_ __  __| |__ _| |_ ___
 | |_| | '_ \/ _` / _` |  _/ -_)
  \___/| .__/\__,_\__,_|\__\___|
       |_|

Read the migration plan to Notebook 7 to learn about the new features and the actions to take if you are using extensions.

https://jupyter-notebook.readthedocs.io/en/latest/migrate_to_notebook7.html

Please note that updating to Notebook 7 might break some of your extensions.

Traceback (most recent call last):
  File "/data/software/miniconda3/bin/jupyter-lab", line 11, in <module>
    sys.exit(main())
  File "/data/software/miniconda3/lib/python3.9/site-packages/jupyter_server/extension/application.py", line 594, in launch_instance
    serverapp.start()
  File "/data/software/miniconda3/lib/python3.9/site-packages/jupyter_server/serverapp.py", line 2817, in start
    self.start_app()
  File "/data/software/miniconda3/lib/python3.9/site-packages/jupyter_server/serverapp.py", line 2746, in start_app
    self.write_browser_open_files()
  File "/data/software/miniconda3/lib/python3.9/site-packages/jupyter_server/serverapp.py", line 2618, in write_browser_open_files
    self.write_browser_open_file()
  File "/data/software/miniconda3/lib/python3.9/site-packages/jupyter_server/serverapp.py", line 2641, in write_browser_open_file
    with open(self.browser_open_file, "w", encoding="utf-8") as f:
OSError: [Errno 28] No space left on device: '/home/nfs/201888565/.local/share/jupyter/runtime/jpserver-266675-open.html'
Timed out waiting for Jupyter Notebook server to open port 11758!
TIMING - Wait ended at: Tue Apr 25 13:33:22 +03 2023
Cleaning up..

asubah commented 1 year ago

OSError: [Errno 28] No space left on device: '/home/nfs/201888565/.local/share/jupyter/runtime/jpserver-266675-open.html'

The last few lines of the log file show that you are out of space. Please make sure that you didn't install large packages, and use the /data/datasets directory for your data and for saving your models.

Refer to issue #14 for more info about how to solve this problem.
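
For readers who prefer to check this from a notebook or a Python shell, here is a minimal sketch of the same kind of check. The function names are illustrative, not part of any cluster tooling. Note that `shutil.disk_usage` reports the state of the whole filesystem, so it may not reflect a per-user quota if one is enforced.

```python
import os
import shutil
from pathlib import Path


def free_space_gib(path):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def largest_entries(root, top=10):
    """Return the `top` largest immediate children of `root` as (bytes, name) pairs."""
    sizes = []
    for child in Path(root).iterdir():
        if child.is_symlink():
            continue  # do not follow links out of the directory
        if child.is_file():
            total = child.stat().st_size
        elif child.is_dir():
            total = 0
            for dirpath, _dirnames, filenames in os.walk(child):
                for name in filenames:
                    file_path = os.path.join(dirpath, name)
                    if os.path.islink(file_path):
                        continue
                    try:
                        total += os.path.getsize(file_path)
                    except OSError:
                        pass  # file vanished or is unreadable; skip it
        else:
            continue
        sizes.append((total, child.name))
    return sorted(sizes, reverse=True)[:top]


if __name__ == "__main__":
    home = Path.home()
    print(f"Free space on the filesystem holding {home}: {free_space_gib(home):.2f} GiB")
    for size, name in largest_entries(home):
        print(f"{size / 2**20:10.1f} MiB  {name}")
```

Run it in the same environment that fails (for example, a terminal on the same node), since the home directory it inspects is whatever that environment resolves.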

AbdulKhaliq293 commented 1 year ago

The issue persists. I think the reason is that I have been saving the model from a callback in my home directory, which consumed all the space. Here are some screenshots of the output of the commands suggested in #14:

[Screenshots: git_od_issue1, git_2]

asubah commented 1 year ago

As I said in my last comment, you can move your model to /data/datasets and your issue will be solved.
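
A minimal sketch of such a move in Python, assuming you have write access to a directory of your own under the data area. The helper name `move_model` and the example paths in the usage note are hypothetical, not paths confirmed in this thread.

```python
import shutil
from pathlib import Path


def move_model(src, dest_dir):
    """Move `src` into `dest_dir`, creating the directory if needed.

    Returns the new path. Refuses to overwrite an existing file.
    """
    src = Path(src)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / src.name
    if target.exists():
        raise FileExistsError(f"{target} already exists; refusing to overwrite")
    # shutil.move copies and then deletes when source and target are on different filesystems
    return Path(shutil.move(str(src), str(target)))
```

For example, `move_model(Path.home() / "model.h5", "/data/datasets/my_models")` would move a hypothetical `model.h5` out of the home directory; substitute your real file and destination.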

AbdulKhaliq293 commented 1 year ago

How can I access that model and delete the files? [Screenshot: data_file] As you can see, there is plenty of space, but I am not sure where the 4 GB model has gone.

asubah commented 1 year ago

OK, there is something weird with your account! Why is your username different from your home directory name?!!!

asubah commented 1 year ago

OK, I just checked and there is indeed something wrong with your account. We will fix it and get back to you.

AbdulKhaliq293 commented 1 year ago

This is how my directory looks: [Screenshot: home_dir]

asubah commented 1 year ago

Due to an issue on our side, you have two home directories with the same UID! So when you log in to JupyterLab, you get a different home directory than when you log in from the shell. Until we fix this, you can check the other directory with cd /home/nfs/201888565, then run df -k . or du -hd1 . and you will see that it is full. There is a ~5 GB model file there. For now, please move that file to make JupyterLab work until we decide how we are going to solve this issue.
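
As a rough illustration of how such a mismatch could be spotted from the user side, the sketch below lists every passwd entry that shares the current UID and prints the HOME the session actually received. It assumes a Unix system where the `pwd` module can enumerate accounts; on clusters that use LDAP or similar, enumeration may be disabled and the list may be incomplete. The function name `homes_for_uid` is illustrative.

```python
import os
import pwd


def homes_for_uid(uid, entries=None):
    """Return sorted (username, home) pairs for every passwd entry with this UID."""
    if entries is None:
        entries = pwd.getpwall()  # may be incomplete if account enumeration is disabled
    return sorted({(e.pw_name, e.pw_dir) for e in entries if e.pw_uid == uid})


if __name__ == "__main__":
    uid = os.getuid()
    print("UID:", uid)
    print("HOME in this session:", os.environ.get("HOME"))
    matches = homes_for_uid(uid)
    for name, home in matches:
        print(f"passwd entry: {name} -> {home}")
    if len(matches) > 1:
        print("More than one account shares this UID; sessions may land in different home directories.")
```

Running it once from a shell login and once from a JupyterLab terminal makes any difference in HOME visible side by side.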

AbdulKhaliq293 commented 1 year ago

It is working now that I have deleted the model!