justanhduc / task-spooler

A scheduler for GPU/CPU tasks
https://justanhduc.github.io/2021/02/03/Task-Spooler.html
GNU General Public License v2.0
273 stars, 24 forks

Cpu only for multi-user version #28

Open kylincaster opened 1 year ago

kylincaster commented 1 year ago

In multi-user mode, every user has an equal chance to start a new job, provided that both the user's own slot limit and the server's total slot limit have enough free slots.
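The admission rule described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the `User` class, the `can_start` function, and the reading of "large enough" as "used slots plus requested slots must not exceed the limit" are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical record; field names are not taken from user.c.
    name: str
    slot_limit: int  # per-user cap, as listed in the user file
    used: int = 0    # slots currently held by this user's running jobs


def can_start(user: User, job_slots: int, total_limit: int, total_used: int) -> bool:
    """Return True when both the per-user cap and the server-wide cap have room."""
    fits_user = user.used + job_slots <= user.slot_limit
    fits_total = total_used + job_slots <= total_limit
    return fits_user and fits_total


kylin = User("Kylin", slot_limit=10, used=3)
print(can_start(kylin, job_slots=1, total_limit=4, total_used=3))  # True
print(can_start(kylin, job_slots=2, total_limit=4, total_used=3))  # False
```

In the second call the user still has room under the personal limit, but the server-wide total (TS_SLOTS = 4) would be exceeded, so the job has to wait.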

Usage:

  1. The task-spooler server can only be run by root.
  2. The socket file is created at ./${tmpdir}/socket-ts.root, or at the path specified by the TS_SOCKET environment variable (see server_start.c).
  3. The default user file is specified in user.c and can be overridden with the TS_USER_PATH environment variable. The log file is also controlled by user.c.
  4. Format of the user file (max. 100 users):
    # 1231 # comments
    TS_SLOTS = 4 # Set the total TS_SLOTS in task-spooler
    # TS_FIRST_JOBID = 2000 # Set the index of the first job in task-spooler
    # uid     name    slots
    1000      Kylin   10
    3021     test1    10
    1001     test0    100
    34        user2    30
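To make the file format above concrete, here is a minimal parsing sketch. It is not the parser in user.c. It assumes that `#` starts a comment anywhere on a line, that `KEY = value` lines hold integer settings, and that user rows are whitespace-separated `uid name slots` triples. The function name `parse_user_file` is hypothetical.

```python
def parse_user_file(text: str, max_users: int = 100):
    """Parse the user-file format into (settings, users).

    Assumptions (not confirmed against user.c): '#' begins a comment,
    settings are integers, user rows are 'uid name slots'.
    """
    settings = {}
    users = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            settings[key] = int(value)
            continue
        uid, name, slots = line.split()
        users.append({"uid": int(uid), "name": name, "slots": int(slots)})
    if len(users) > max_users:
        raise ValueError(f"user file lists more than {max_users} users")
    return settings, users


SAMPLE = """\
# 1231 # comments
TS_SLOTS = 4 # Set the total TS_SLOTS in task-spooler
# TS_FIRST_JOBID = 2000 # Set the index of the first job in task-spooler
# uid     name    slots
1000      Kylin   10
3021     test1    10
1001     test0    100
34        user2    30
"""

settings, users = parse_user_file(SAMPLE)
print(settings)
print([u["name"] for u in users])
```

Because the TS_FIRST_JOBID line is commented out in the sample, only TS_SLOTS ends up in `settings`.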

New features/commands and potential problems

  1. --daemon: run the server as a daemon (root only).
  2. --hold and --restart [jobid]: hold and restart a task.
  3. --lock and --unlock: lock and unlock the task-spooler server to avoid potential conflicts.
  4. --stop and --cont [user]: pause and continue all tasks, or, when run by root, lock/unlock all users.
  5. -A: show information for all users and all tasks.
  6. -X: reload the user configuration on the fly.
  7. -K: kill the task-spooler server.
  8. -r: remove a job, even if it is running.

The main problem with my work is that the root server cannot control tasks run by other normal users. In my setup, I cannot stop or pause a task owned by another normal user. Could you have a look at the c_remove_job() function in client.c?

justanhduc commented 1 year ago

Very impressive! I will try to have a look asap.

sadikyalcin commented 1 year ago

What's the status of this, @justanhduc? I'm calling tsp from a PHP file with shell_exec. I have issues with my process and there is absolutely no way of debugging it.

justanhduc commented 1 year ago

Hey @sadikyalcin. I'm quite busy this semester, so I can't commit any time for this until the end of the year. What issue did you encounter? How about opening an issue? And why do you think this PR can solve your problem?

sadikyalcin commented 1 year ago

> Hey @sadikyalcin. I'm quite busy this semester, so I can't commit any time for this until the end of the year. What issue did you encounter? How about opening an issue? And why do you think this PR can solve your problem?

I'm not even sure if I'm in the right repo, but I'm using this: https://manpages.ubuntu.com/manpages/xenial/man1/tsp.1.html

The issue is that I'm calling tsp via exec from a PHP file, which is run by Apache as www-data. I have no way of debugging or monitoring the queue when I SSH in as another user or as root, since I cannot see the tasks created by the web server.

My understanding was that this PR would let us define which users can run tsp, so I could just SSH in and monitor the tasks whenever I wished.
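For context, a likely reason the web server's queue is invisible from another account: as far as the stock tsp man page describes it, each user talks to a separate server through a socket at `$TMPDIR/socket-ts.<uid>` unless TS_SOCKET is set. The sketch below only reproduces that path rule. The helper `default_socket_path` is hypothetical, and uid 33 for www-data is an assumption that depends on the distribution.

```python
import os


def default_socket_path(uid: int, env: dict) -> str:
    """Socket path under the assumed stock-tsp rule:
    TS_SOCKET if set, otherwise $TMPDIR/socket-ts.<uid> (TMPDIR defaults to /tmp)."""
    if env.get("TS_SOCKET"):
        return env["TS_SOCKET"]
    tmpdir = env.get("TMPDIR", "/tmp")
    return os.path.join(tmpdir, f"socket-ts.{uid}")


WWW_DATA_UID = 33  # assumption: typical www-data uid on Debian/Ubuntu

# Different users resolve to different sockets, hence separate queues.
print(default_socket_path(WWW_DATA_UID, {}))
print(default_socket_path(0, {}))

# Pointing TS_SOCKET at the web server's socket selects the same queue.
print(default_socket_path(0, {"TS_SOCKET": "/tmp/socket-ts.33"}))
```

Whether a second account can actually connect to that socket depends on the socket file's permissions, which this sketch does not address.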