aterrel opened this issue 12 years ago
The script is made; I just need to run it on a compute node instead of one of the login nodes. I tried to ssh to one, but it required a password I didn't have.
Where do I find this script?
Script is in my home directory on ranger
~/jlong591/work/tacc_stats/monitor/move_jobs.py
It grabs 10 jobs at a time, shelves them, then moves them to terrel. Had a crazy week with class; going to play catch-up this weekend.
You prefer one job per file, right? I was going to write a script to pickle each job and place them in directories based on the timestamp of the day they ran.
I ran the script this weekend as a job, so the jobs SHOULD all be on terrel
under tacc_stats/data-set/
Each timestamp directory holds the job indexes. Each index holds 10 jobs, so a day's timestamp directory has (#jobs that day / 10) + 1 index directories.
Let me know if you want me to keep them in shelves, change how they are moved, or anything else. I sent the jobs 10 at a time to make sure tmp wasn't overloaded.
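The batching scheme described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual move_jobs.py: the function name, the `jobs_by_day` input shape, and the `index_N` naming are all assumptions.

```python
import os
import shelve

def move_jobs(jobs_by_day, dest_root, batch_size=10):
    """Shelve jobs in batches of 10 so /tmp is never overloaded.

    jobs_by_day maps an epoch day timestamp to a list of (job_id, job)
    pairs. The layout produced matches the description above: one
    directory per day, with ceil(#jobs / batch_size) shelve indexes
    inside it. All names here are illustrative.
    """
    for day_ts, jobs in jobs_by_day.items():
        day_dir = os.path.join(dest_root, str(day_ts))
        os.makedirs(day_dir, exist_ok=True)
        # Each batch of 10 jobs goes into its own shelve index.
        for i in range(0, len(jobs), batch_size):
            batch = jobs[i:i + batch_size]
            index_path = os.path.join(day_dir, "index_%d" % (i // batch_size))
            with shelve.open(index_path) as index:
                for job_id, job in batch:
                    index[str(job_id)] = job
```

With 25 jobs on a day, this produces three indexes (10 + 10 + 5), which matches the "#jobs / 10 + 1" count described above.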
Okay I'll look over it and get back with you on the questions.
Okay I tarred up what was on terrel and put it on tacc-stats:~tacc-stats/data-set. Your job seems to have died and only 7 directories had any data in them.
Alright, I'll take a look at it. Thanks.
Pushed a new move_jobs.py to my branch; you can find it at https://github.com/Jlong591/tacc_stats. I changed things based on the comments Andy left me, and also pushed the job file.
Finally found an error with the job migration. This is from the traceback; it seems to be an error in job.py:
Traceback (most recent call last):
File "move_jobs_test.py", line 108, in
This was found when I was trying to move 10 jobs for yet another test run. I think it has to do with the host list? I'm going to check other days and see if I get something similar. The day I ran was 1309496400.
Also, this happened at the start:
ln: creating symbolic link `/tmp/TS/stats/archive' to `/scratch/projects/tacc_stats/archive': Permission denied
ln: creating symbolic link `/tmp/TS/prolog_host_lists/hostfiles' to `/scratch/projects/tacc_stats/hostfiles': Permission denied
but when I check the directory where the links are being made, they show up.
I'll post if anything new develops
Just tried the same test on another day and it worked fine. Now that I know what the error is, I think I can work around it in my script. Should I make changes to job.py to handle this error better?
Never mind, got it. New commit coming soon.
updateJobs.py is pushed. It will update the database with the .tar.gz files in /home/jlong591/job-data on tacc-stats. I put it in my tacc_stats_web/scripts folder. There's some information at the top of the script (similar to Andy's build_database.py) about exporting DJANGO_SETTINGS_MODULE and PYTHONPATH. The script takes a time option (-t/--time, an epoch timestamp), untars that day's data, updates the database, then pushes the tar file to ranch.
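The untar-then-update step described above might look something like this. This is a hedged sketch, not the real updateJobs.py: the `update_from_tar` name, the `update_job` callback, and the tarball naming scheme are assumptions, and the Django database update is stubbed out entirely.

```python
import os
import tarfile

def update_from_tar(tar_path, work_dir, update_job):
    """Untar one day's job data and feed each extracted file to
    update_job (a callback standing in for the Django database update).
    Returns the list of member file names for bookkeeping."""
    with tarfile.open(tar_path, "r:gz") as tar:
        tar.extractall(work_dir)
        members = [m.name for m in tar.getmembers() if m.isfile()]
    for name in members:
        update_job(os.path.join(work_dir, name))
    return members
```

A driver around this would parse `-t/--time`, build the tarball path under /home/jlong591/job-data from the timestamp, call `update_from_tar`, and then copy the tar file off to ranch.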
Write scripts that grab the archived data from ranger to the staging box on a daily basis.
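A daily fetch script for this could be sketched as below. Everything here is an assumption: the hostnames, the remote archive path, the staging directory, and the per-day tarball naming are placeholders, and scp is just one plausible transfer mechanism.

```python
import subprocess

def build_fetch_cmd(remote_host, remote_dir, local_dir, day_ts):
    """Build the scp command that pulls one day's archive.

    Host and path arguments are placeholders for the real ranger
    archive and staging-box layout.
    """
    remote = "%s:%s/%s.tar.gz" % (remote_host, remote_dir, day_ts)
    return ["scp", remote, local_dir]

def fetch_day(day_ts,
              remote_host="ranger.tacc.utexas.edu",      # hypothetical
              remote_dir="/scratch/projects/tacc_stats/archive",
              local_dir="/var/tacc_stats/staging"):       # hypothetical
    """Copy one day's archive to the staging box; run daily from cron."""
    cmd = build_fetch_cmd(remote_host, remote_dir, local_dir, day_ts)
    subprocess.check_call(cmd)
```

Separating command construction from execution keeps the path logic testable without network access; a cron entry would then invoke `fetch_day` once per day with yesterday's timestamp.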