TACCProjects / tacc_stats

GNU Lesser General Public License v2.1

Data migration from Ranger to staging #7

Open aterrel opened 12 years ago

aterrel commented 12 years ago

Write scripts that grab the archived data from ranger to the staging box on a daily basis.

Jlong591 commented 12 years ago

Script is made, just need to run it on a compute node instead of one of the login nodes. Tried to ssh to one, however it required a password that I didn't have.

aterrel commented 12 years ago

Where do I find this script?

Jlong591 commented 12 years ago

Script is in my home directory on ranger

~/jlong591/work/tacc_stats/monitor/move_jobs.py

It grabs 10 jobs at a time, shelves them, then moves them to terrel. Had a crazy week with class; going to play catch-up this weekend.

You prefer one job per file, right? I was going to write a script to pickle each job and place the files in dirs based on the timestamp of the day the job ran.

I ran the script this weekend as a job, so the jobs SHOULD all be on terrel under tacc_stats/data-set/

Each timestamp directory holds the indexes of that day's jobs. Each index holds 10 jobs, so each day's timestamp dir has # of index dirs = (# of jobs that day / 10) + 1.
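A minimal sketch of the layout described above, assuming a day's jobs arrive as a dict keyed by job id. The names `shelve_in_batches`, `index_N`, and `jobs.shelf` are hypothetical and are not taken from move_jobs.py. It uses ceiling division, so a day whose job count is an exact multiple of 10 gets one fewer index dir than the "/ 10 + 1" rule of thumb.

```python
import os
import shelve
import tempfile

BATCH_SIZE = 10  # jobs per index, as described in the comment above


def shelve_in_batches(jobs, day_dir, batch_size=BATCH_SIZE):
    """Write jobs (a dict of job id -> job data) into numbered index dirs.

    Returns the list of index directories created under day_dir.
    """
    job_ids = sorted(jobs)
    index_dirs = []
    for start in range(0, len(job_ids), batch_size):
        # Hypothetical naming scheme: index_0, index_1, ...
        index_dir = os.path.join(day_dir, "index_%d" % (start // batch_size))
        os.makedirs(index_dir)
        # One shelf per index directory, holding at most batch_size jobs.
        shelf = shelve.open(os.path.join(index_dir, "jobs.shelf"))
        try:
            for job_id in job_ids[start:start + batch_size]:
                shelf[job_id] = jobs[job_id]
        finally:
            shelf.close()
        index_dirs.append(index_dir)
    return index_dirs


if __name__ == "__main__":
    # 25 made-up jobs written under an example timestamp directory.
    fake_jobs = dict((str(n), {"id": str(n)}) for n in range(25))
    day_dir = os.path.join(tempfile.mkdtemp(), "1309496400")
    print(len(shelve_in_batches(fake_jobs, day_dir)))
```

Writing each batch to its own shelf keeps any single file small, which matches the stated goal of not overloading tmp.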

Let me know if you want me to keep them in shelves, change how they are moved, or whatever else. I sent the jobs 10 at a time to make sure tmp wasn't overloaded.

Justin Long (832)865-7199 Biomedical Engineering The University of Texas at Austin


aterrel commented 12 years ago

Okay I'll look over it and get back with you on the questions.

aterrel commented 12 years ago

Okay I tarred up what was on terrel and put it on tacc-stats:~tacc-stats/data-set. Your job seems to have died and only 7 directories had any data in them.

Jlong591 commented 12 years ago

Alright, I'll take a look at it. Thanks.


Jlong591 commented 12 years ago

Pushed a new move_jobs.py to my branch; you can find it at https://github.com/Jlong591/tacc_stats. I changed things based on the comments Andy left me, and also pushed the job file.

Jlong591 commented 12 years ago

Finally found an error with the job migration: This is from the traceback, and it seems to be an error in job.py

Traceback (most recent call last):
  File "move_jobs_test.py", line 108, in <module>
    main()
  File "move_jobs_test.py", line 46, in main
    moveJobs(time)
  File "move_jobs_test.py", line 84, in moveJobs
    j=job.from_acct(acct)
  File "/share/home/01902/jlong591/work/tacc_stats/monitor/job.py", line 451, in from_acct
    job.gather_stats() and job.munge_times() and job.process_stats()
  File "/share/home/01902/jlong591/work/tacc_stats/monitor/job.py", line 305, in gather_stats
    self.error("no host list found\n", path)
  File "/share/home/01902/jlong591/work/tacc_stats/monitor/job.py", line 300, in error
    error('%s: ' + fmt, self.id, *args)
  File "/share/home/01902/jlong591/work/tacc_stats/monitor/job.py", line 17, in error
    msg = fmt % args
TypeError: not all arguments converted during string formatting
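The bottom frames show where the TypeError comes from: by the time `fmt % args` runs, the format string contains a single `%s` (for the job id) but receives two arguments (the id and the path). Below is a minimal reproduction, with `error` reduced to a stand-in that returns the message instead of reporting it. The job id, the path, and the corrected message text are made up for illustration and are not the change that went into job.py.

```python
def error(fmt, *args):
    # Stand-in for the module-level error() in job.py: formatting step only.
    return fmt % args


def job_error(job_id, fmt, *args):
    # Mirrors the method in the traceback: prepend the id, forward the rest.
    return error('%s: ' + fmt, job_id, *args)


job_id, path = "12345", "/hypothetical/hostfile/path"

failure = None
try:
    # One placeholder ('%s: '), two arguments (job_id, path).
    job_error(job_id, "no host list found\n", path)
except TypeError as exc:
    failure = str(exc)

# Giving the path its own placeholder makes the argument count match.
fixed = job_error(job_id, "no host list found: %s\n", path)
```

So the crash is in the error-reporting path itself; the underlying condition being reported is the missing host list for that job.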

This was found when I was trying to move 10 jobs for yet another test run. I think it has to do with the host list? I'm going to check other days and see if I get something similar. The day I ran was 1309496400.

Also, this happened at the start:

ln: creating symbolic link `/tmp/TS/stats/archive' to `/scratch/projects/tacc_stats/archive': Permission denied
ln: creating symbolic link `/tmp/TS/prolog_host_lists/hostfiles' to `/scratch/projects/tacc_stats/hostfiles': Permission denied

but when I check the dir where the links are being made they show up.
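One way to check whether the links that show up are actually usable is to confirm that each one is a symlink, points where expected, and resolves to something that exists. This is only a sketch: `check_link` is a hypothetical helper, and it does not explain why `ln` reported "Permission denied".

```python
import os


def check_link(link_path, expected_target):
    """Return 'ok', 'missing', 'not a link', 'wrong target', or 'dangling'."""
    # lexists() is True for a symlink even if its target is gone.
    if not os.path.lexists(link_path):
        return "missing"
    if not os.path.islink(link_path):
        return "not a link"
    if os.readlink(link_path) != expected_target:
        return "wrong target"
    # exists() follows the link, so it is False for a dangling link.
    if not os.path.exists(link_path):
        return "dangling"
    return "ok"


if __name__ == "__main__":
    # Link and target paths as they appear in the ln messages above.
    links = {
        "/tmp/TS/stats/archive": "/scratch/projects/tacc_stats/archive",
        "/tmp/TS/prolog_host_lists/hostfiles":
            "/scratch/projects/tacc_stats/hostfiles",
    }
    for link_path, target in sorted(links.items()):
        print("%s: %s" % (link_path, check_link(link_path, target)))
```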

I'll post if anything new develops

Jlong591 commented 12 years ago

Just tried the same test on another day and it worked fine. I think I can make a workaround for the error in my script now that I know what it is. Should I make changes to job.py to handle this error better?

Jlong591 commented 12 years ago

Never mind, got it. New commit coming soon.

Jlong591 commented 12 years ago

updateJobs.py is pushed. It updates the database from the .tar.gz files in /home/jlong591/job-data on tacc-stats. I put it in my tacc_stats_web/scripts folder. There's some information at the top of the script (similar to Andy's build_database.py) about exporting DJANGO_SETTINGS_MODULE and PYTHONPATH. The script takes a time option (-t, --time), given as an epoch timestamp; it untars that day's data, updates the database, then pushes the tar file to ranch.
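A rough sketch of the command-line and untar steps described above, not the contents of updateJobs.py. The `<timestamp>.tar.gz` file naming and the function names are assumptions made for illustration; the database update and the push to ranch are left out because the thread does not describe how they work.

```python
import argparse
import os
import tarfile


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Load one day of job data.")
    parser.add_argument("-t", "--time", type=int, required=True,
                        help="epoch timestamp of the day to load")
    return parser.parse_args(argv)


def missing_environment(environ=os.environ):
    """Return the names of the required environment variables that are unset."""
    required = ("DJANGO_SETTINGS_MODULE", "PYTHONPATH")
    return [name for name in required if name not in environ]


def untar_day(timestamp, data_dir, dest_dir):
    """Extract <data_dir>/<timestamp>.tar.gz into dest_dir.

    The archive naming is an assumption for this sketch. Returns the
    archive path so a later step could push the same file elsewhere.
    """
    archive = os.path.join(data_dir, "%d.tar.gz" % timestamp)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest_dir)
    return archive


if __name__ == "__main__":
    args = parse_args()
    unset = missing_environment()
    if unset:
        raise SystemExit("please export: " + ", ".join(unset))
    print(untar_day(args.time, "/home/jlong591/job-data", "."))
```

With this shape, a run for the day mentioned earlier in the thread would look like `python updateJobs.py -t 1309496400`.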