LArbys / LArCV

Liquid Argon Computer Vision

production tools #70

Closed drinkingkazu closed 7 years ago

drinkingkazu commented 7 years ago

Started production tools development. Some goals:

- file tracking and association to batch jobs
- status tracking of batch jobs
- easy connection between one project and another, related by files (one's output files = the other's input files)
- should support condor, with minimal dependencies

Started a repository here: https://github.com/drinkingkazu/proddb. The bare-bones version uses a MySQL database without any dependency on larlite/larcv: purely Python libraries and scripts. Some bin scripts contain larcv/larlite-specific utilities; I'm not sure of the best way to organize them without adding cumbersome directory structures. Advice welcome.

The only current dependencies are MySQL, Python 2, and the MySQLdb Python module.
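To illustrate the file-tracking and status-tracking goals above, here is a minimal sketch of the kind of table such a system might use. This is not the actual proddb schema; the table, columns, and project names are hypothetical, and Python's built-in sqlite3 stands in for MySQL/MySQLdb so the snippet runs anywhere:

```python
# Minimal sketch of file/job-status tracking, using sqlite3 as a
# stand-in for MySQL. Table and column names are hypothetical,
# NOT the actual proddb schema.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One row per file of a project; status walks through
# pending -> running -> done (or failed).
cur.execute("""CREATE TABLE project_files (
                 file_id INTEGER PRIMARY KEY,
                 project TEXT NOT NULL,
                 path    TEXT NOT NULL,
                 status  TEXT NOT NULL DEFAULT 'pending',
                 job_id  INTEGER)""")

# Register two input files for a hypothetical 'supera' project.
cur.executemany(
    "INSERT INTO project_files (project, path) VALUES (?, ?)",
    [("supera", "/data/supera_0000.root"),
     ("supera", "/data/supera_0001.root")])

# A batch job (id 42) claims one pending file and marks it running,
# associating the file with the job.
cur.execute("""UPDATE project_files
               SET status='running', job_id=?
               WHERE file_id = (SELECT file_id FROM project_files
                                WHERE project=? AND status='pending'
                                ORDER BY file_id LIMIT 1)""",
            (42, "supera"))
conn.commit()
```

The same UPDATE-with-subquery pattern is how a worker can atomically claim the next pending file without racing other jobs.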

drinkingkazu commented 7 years ago

Added run_processordb to access files from database https://github.com/LArbys/LArCV/commit/eb4177a0eb88f69ad2fe2ea7cf337835434219dc

drinkingkazu commented 7 years ago

Added job_supera.py, job_merger.py, and job_larcv.py https://github.com/LArbys/LArCV/commit/87f51d46eab16735dc10413cdfa695e6064ce45e

These scripts are meant to run as executables on a batch worker. You can provide job_supera.py or job_larcv.py to proddb/bin/submit_onestream.py, and use job_merger.py with proddb/bin/submit_twostream.py (though this still needs some updates on the proddb side). These scripts automatically load input file locations from the database and copy the files locally. They also change the input project's status once a file is consumed, register output files into an output project if configured, and so on.
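The worker flow just described (load input location from the database, copy locally, mark the input consumed, register the output into an output project) can be sketched as follows. Everything here is illustrative: the schema, `run_one_job`, and the project names are hypothetical, sqlite3 stands in for MySQL, and the actual larcv/larlite processing step is elided:

```python
# Hypothetical sketch of the job_*.py worker flow: fetch an input
# file location from the DB, copy it to local disk, mark it consumed,
# and register the output file under an output project.
# Schema and names are illustrative, not the actual proddb code.
import os
import shutil
import sqlite3
import tempfile

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project_files (project TEXT, path TEXT, status TEXT)")

# Pretend an upstream project already produced one input file on disk.
srcdir = tempfile.mkdtemp()
src = os.path.join(srcdir, "input_0000.root")
open(src, "w").close()
conn.execute("INSERT INTO project_files VALUES ('input_proj', ?, 'pending')",
             (src,))

def run_one_job(conn, in_project, out_project, local_dir):
    # 1) load the input file location from the database
    (path,) = conn.execute(
        "SELECT path FROM project_files WHERE project=? AND status='pending'",
        (in_project,)).fetchone()
    # 2) copy the file to the worker's local disk
    local = shutil.copy(path, local_dir)
    # 3) change the input project status: the file is now consumed
    conn.execute("UPDATE project_files SET status='consumed' WHERE path=?",
                 (path,))
    # 4) ... run the actual larcv/larlite process on `local` here ...
    out = local + ".out"
    open(out, "w").close()
    # 5) register the output file into the output project
    conn.execute("INSERT INTO project_files VALUES (?, ?, 'pending')",
                 (out_project, out))
    conn.commit()
    return out

localdir = tempfile.mkdtemp()
out_path = run_one_job(conn, "input_proj", "output_proj", localdir)
```

The key design point mirrored here is that the database, not the submission script, is the source of truth: the worker discovers its input and records its output entirely through DB state, which is what makes chaining projects (one's output files = the other's input files) possible.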

Note that run_processordb is entirely different from these three scripts: it is for interactive use and does not modify database values.