Open Costaki33 opened 2 years ago
Notes:
connect() is set up: it establishes the connection and returns it. Each function is passed the connection, creates its own cursor on the fly, and closes that cursor before returning.
Indexes have been created for each of the other columns. JobID is now read in as a string. I decided to leave State as a varchar, since a simple search on the state name (COMPLETED, FAILED, etc.) would be easiest. I could not index the Nodelist column, as it is a TEXT type: MySQL can only index a prefix (the first N characters) of a TEXT column, so it can't guarantee uniqueness because the length of the values keeps changing. I decided to leave it with no index -> a specific node can still be searched for easily.
Properly converted the datetime columns from their local time (still unsure - waiting on Joe to reply) to UTC. Also added if-statement cases for the different SLURM time formats in the Timelimit column (working) and converted all values in the Timelimit column to minutes (max-minutes).
All data is properly inserted into the table and can be queried in MySQL.
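A minimal sketch of the Timelimit conversion described above, not the actual code from this work. It assumes the formats SLURM documents for time values (`minutes`, `minutes:seconds`, `hours:minutes:seconds`, `days-hours`, `days-hours:minutes`, `days-hours:minutes:seconds`). The function name `timelimit_to_minutes` is hypothetical, and rounding leftover seconds up to the next minute is an assumption about what "max-minutes" means. Non-numeric limits such as `UNLIMITED` or `Partition_Limit` are returned as `None`.

```python
def timelimit_to_minutes(value):
    """Convert a SLURM time string to whole minutes.

    Returns None for non-numeric limits (e.g. UNLIMITED, Partition_Limit).
    Leftover seconds are rounded up to the next minute (an assumption).
    """
    text = value.strip()
    if text == "" or text.upper() in {"UNLIMITED", "PARTITION_LIMIT"}:
        return None

    days = 0
    if "-" in text:
        # days-hours[:minutes[:seconds]]
        day_part, rest = text.split("-", 1)
        days = int(day_part)
        parts = [int(p) for p in rest.split(":")]
        if len(parts) > 3:
            raise ValueError(f"unrecognized time format: {value!r}")
        parts += [0] * (3 - len(parts))
        hours, minutes, seconds = parts
    else:
        parts = [int(p) for p in text.split(":")]
        if len(parts) == 1:        # minutes
            hours, minutes, seconds = 0, parts[0], 0
        elif len(parts) == 2:      # minutes:seconds
            hours, minutes, seconds = 0, parts[0], parts[1]
        elif len(parts) == 3:      # hours:minutes:seconds
            hours, minutes, seconds = parts
        else:
            raise ValueError(f"unrecognized time format: {value!r}")

    total_seconds = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    return -(-total_seconds // 60)  # ceiling division
```

For example, `timelimit_to_minutes("1-00:00:00")` gives 1440 and `timelimit_to_minutes("02:30:00")` gives 150.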
UPDATE:
Add debug statements to see how long the data load takes.
Point to a directory and load into the database all the data from the files it hasn't processed yet. Admins -> they move the new files to a new directory and we read those in. No archive. Grab the new ones, read them in, and delete them.
Christian's idea: keep a list of all the HPC names and directory paths, SSH into each HPC in a loop, and grab the files (or scp them) based on their position relative to the last file that was read in.
Initial load:
For each update, above
Contact Virginia
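One way the directory-based load and the timing debug statements from the UPDATE notes could look, as a rough sketch. The names `load_new_files` and `load_file` are hypothetical; `load_file` stands in for whatever function actually parses a file and inserts it into the database. A file is deleted only after its load returns without raising, so a failed file stays in place for the next run.

```python
import logging
import time
from pathlib import Path

log = logging.getLogger("loader")


def load_new_files(incoming_dir, load_file):
    """Load every file in incoming_dir, then delete it (no archive).

    load_file is a callable taking a Path; it is a placeholder for the
    real parse-and-insert step. Returns the names of the loaded files.
    """
    loaded = []
    for path in sorted(Path(incoming_dir).iterdir()):
        if not path.is_file():
            continue
        start = time.perf_counter()
        load_file(path)  # raises on failure, so the file is kept
        elapsed = time.perf_counter() - start
        log.debug("loaded %s in %.3f s", path.name, elapsed)
        path.unlink()  # delete only after a successful load
        loaded.append(path.name)
    return loaded
```

Calling `logging.basicConfig(level=logging.DEBUG)` before the run would print the per-file timings.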
[x] isConnected()
[x] Need to implement complete list of indexes
[x] Create cursors on the fly, pass in the connection, create cursor, close cursor in the respective function
[x] Injection
[ ] Injection Improvements
[ ] On the command line
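For the cursor and injection items in the checklist, a minimal sketch of the pattern: every function is passed the connection, opens its own cursor, closes it, and hands all values to the driver as query parameters instead of building SQL strings. This is not the project's code. It uses Python's built-in `sqlite3` as a stand-in so it runs without a MySQL server, and the `jobs` table and function names are illustrative. With a MySQL driver such as mysql-connector-python or PyMySQL the placeholder is `%s` rather than `?`.

```python
import sqlite3
from contextlib import closing


def connect(database=":memory:"):
    """Establish the connection and return it (sqlite3 stand-in)."""
    return sqlite3.connect(database)


def create_table(conn):
    # Cursor is created on the fly and closed when the block exits.
    with closing(conn.cursor()) as cur:
        cur.execute("CREATE TABLE IF NOT EXISTS jobs (JobID TEXT, State TEXT)")
        cur.execute("CREATE INDEX IF NOT EXISTS idx_jobs_state ON jobs (State)")
    conn.commit()


def insert_job(conn, job_id, state):
    with closing(conn.cursor()) as cur:
        # Values are bound as parameters, never formatted into the SQL text.
        cur.execute(
            "INSERT INTO jobs (JobID, State) VALUES (?, ?)",
            (job_id, state),
        )
    conn.commit()


def jobs_in_state(conn, state):
    with closing(conn.cursor()) as cur:
        cur.execute(
            "SELECT JobID FROM jobs WHERE State = ? ORDER BY JobID",
            (state,),
        )
        return [row[0] for row in cur.fetchall()]
```

Because the values are bound by the driver, a hostile string such as `x'; DROP TABLE jobs; --` is stored or compared as plain data and never executed as SQL.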