NHMDenmark / DaSSCo-Integration

This repo will include the integration of the DaSSCo storage from northtec.

Computerome usage test #64

Open Baeist opened 2 months ago

Baeist commented 2 months ago

Set up a test for 50 tif files/assets to run through the barcode detection module.

First, try it as individual jobs, then as a job array. We want to do this to help us figure out the best way of using Computerome resources; it is not immediately obvious which approach fits our needs best. Keep in mind that we will likely automate the process through a looping job script that calls an API for the information needed to acquire an asset and start running the pipeline modules.

Requires adding barcode reading module scripts to computerome.

Requires moving 50 tif files onto computerome.

Requires a looping script to start the individual jobs.
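A minimal sketch of such a looping submission script, assuming `qsub` as the scheduler command; the `run_barcode.sh` wrapper name, the data path, and the resource numbers are placeholders to be tuned, not values from the actual setup:

```shell
#!/bin/bash
# Submit one PBS job per TIF file in a folder (sketch, not the real module).
submit_individual_jobs() {
    local data_dir="$1"
    local submit="${SUBMIT_CMD:-qsub}"   # override with SUBMIT_CMD=echo for a dry run

    for tif in "$data_dir"/*.tif; do
        [ -e "$tif" ] || continue        # skip if the glob matched nothing
        local name="job_$(basename "$tif" .tif)"
        # one small job per file; resource numbers are placeholders
        "$submit" -N "$name" -l nodes=1:ppn=1,mem=4gb,walltime=00:30:00 \
            -v TIF_FILE="$tif" run_barcode.sh
    done
}

# e.g. (assumed path): submit_individual_jobs /home/projects/ku_00273/data/fifty
```

A dry run with `SUBMIT_CMD=echo` prints the 50 `qsub` command lines instead of submitting them, which is handy for checking the loop before burning queue time.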

Requires a script that loops through the file folder and adds jobs to a job array. We probably want to check different job array sizes too.

Requires a Python environment. Load the Anaconda module and then upgrade Python to 3.10 to make it compatible with everything else we have.

Requires rewriting the barcode reading module to fit Computerome (I think we want to cut down on the number of jobs we queue). This includes setting up the right directory structure for the module.

Measure the resources needed: nodes, processors, type of nodes, memory. Ultimately this can be compared directly with our total usage before and after.

Measure the time used, including time spent running, time spent waiting in the queue, and time before entering the queue.

@bhsi-snm @ThomasAlscher1991

Baeist commented 2 months ago

Prepare Barcode Detection Module: Ensure that the barcode detection module scripts are available on Computerome. If they are not already installed, we'll need to transfer them to Computerome. Make any necessary adjustments to the scripts to ensure compatibility with the Computerome environment and to optimize resource usage.

Transfer TIF Files: Transfer the 50 TIF files onto Computerome. We can use FileZilla to transfer the files to the appropriate directory on Computerome.

Done. They are located in our project under data/fifty

Python Environment Setup: Load the Anaconda module on Computerome to set up the Python environment. Upgrade Python to version 3.10 if necessary to ensure compatibility with the barcode detection module and other dependencies (not sure if barcode needs this, but the storage SDK does, and it would make sense to use the same version for everything; the default is 2.7, and 3.6 is also available).

```shell
module load anaconda3/4.4.0
conda create --name condaenv
source activate condaenv
```

Directory Structure Setup: Set up the appropriate directory structure for the barcode detection module on Computerome to ensure organized storage and access to input/output files.

Looping Script: Write a looping script that iterates through the file folder containing the TIF files and submits jobs accordingly. Ensure that the looping script handles job submission efficiently and optimizes resource usage.

Individual Job Testing: Write a script that submits each TIF file as a separate job to run through the barcode detection module. Monitor resource usage (nodes, processors, memory) for each individual job.

Job Array Testing: Write a script that submits a job array to process all 50 TIF files. Experiment with different job array sizes to optimize resource usage and job completion time.

Resource Measurement: Monitor resource usage (nodes, processors, memory) during individual job testing and job array testing. Record the total time taken for each testing scenario, including job execution time and queue waiting time. Compare resource usage and time measurements with our organization's total usage before and after implementing the barcode detection module to evaluate its impact on resource consumption.
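For the measurement step, one way to make runs comparable is to pull just the fields we care about out of `qstat -f` output. A sketch, assuming Torque's `resources_used.*` field naming (the exact fields available on Computerome should be verified):

```shell
#!/bin/bash
# Extract memory and walltime actually used from "qstat -f" text (sketch).
extract_usage() {
    # reads qstat -f output on stdin, prints "mem walltime"
    awk -F' = ' '
        /resources_used.mem/      { mem = $2 }
        /resources_used.walltime/ { wt  = $2 }
        END { print mem, wt }
    '
}

# typical use on the cluster (not run here):
# qstat -f "$jobid" | extract_usage >> usage.log
```

Appending one such line per job to a log gives a simple table to compare the individual-job and job-array scenarios.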

Baeist commented 2 months ago

Created a script that looks in a folder and creates an array of the files there. This can then be matched with the PBS job array ID and given as an argument to a Python script, which allows us to make use of job arrays. It's not very dynamic (the exact folder, number of files, and array size are fixed), but it is good enough for testing here. It looks much better not to queue a new job each time, since each submission adds a request wait time that can be quite long; the array only incurs that once. The proof-of-concept script is located in /home/projects/ku_00273/apps/tests/iterate.sh, and the easy test data is in the /data dir in /tests.
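The core idea of the script can be sketched like this, assuming Torque's `PBS_ARRAYID` index variable (PBS Pro calls it `PBS_ARRAY_INDEX`); the Python entry point name is a placeholder, not the actual module:

```shell
#!/bin/bash
# Map an array sub-job's index to one file in a folder (sketch of iterate.sh).
pick_file() {
    local data_dir="$1"
    local index="$2"              # normally $PBS_ARRAYID inside the sub-job
    local files=("$data_dir"/*.tif)   # glob expands in sorted order
    echo "${files[$index]}"
}

# inside the array job script (hypothetical names):
# tif=$(pick_file /home/projects/ku_00273/apps/tests/data "$PBS_ARRAYID")
# python barcode_module.py "$tif"
```

Because the glob expands in sorted order, every sub-job builds the same list and index N always maps to the same file.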

The job array itself does not get any output or error files, so I created a log file for it that contains the combined qstat information for each job in the array (the info we want to use for comparing RSS usage). The jobs just use the default output/error file system, and we would want to change that later, I assume.

Baeist commented 1 month ago

We can automate the download of files to Computerome through jobs, but we can't actively upload files without using their 2FA.

bhsi-snm commented 1 month ago

Does that mean we need a script running continuously on Computerome that will keep checking if there is a need to trigger the download_files script to get files from the Ingestion server to Computerome?

Baeist commented 1 month ago

I think so, yes. In general, I think we need a master job script that checks, by contacting other services, which jobs should be queued. This master script will need to either restart itself or reschedule itself once it has finished setting up the other jobs.
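One possible shape for that master loop, sketched with a hypothetical API endpoint and job script name; the `qsub "$0"`-style self-resubmission is a common PBS pattern for a job that reschedules itself:

```shell
#!/bin/bash
# Self-rescheduling master job (sketch; API URL and job names are placeholders).
master_loop() {
    local api_url="$1"   # hypothetical endpoint: non-empty body means work exists
    local self="$2"      # path to this script, so it can reschedule itself
    local submit="${SUBMIT_CMD:-qsub}"
    local fetch="${FETCH_CMD:-curl -s}"

    local pending
    pending=$($fetch "$api_url")
    if [ -n "$pending" ]; then
        "$submit" barcode_array_job.sh   # queue the actual work
    fi
    # in a real job, a sleep or a delayed-start directive would throttle this
    "$submit" "$self"                    # reschedule this master script
}

# e.g. (hypothetical): master_loop "https://example.org/api/pending-assets" master.sh
```

Keeping the API query and submission commands injectable (`FETCH_CMD`, `SUBMIT_CMD`) makes the loop dry-runnable outside the cluster.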