big-data-lab-umbc / Reproducible_and_portable_app_in_cloud

A toolkit to deploy, execute, analyze, and reproduce big data analytics automatically in the cloud.

Integrating RPAC toolkit (App.ini, Resource.ini, Personal.ini) with train.py script. #2

Closed SWAP1795 closed 1 year ago

SWAP1795 commented 1 year ago

Here are the steps for creating one more example for RPAC Toolkit using the cloud-phase-prediction tool.

  1. Run train.py on the local machine with the required data. To make sure the code runs properly, I ran the script locally, providing a local training_data path and a model_save path.
  2. Once the script runs successfully, it generates two output files: "model.pth" and "scaler.pkl".
  3. After this, I installed Docker for containerization. I created the Docker image with the command "docker build -t dock1 ." (the trailing "." is the build context, the folder that contains the Dockerfile).
  4. After the image was created, I moved all the prerequisite files into one folder. This folder contains the Dockerfile, training_data, train.py, application.ini, personal.ini, and resource.ini. The Dockerfile contains the instructions for building the image and the command that runs when a container starts.
  5. After that, I configured the initialization files with the required details.
  6. Note: in the personal.ini file, the cloud credentials field is not filled in, because I don't have permission to generate a cloud access key in the AWS Management Console.
  7. The main.py script has a function that reads the initialization files, but train.py has no equivalent. As a result, the initialization files cannot be used in this scenario. To make them useful, we need to add code to train.py that reads the ini files, so that the run is fully automated.
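
A minimal sketch of what the missing ini-loading code in train.py could look like, using Python's standard configparser module. This is not the toolkit's own loader from main.py. The function name load_settings, the section name [training], and the keys training_data and model_save are assumptions made for illustration; the real section and key names depend on how the ini files are laid out.

```python
import configparser
from pathlib import Path


def load_settings(config_dir,
                  filenames=("application.ini", "resource.ini", "personal.ini")):
    """Read the ini files in config_dir and return {section: {key: value}}.

    Hypothetical helper: the file names follow the list in step 4, but the
    sections and keys inside them are whatever the files actually define.
    """
    paths = [Path(config_dir) / name for name in filenames]

    # Fail early with a clear message instead of silently skipping a file
    # (ConfigParser.read ignores files it cannot open).
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"Missing ini files: {missing}")

    parser = configparser.ConfigParser()
    parser.read(paths)

    # Option names are lowercased by configparser's default settings.
    return {section: dict(parser[section]) for section in parser.sections()}
```

train.py could then call it at startup and pass the values on instead of hard-coding paths, for example (again with assumed section and key names):

```python
settings = load_settings(".")
training_data = settings["training"]["training_data"]
model_save = settings["training"]["model_save"]
```

Keeping the loader separate from the training code means the same script can run locally or inside the container, with only the ini files changing.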