GSTT-CSC / MLOps

Framework for building ML apps
GNU General Public License v3.0
9 stars 5 forks source link

.env file contains default values #17

Closed laurencejackson closed 3 years ago

laurencejackson commented 3 years ago

Need to remove default values to encourage users to provide their own, also a security risk if these are public. Documentation will need to be improved to help users fill these values.

VirenderNayal commented 3 years ago

Hi, I can help with this issue. Can you guide me as I am new to this repository. Thank you.

laurencejackson commented 3 years ago

Hi VirenderNayal, thanks for getting in touch. I'm happy for you to address this issue.

The readme file should give you a good overview of the project - if you have any questions then let me know so I can update the readme! The readme also has a section 'contributing' which will describe how to contribute to the project. Essentially you will need to clone the repository, create your feature branch, make your changes and open a pull request.

The issue we have here is that we have default values stored in the /mlflow_server/.env file. Having default values encourages users to not set the variables locally and can be a security risk when these variables are public.

The solution to this will be to remove all default values, but keep the variables names and assignment operator. Having no values here means that starting the server will crash at startup. So you will also need to update the documentation (readme) and optionally include some error handling mechanism to detect when the user has not set the required environment variables.

So to summarise, the tasks to resolve this issue are:

  1. Delete default variables in /mlflow_server/.env
  2. Update documentation (readme/getting started) to inform user to set their own values
  3. (optional) create system for catching or detecting when user has not set their own values in .env and inform them to do so

Many thanks!

Please let me know if you have any questions.

VirenderNayal commented 3 years ago

Thank you for assigning this issue to me. I am working on this and will create a Pull Request as soon as possible.

I will let you know if I have any questions.

VirenderNayal commented 3 years ago

Here are the purposed changes :

Changes in ./MLOps/mlflow_server/.env

# Example env file - fill all required values before using
MYSQL_DATABASE=mlflow_db
MYSQL_USER=mlflow
MYSQL_PASSWORD=strongpassword1
MYSQL_ROOT_PASSWORD=strongpassword1root?

void_MLFLOW_S3_ENDPOINT_URL=http://127.0.0.1:9002
MLFLOW_S3_ENDPOINT_URL=http://s3:9002
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
MINIO_ROOT_USER=
MINIO_ROOT_PASSWORD=

MLFLOW_S3_IGNORE_TLS=true

Changes in ./MLOps/README.md

### Installation

1. ...

2. Navigate to the cloned code repository and change environmental variables.They are compulsory to be set otherwise the server will crash.

`
AWS_ACCESS_KEY_ID=anyKeyId
AWS_SECRET_ACCESS_KEY=anySecretAccessKey
MINIO_ROOT_USER=minioUsername
MINIO_ROOT_PASSWORD=minioPassword
`

3. ...

Are these changes satisfies the requirements or anything to be modified.

Should I create the Pull Request with these changes?

Thank You.

laurencejackson commented 3 years ago

Thanks, this looks great! I would also suggest removing the MYSQL_USER, MYSQL_PASSWORD, and MYSQL_ROOT_PASSWORD defaults and adding them to the readme codeblock.

We might also need to add a bit more the README.md to explain that the text in the codeblock shows example values and the user should not use these examples.

Thanks for doing this, please can you open a pull request to merge into the develop branch and we can review/suggest changes in more detail.

VirenderNayal commented 3 years ago

@laurencejackson I have created a pull request #21 as required. Have a look and suggest changes, if any.

Thanks.