pennsignals / dsdk

Data Science Deploy Kit
MIT License

Update release actions for pypi publication #71

Closed GrayEye closed 2 years ago

GrayEye commented 2 years ago

Working prototype. I have questions on the following subjects:

  1. Test: Why did it previously link a bunch of dirs?

Nomad containerization adds three root-level directories to every container's Linux filesystem at runtime:

/alloc /secrets /local

Secrets is in-memory only to protect against exposure on the host's (runner's) filesystem. This is where nomad+consul+vault writes secrets files.

Local is for configuration per microservice.

Alloc is shared across all containers that are part of the same job.

Docker on its own does not set these directories up the same way, so we symlink the relative paths to the absolute root paths in the container. The filesystem then looks the same whether the container runs under standalone local Docker or under Docker with Nomad's volumes.
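The symlinking described above can be sketched in Python. This is only an illustration, not the project's actual build step: `link_nomad_dirs` is a hypothetical helper, and it assumes the root-style path is the link and the project-relative directory is the target. The demo uses a throwaway directory in place of the real `/`.

```python
import tempfile
from pathlib import Path

# Directory names taken from the discussion above.
NOMAD_DIRS = ("alloc", "secrets", "local")


def link_nomad_dirs(project_dir: Path, root: Path) -> dict:
    """Create root/<name> symlinks pointing at project_dir/<name>.

    Hypothetical helper: in a real image, root would be Path("/").
    """
    links = {}
    for name in NOMAD_DIRS:
        source = (project_dir / name).resolve()
        source.mkdir(parents=True, exist_ok=True)
        link = root / name
        if link.is_symlink():
            link.unlink()  # replace a stale link from an earlier run
        # Raises FileExistsError if a real directory already occupies the path.
        link.symlink_to(source, target_is_directory=True)
        links[name] = link
    return links


# Demonstrate against a throwaway root instead of the real "/".
with tempfile.TemporaryDirectory() as tmp:
    project, root = Path(tmp, "project"), Path(tmp, "root")
    project.mkdir()
    root.mkdir()
    links = link_nomad_dirs(project, root)
    (project / "local" / "configuration.yaml").write_text("as_of: null\n")
    # The file written under the relative path is visible via the root-style path.
    print((links["local"] / "configuration.yaml").read_text())
```

Under Nomad the three directories already exist as mounts, so a step like this would only be needed for the standalone Docker case.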


  2. What are all the volumes for in Test?

Models reside on the fileshare, which is mounted on all worker nodes. This volume needs to be mounted inside the containers.

The gold file is for ensuring that SQL or code changes didn't accidentally break the model: does the model produce the same predictions for "old-enough" known AS_OF dates before and after the code changes? The gold file is a "gold-master-set" of pickled predictions, created by "create-gold" and checked with "validate-gold".
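A minimal sketch of the gold-file idea, assuming a deterministic model. The functions `create_gold` and `validate_gold` here are hypothetical stand-ins named after the commands above, not dsdk's implementation, and `predict` is a toy model invented for the example.

```python
import pickle
import tempfile
from pathlib import Path


def predict(as_of: str) -> list:
    """Toy deterministic model (hypothetical): scores derived from the AS_OF date."""
    seed = sum(ord(c) for c in as_of)
    return [round(((seed * k) % 97) / 97, 4) for k in (1, 2, 3)]


def create_gold(path, as_of_dates, model=predict) -> None:
    """Pickle the model's predictions for each known AS_OF date."""
    gold = {as_of: model(as_of) for as_of in as_of_dates}
    with open(path, "wb") as fout:
        pickle.dump(gold, fout)


def validate_gold(path, model=predict) -> list:
    """Return the AS_OF dates whose predictions no longer match the gold file."""
    with open(path, "rb") as fin:
        gold = pickle.load(fin)
    return [as_of for as_of, expected in gold.items() if model(as_of) != expected]


with tempfile.TemporaryDirectory() as tmp:
    gold_path = Path(tmp) / "gold.pkl"
    dates = ["2020-01-01", "2020-06-01"]
    create_gold(gold_path, dates)

    # Unchanged code: no mismatches.
    print(validate_gold(gold_path))

    # Simulate an accidental change to the model: every date now mismatches.
    broken = lambda as_of: [score + 1.0 for score in predict(as_of)]
    print(validate_gold(gold_path, model=broken))
```

The sketch compares predictions exactly; a real check might need a numeric tolerance if the model is not bit-for-bit reproducible across environments.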


  3. How will secrets be handled for EPIC and POSTGRES?

cfgenvy handles the /secrets/secrets.env file merge with the configuration file /local/configuration.yaml. The "contract" is that the env variable names expected by the configuration file must appear in the /secrets/secrets.env file.

The checked-in /local/tests.yaml often simply sets sensitive or EPIC components to null for the tests, so that a dummy /secrets/example.env file can be used.
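The "contract" between the env file and the configuration file can be illustrated with the standard library alone. This is not cfgenvy's API: `parse_env` and `render_config` are hypothetical helpers, and the `${NAME}` placeholder syntax and the variable names are assumptions made for the example.

```python
from string import Template


def parse_env(text: str) -> dict:
    """Parse NAME=value lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        env[name.strip()] = value.strip().strip('"')
    return env


def render_config(config_text: str, env: dict) -> str:
    """Fill ${NAME} placeholders; raises KeyError if the env file lacks a name."""
    return Template(config_text).substitute(env)


# Stand-ins for /secrets/secrets.env and /local/configuration.yaml.
secrets_env = "# dummy values\nPOSTGRES_USER=dsdk\nPOSTGRES_PASSWORD=hunter2\n"
configuration = (
    "postgres:\n"
    "  username: ${POSTGRES_USER}\n"
    "  password: ${POSTGRES_PASSWORD}\n"
)

print(render_config(configuration, parse_env(secrets_env)))

# A configuration that expects a name the env file does not define breaks the contract.
try:
    render_config("epic:\n  cookie: ${EPIC_COOKIE}\n", parse_env(secrets_env))
except KeyError as missing:
    print(f"missing secret: {missing}")
```

Setting a component to null in the test configuration removes its placeholders, which is why a dummy env file is enough for the tests.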

Closes: #69
Closes: #70
Closes: #12