emdeh / pdf-document-processor

The Model ID should be set relative to the selected statement type #6

Closed emdeh closed 3 months ago

emdeh commented 4 months ago

The function select_statement_type() lets the user specify which document type they are working with.

https://github.com/emdeh/pdf-document-processor/blob/cb5414a78a2193739ad979a60229cb0f8fb3e90e/src/prep_env.py#L90-L110

The config parameter in the function above is set to the returned value of the load_statement_config() function, which references the YAML file containing the statement field configurations.

https://github.com/emdeh/pdf-document-processor/blob/cb5414a78a2193739ad979a60229cb0f8fb3e90e/src/prep_env.py#L80-L87

The MODEL_ID variable, which is read from the .env file at runtime, should also be set relative to the statement field configuration that the user selects via select_statement_type(), instead of statically like this:

https://github.com/emdeh/pdf-document-processor/blob/cb5414a78a2193739ad979a60229cb0f8fb3e90e/src/main.py#L20

emdeh commented 3 months ago

Plan: set a .env variable for each document type, reference it in the YAML file, and expand select_statement_type() to set it.

YAML file expanded; .env expanded in the local environment. Still need to adjust the function.

emdeh commented 3 months ago

  1. Added an env_var variable to each document type in the yaml file, that references individual .env variables that hold the model IDs.
  2. When the user is prompted for the statement type by select_statement_type(), the env_var value is now returned.
  3. The value is passed as an argument to a new function set_model_id(), which takes the value and uses it to load and return the corresponding variable from the .env file.
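
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the statement type names, the environment variable names (`MODEL_ID_BANK`, `MODEL_ID_CARD`), and the `choice` parameter are all assumptions. A plain dict stands in for the parsed YAML file, the interactive prompt is replaced by an argument so the sketch runs non-interactively, and the .env file is assumed to have already been loaded into the process environment.

```python
import os

# Hypothetical stand-in for the parsed YAML statement configuration.
# The statement type keys and env_var names are illustrative assumptions.
STATEMENT_CONFIG = {
    "bank_statement": {"env_var": "MODEL_ID_BANK"},
    "credit_card": {"env_var": "MODEL_ID_CARD"},
}


def select_statement_type(config, choice):
    """Return the env_var name configured for the chosen statement type.

    The real function prompts the user; here the choice is passed in.
    """
    try:
        return config[choice]["env_var"]
    except KeyError:
        raise ValueError(f"Unknown statement type: {choice!r}") from None


def set_model_id(env_var):
    """Return the model ID stored under env_var in the process environment."""
    model_id = os.environ.get(env_var)
    if not model_id:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return model_id
```

Calling `set_model_id(select_statement_type(STATEMENT_CONFIG, "bank_statement"))` would then replace the static MODEL_ID lookup. Raising on a missing variable, rather than returning None, surfaces a misconfigured .env file before any document is processed.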

emdeh commented 3 months ago

Tested working.