Configurations issues

marcverhagen commented 11 months ago

Because

Issue to keep track of configuration decisions and questions.

We decided to have two configuration files: one with settings for the model and one with run-time parameters.

The example configuration file modeling/config/classifier-full.yaml has both of them (for now), including some settings for binning:

prebin: 
  - bars
  - slate
  - chyron
  - text-not-chyron
  - person-not-chyron
  - credit
  - other
labels: [ "slate", "chyron", "credit", "other" ]
postbin:
  0: 3
  1: 0
  2: 1
  3: 3
  4: 3
  5: 2
  6: 3
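To make the binning settings above concrete, here is a minimal sketch of one way to read them. It assumes that each key of postbin is an index into the prebin list and each value is an index into labels. That reading is an interpretation of the example, not a description of the actual training code, and final_label is a hypothetical helper.

```python
# Values copied from the example configuration in this issue.
prebin = ["bars", "slate", "chyron", "text-not-chyron",
          "person-not-chyron", "credit", "other"]
labels = ["slate", "chyron", "credit", "other"]
postbin = {0: 3, 1: 0, 2: 1, 3: 3, 4: 3, 5: 2, 6: 3}


def final_label(prebin_label: str) -> str:
    """Map a prebin label to its final label (hypothetical helper).

    Assumes postbin keys index into prebin and values index into labels.
    """
    return labels[postbin[prebin.index(prebin_label)]]


# Full prebin -> final label table under that assumption.
mapping = {name: final_label(name) for name in prebin}
```

Under this reading, slate, chyron and credit keep their own label and every other prebin entry collapses into "other".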

Currently, the training code just generates the following (this is for a different model, but the idea should be clear):

bins: {'pre': {'slate': ['S'], 'chyron': ['I', 'N', 'Y'], 'credit': ['C']}, 'post': {'sladit': ['slate', 'credit'], 'chyron': ['chyron']}}
labels: ['sladit', 'chyron', 'other']
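As a rough illustration of how the two levels in that generated structure compose, the sketch below builds a lookup from raw annotation label to final label. It assumes that anything not covered by a post bin falls back to "other". The helper names build_lookup and classify are hypothetical and are not taken from the training code.

```python
# Structure copied from the generated output quoted in this issue.
bins = {
    "pre": {"slate": ["S"], "chyron": ["I", "N", "Y"], "credit": ["C"]},
    "post": {"sladit": ["slate", "credit"], "chyron": ["chyron"]},
}
labels = ["sladit", "chyron", "other"]


def build_lookup(bins: dict, fallback: str = "other") -> dict:
    """Compose the pre and post bins into one raw -> final label table."""
    # Invert each level: member -> name of the bin that contains it.
    pre_of_raw = {raw: pre for pre, raws in bins["pre"].items() for raw in raws}
    post_of_pre = {pre: post for post, pres in bins["post"].items() for pre in pres}
    return {raw: post_of_pre.get(pre, fallback) for raw, pre in pre_of_raw.items()}


lookup = build_lookup(bins)


def classify(raw_label: str) -> str:
    """Return the final label, using 'other' for unlisted raw labels (assumption)."""
    return lookup.get(raw_label, "other")
```

For example, classify("C") goes through the "credit" pre bin and ends up in "sladit", while a raw label that appears in no pre bin ends up in "other".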

The question is which format we prefer for the config file. I would prefer something like

labels:
  - sladit
  - chyron
  - other
bins:
  - sladit
    - slate
    - credit
  - chyron
    - chyron
  - other

Or even

labels:
  - sladit
    - slate
    - credit
  - chyron
    - chyron
  - other

It is compact yet has all the information needed, though I might be wrong about the latter.
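A small sketch of why the compact form could be enough: both the label list and the bins can be recovered from it. This assumes the nested entries are written as YAML mappings (for example "- sladit:" followed by an indented list), since a nested list directly under a plain list item would not be read as a sub-list by a YAML parser. The compact value below is the structure a YAML loader would return for that variant, and expand is a hypothetical helper.

```python
# What a YAML loader would return for the compact form, assuming labels with
# members are written as single-key mappings and "other" as a plain string.
compact = [
    {"sladit": ["slate", "credit"]},
    {"chyron": ["chyron"]},
    "other",
]


def expand(compact: list) -> tuple:
    """Split the compact form into (labels, bins). Hypothetical helper."""
    labels, bins = [], {}
    for entry in compact:
        if isinstance(entry, dict):
            # A label with explicit members, e.g. {"sladit": ["slate", "credit"]}.
            for name, members in entry.items():
                labels.append(name)
                bins[name] = list(members)
        else:
            # A bare label with no members, e.g. "other".
            labels.append(entry)
    return labels, bins


labels, bins = expand(compact)
```

The order of labels follows the order of the entries, so the compact form also fixes the label indices.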

I am also somewhat uncomfortable with the "other" label.

Done when

No response

Additional context

No response

marcverhagen commented 11 months ago

One change we agreed on was to do a file copy of the trainer configuration rather than dumping the loaded configuration as a YAML file, since the latter does not preserve comments.
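A minimal sketch of the file-copy approach, using only the standard library. The function name save_config_copy, the file name trainer.yaml and the results directory are made up for illustration and do not come from the repository.

```python
import shutil
import tempfile
from pathlib import Path


def save_config_copy(config_path, output_dir) -> Path:
    """Copy the configuration file byte for byte so that comments survive."""
    target = Path(output_dir) / Path(config_path).name
    shutil.copyfile(config_path, target)
    return target


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "trainer.yaml"
    source.write_text("# keep this comment\nlabels: [slate, chyron]\n")
    out_dir = Path(tmp) / "results"
    out_dir.mkdir()
    copied = save_config_copy(source, out_dir)
    copied_text = copied.read_text()
```

Because the file is copied rather than parsed and re-serialized, the comment line and the original formatting end up unchanged in the copy.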

keighrim commented 10 months ago

The original issue was resolved by 39d859f34a56e0a8499a5a716e904747adc8f029...570ca0c764ec1c0b7249fab6acff27a1548111cf.

clamsproject / app-swt-detection

Configurations issues #39
