Because
When training a model, the evaluation is currently embedded in the train_model() method:
https://github.com/clamsproject/app-swt-detection/blob/29b0ca91dad07b0f58682a4b6a874ed3c556e413/modeling/train.py#L308-L317
This makes sense because the data are available at that time.
It does, however, create a strong coupling between training and evaluation, and it also creates some redundancies. For example, post-binning does not impact training, so you could imagine testing several post-binning strategies on the same model created after pre-binning; in that case we would have to rebuild the model repeatedly, even though it is identical.
It would be good to have a separate evaluation module that gets called from here and that can also be used for standalone evaluation. The post-binning could also be relegated to a utility module; a sketch of what that split could look like follows below.
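A minimal sketch of the idea, for discussion only. None of these names exist in the repo; `apply_post_binning()`, `evaluate()`, and `compare_post_binnings()` are hypothetical, and the accuracy metric just stands in for whatever metrics train_model() currently computes:

```python
# Hypothetical layout: post-binning as a pure utility function and
# evaluation as a standalone module that train_model() can call, but
# that also works on predictions loaded from disk.

def apply_post_binning(labels, binning):
    """Map fine-grained labels to post-bins. A pure function, so it
    never touches the model or the training loop."""
    return [binning.get(label, label) for label in labels]


def evaluate(predictions, gold_labels):
    """Simple label accuracy as a placeholder metric; standalone, so
    it can be reused outside train_model()."""
    if not gold_labels:
        return 0.0
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)


def compare_post_binnings(predictions, gold_labels, binnings):
    """Because post-binning does not impact training, one model trained
    after pre-binning can be scored against several post-binning
    strategies without being rebuilt."""
    return {
        name: evaluate(apply_post_binning(predictions, binning),
                       apply_post_binning(gold_labels, binning))
        for name, binning in binnings.items()
    }
```

With a split like this, train_model() would run inference once per pre-binned model and hand the predictions to compare_post_binnings(), so each model is built exactly once no matter how many post-binning strategies are tested.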
Done when
No response
Additional context
No response