angelolab / Nimbus

Write script for model validation and metric calculation to automatically generate plots and pred cell_table #45

Closed JLrumberger closed 1 year ago

JLrumberger commented 1 year ago

Instructions

We want a script that does all model validation steps for us given one or more tfrecord datasets. Steps include the calculation of

Plotting of

Relevant background

This feature would make it easy to automatically calculate performance metrics for different experiments and datasets. Some of the required functions are already available in metrics.py.

Design overview

The script needs to perform the following steps:

  1. Load model and validation data
  2. Load external validation data (i.e. proof read samples)

For each dataset:

  1. Predict samples, post-process the predictions into a cell_table, and save it as CSV
  2. Calculate scores (F1, precision, recall, specificity) from the cell_table and draw a facet plot
  3. Calculate the scores for each marker individually and plot them as a heatmap
  4. Plot the N worst tiles with input, ground truth, and prediction next to each other
  5. Store all results in a sub-folder of the experiment folder, named after the tfrecord dataset file without its suffix
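As a rough illustration of steps 2 and 3, the per-marker scores could be derived from confusion counts as sketched below. This is not the code in metrics.py: the row keys `marker`, `gt` and `pred`, the 0.5 threshold, and the function names are all assumptions made for the example, and the real cell_table columns may differ.

```python
from collections import defaultdict


def confusion_counts(rows, threshold=0.5):
    """Count TP/FP/TN/FN per marker.

    Each row is a dict with the hypothetical keys 'marker' (str),
    'gt' (0 or 1) and 'pred' (score in [0, 1]).
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for row in rows:
        predicted_positive = row["pred"] >= threshold
        actual_positive = row["gt"] == 1
        if predicted_positive and actual_positive:
            key = "tp"
        elif predicted_positive:
            key = "fp"
        elif actual_positive:
            key = "fn"
        else:
            key = "tn"
        counts[row["marker"]][key] += 1
    return dict(counts)


def safe_div(numerator, denominator):
    """Return NaN instead of raising when a score is undefined."""
    return numerator / denominator if denominator else float("nan")


def scores(c):
    """Compute precision, recall, specificity and F1 from one count dict."""
    return {
        "precision": safe_div(c["tp"], c["tp"] + c["fp"]),
        "recall": safe_div(c["tp"], c["tp"] + c["fn"]),
        "specificity": safe_div(c["tn"], c["tn"] + c["fp"]),
        "f1": safe_div(2 * c["tp"], 2 * c["tp"] + c["fp"] + c["fn"]),
    }


def scores_per_marker(rows, threshold=0.5):
    """Map each marker name to its score dict (one heatmap row per marker)."""
    return {m: scores(c) for m, c in confusion_counts(rows, threshold).items()}
```

The resulting marker-to-scores mapping is already in the shape a heatmap needs (markers as rows, metrics as columns). Summing the counts over all markers before calling `scores` would give the dataset-level numbers for step 2.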

Code mockup

Model and dataset loading and metrics calculation are already implemented in metrics.py. Plotting functionality is partly implemented in plot_utils.py. The missing pieces need to be identified and added.

Required inputs

Parameters (params.toml), model weights (best_model.pkl), and one or more tfrecord files
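One way the script could accept these inputs is a small command-line interface, sketched here with argparse. The flag names `--params`, `--weights` and `--tfrecords` are hypothetical and not taken from the repository; the only point is that the tfrecord argument takes one or more paths.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a CLI parser for the validation script (flag names are assumptions)."""
    parser = argparse.ArgumentParser(
        description="Validate a trained model on one or more tfrecord datasets."
    )
    parser.add_argument("--params", type=Path, required=True,
                        help="Path to params.toml")
    parser.add_argument("--weights", type=Path, required=True,
                        help="Path to the model weights, e.g. best_model.pkl")
    # nargs="+" requires at least one dataset and collects them into a list.
    parser.add_argument("--tfrecords", type=Path, nargs="+", required=True,
                        help="One or more tfrecord dataset files")
    return parser
```

Calling `build_parser().parse_args()` in the script's entry point would then yield `args.tfrecords` as a list that the per-dataset loop can iterate over.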

Output files

A folder containing cell_table.csv, the plots, and the metrics stored as CSV files.
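A minimal sketch of how the output location might be derived and a metrics file written, assuming the sub-folder is named after the tfrecord file's stem as described in the design overview. The helper names and the `metrics.csv` file name are illustrative assumptions, not existing functions in the repository.

```python
import csv
from pathlib import Path


def result_dir(experiment_dir, tfrecord_path):
    """Return <experiment_dir>/<tfrecord file name without its suffix>."""
    return Path(experiment_dir) / Path(tfrecord_path).stem


def write_metrics_csv(out_dir, metrics_per_marker, filename="metrics.csv"):
    """Write one CSV row per marker; creates the output folder if needed.

    metrics_per_marker maps a marker name to a dict of metric name -> value.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Collect the metric names in first-seen order to use as CSV columns.
    metric_names = []
    for metrics in metrics_per_marker.values():
        for name in metrics:
            if name not in metric_names:
                metric_names.append(name)
    out_path = out_dir / filename
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["marker"] + metric_names)
        writer.writeheader()
        for marker, metrics in metrics_per_marker.items():
            writer.writerow({"marker": marker, **metrics})
    return out_path
```

Note that `Path.stem` strips only the last suffix, so a file such as `val.tfrecord.gz` would map to a folder named `val.tfrecord`; whether that case matters depends on how the datasets are named.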

Timeline

Give a rough estimate for how long you think the project will take. In general, it's better to be too conservative rather than too optimistic.

Estimated date when a fully implemented version will be ready for review:

Estimated date when the finalized project will be merged in: 01/23