Suggested code changes - Githubissues

nicholas-denis commented 3 months ago

Things are looking good. Given how little supervision and feedback I have provided, you have implemented quite a bit and done a great job. I wish I had more time to interact with you and work on this; unfortunately, that hasn't been the case. That said, I have looked at the code and noticed a few places where we should make some changes. I will do my best to list them here:

compute_ci() inside of ppi.py
- let's add PPI++ in here
- let's prepare ourselves for stratified PPI
- Recommendation: let's write smaller helper functions that do the CI computations for PPI, PPI++, naive, classical, etc.
- inside compute_ci(), rather than coding it all there, we call those functions, and those functions handle the raw compute
- for example:

    if config['experiment']['estimate'] == 'mean':
        # PPI CI
        ppi_theta = ppi_py.ppi_mean_pointestimate(y_gold, y_gold_fitted, y_fitted)
        ppi_theta_ci = ppi_py.ppi_mean_ci(y_gold, y_gold_fitted, y_fitted)

becomes:

    ppi_theta, ppi_theta_ci = do_ppi_estimate_and_ci(y_gold, y_gold_fitted, y_fitted)

where

    import ppi_py

    def do_ppi_estimate_and_ci(y_gold, y_gold_fitted, y_fitted, total_estimate=False):
        if total_estimate:
            # TODO: when we want to estimate for totals / proportions.
            # Raising here avoids returning undefined names in the meantime.
            raise NotImplementedError("total / proportion estimates are not implemented yet")
        else:
            ppi_theta = ppi_py.ppi_mean_pointestimate(y_gold, y_gold_fitted, y_fitted)
            ppi_theta_ci = ppi_py.ppi_mean_ci(y_gold, y_gold_fitted, y_fitted)
        return ppi_theta, ppi_theta_ci

build similar functions for naive, classical, ppi++ and a placeholder for stratified PPI
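
A minimal sketch of how the helpers and the dispatcher could fit together. To stay dependency-free it hand-rolls the mean estimates with normal-approximation intervals instead of calling ppi_py; in the real code each helper would presumably wrap the matching ppi_py call, as in the example above. The helper names other than do_ppi_estimate_and_ci, the `technique` and `lam` arguments, and the reading of "classical" as gold labels only and "naive" as predictions treated as labels are all assumptions, not taken from ppi.py.

```python
from statistics import NormalDist, fmean, variance


def _normal_ci(estimate, std_err, alpha):
    """Two-sided normal-approximation interval around `estimate`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (estimate - z * std_err, estimate + z * std_err)


def do_classical_estimate_and_ci(y_gold, alpha=0.1):
    # Assumption: "classical" uses only the gold-labelled sample.
    theta = fmean(y_gold)
    std_err = (variance(y_gold) / len(y_gold)) ** 0.5
    return theta, _normal_ci(theta, std_err, alpha)


def do_naive_estimate_and_ci(y_fitted, alpha=0.1):
    # Assumption: "naive" treats model predictions as if they were labels.
    return do_classical_estimate_and_ci(y_fitted, alpha)


def do_ppi_estimate_and_ci(y_gold, y_gold_fitted, y_fitted, alpha=0.1, lam=1.0):
    # lam=1.0 is plain PPI. A PPI++ helper would pick lam from the data;
    # that tuning step is deliberately left out of this sketch.
    n, n_unlabeled = len(y_gold), len(y_fitted)
    theta = fmean(y_gold) + lam * (fmean(y_fitted) - fmean(y_gold_fitted))
    residuals = [y - lam * f for y, f in zip(y_gold, y_gold_fitted)]
    var = lam ** 2 * variance(y_fitted) / n_unlabeled + variance(residuals) / n
    return theta, _normal_ci(theta, var ** 0.5, alpha)


def do_stratified_ppi_estimate_and_ci(*args, **kwargs):
    # Placeholder until stratified PPI is designed.
    raise NotImplementedError("stratified PPI is not implemented yet")


def compute_ci(technique, y_gold, y_gold_fitted, y_fitted, alpha=0.1):
    """Dispatch to the helper for `technique`; no raw compute lives here."""
    if technique == "classical":
        return do_classical_estimate_and_ci(y_gold, alpha)
    if technique == "naive":
        return do_naive_estimate_and_ci(y_fitted, alpha)
    if technique == "ppi":
        return do_ppi_estimate_and_ci(y_gold, y_gold_fitted, y_fitted, alpha)
    if technique == "stratified_ppi":
        return do_stratified_ppi_estimate_and_ci(y_gold, y_gold_fitted, y_fitted)
    raise ValueError(f"unknown technique: {technique!r}")
```

Every helper returns the same `(estimate, (ci_low, ci_high))` shape, so compute_ci() does not need to know which technique it is running.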

It is unclear to me why we have "primary" and "secondary" metrics?
- can you please explain why this is required in your code so far? From what I can tell, it is because only "primary_means" goes to plotting...
- Feel free to share your intuition on this - we can discuss, but I think this should be combined and we just have a single set of metrics

The dataframe (and csv) file. Right now, these are the columns:
ppi_widths,naive_widths,classical_widths,ppi_preds,ppi_lowers,ppi_uppers,naive_preds,naive_lowers,naive_uppers,classical_preds,classical_lowers,classical_uppers,ppi_coverages,naive_coverages,classical_coverages,rho,iteration
- My thoughts: I think we can simplify the table, but at the expense/overhead of having to change how the dataframe is being created/grown, and therefore likely how the metrics are represented over the course of the experiment
- Here are the table columns I would like: | estimate | true_parameter | ci_low | ci_high | ci_width | empirical_coverage | desired_coverage | noise | technique | model | repeat |

Explanation:
- technique will be a string in {naive, classical, ppi, ppi++} that tells us what this row represents in terms of technique
- model will be a string (or None) in {linear_regression, decision_tree, random_forest, ...}. This tells us the model class used for PPI.
- repeat is an integer; if num_its is 100, then this will be integers in 0, 1, ..., 99
- desired_coverage comes from the config file
- empirical_coverage, the ci columns and estimate are obvious
- true_parameter is what we expect the estimate to be (the population parameter we are trying to estimate)
- noise here is the rho value, but I made the column name more general, for when we do tabular data, where we won't use rho

My suggestion on how to implement this:

- at the start of every experiment, before you start doing your iterations, you create an empty dataframe that just has the column names and no rows - google this, you probably just provide a list of strings
- single_iteration(config) returns a list of dictionaries (instead of two dictionaries)
- the list represents the techniques used. So, if you are doing naive, classical, ppi, ppi++, stratPPI, then the list will be of length 5, one dict for each technique used
- each list element is a dict, and the dict has the key/value pairs for the dataframe I just described above
- UNRELATED NOTE: some of the dataframe elements are floats and some are lists of length 1 of floats... check that - make sure we don't have lists in our dataframe, just floats
- so [{'estimate': 2.212, 'technique': 'ppi', 'repeat': 21, 'model': 'random_forest', ...}, {'estimate': 1.99, 'technique': 'naive', 'repeat': 21, 'model': None, ...}, ...] returns from single_iteration()
- then look into the pandas documentation, you can append a new row to a dataframe using the row as a dictionary. This may help: https://hackernoon.com/python-updating-and-appending-pandas-dataframe-using-dictionary
- so you would just iterate over the list and append each dict to your dataframe
- you don't need to save two separate dataframes (mean and "raw"), you can always go get the mean from the raw
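
A rough sketch of the row-per-technique layout described above. The helper names (`as_float`, `make_row`, `collect_rows`, `rows_to_dataframe`) are hypothetical, and two choices are assumptions rather than part of the suggestion: the experiment loop stamps `repeat` onto each row, and `empirical_coverage` is stored per row as 1.0 or 0.0 (CI covers the true parameter or not), so that its mean over repeats gives the coverage rate. Note also that `DataFrame.append` was removed in pandas 2.0, so this sketch collects the dicts in a plain list and builds the dataframe once at the end instead of growing it row by row.

```python
COLUMNS = [
    "estimate", "true_parameter", "ci_low", "ci_high", "ci_width",
    "empirical_coverage", "desired_coverage", "noise",
    "technique", "model", "repeat",
]


def as_float(value):
    """Unwrap a length-1 list/tuple so the table only ever holds floats."""
    if isinstance(value, (list, tuple)):
        (value,) = value  # fails loudly if the list is not length 1
    return float(value)


def make_row(estimate, ci, true_parameter, technique, model, noise, desired_coverage):
    """Build one dict (one future dataframe row) for a single technique."""
    ci_low, ci_high = as_float(ci[0]), as_float(ci[1])
    return {
        "estimate": as_float(estimate),
        "true_parameter": float(true_parameter),
        "ci_low": ci_low,
        "ci_high": ci_high,
        "ci_width": ci_high - ci_low,
        # Assumption: per-row indicator; average it later to get coverage.
        "empirical_coverage": float(ci_low <= true_parameter <= ci_high),
        "desired_coverage": desired_coverage,
        "noise": noise,
        "technique": technique,
        "model": model,
    }


def collect_rows(single_iteration, config, num_its):
    """Run the iterations and gather every technique's row in one flat list."""
    rows = []
    for repeat in range(num_its):
        for row in single_iteration(config):
            row["repeat"] = repeat
            rows.append(row)
    return rows


def rows_to_dataframe(rows):
    # pandas is imported lazily so the helpers above work without it.
    import pandas as pd
    return pd.DataFrame(rows, columns=COLUMNS)
```

Here `single_iteration(config)` would end with something like `return [make_row(...), make_row(...), ...]`, one call per technique.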

So later when we do plots, everything is in the dataframe: we just get all the values for a particular technique and take the appropriate column means
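
As an illustration of getting the means back from the raw rows, here is a small sketch over a list of row dicts, assuming the column names proposed above. The function name `technique_means` is hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def technique_means(rows, columns=("estimate", "ci_width", "empirical_coverage")):
    """Average the chosen columns separately for each technique."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["technique"]].append(row)
    return {
        technique: {col: fmean([r[col] for r in group]) for col in columns}
        for technique, group in grouped.items()
    }
```

With the results in a pandas dataframe, the same summary should come from `dataframe.groupby("technique")[list(columns)].mean()`.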

PLOTS:

- don't worry about putting much, if anything, in the config
- config knows the plot path for this experiment, that is good enough
- from our dataframe that has all of our experimental results, we can call some generic run_plots(dataframe, config) function
- inside run_plots(*) we can call plots for coverage, bias, ci width, etc
- the plots should probably have deterministic titles, and their paths can be found from the config file, so no need for this information to be in the config (let's simplify)
- we know that dataframe has a particular column structure, so we can get all the estimate values for each experiment, do violin plots, get mean and ci's to plot error bars, etc
- make sure all of our plots demonstrate variability, let's never simply plot the mean across the experiments (in a line plot, for example)
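
One possible shape for run_plots(dataframe, config), as a sketch only. It assumes the column layout proposed earlier, a pandas dataframe, and matplotlib; the config key `config["paths"]["plots"]` and the helpers `plot_title` and `plot_path` are hypothetical stand-ins for wherever the config actually stores the plot directory. Titles and file names are derived from the metric name, so nothing plot-specific needs to live in the config.

```python
from pathlib import Path


def plot_title(metric):
    """Deterministic title derived from the metric name."""
    return f"{metric} by technique"


def plot_path(config, metric):
    # Hypothetical key: replace with the real location of the plot directory.
    return Path(config["paths"]["plots"]) / f"{metric}_by_technique.png"


def run_plots(dataframe, config, metrics=("estimate", "ci_width")):
    """Draw one violin plot per metric so the spread over repeats is visible."""
    # Imported lazily so the helpers above work without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render to files, no display needed
    import matplotlib.pyplot as plt

    techniques = sorted(dataframe["technique"].unique())
    for metric in metrics:
        # One array of per-repeat values for each technique.
        # The violin's density estimate needs several distinct values each.
        samples = [
            dataframe.loc[dataframe["technique"] == name, metric].to_numpy()
            for name in techniques
        ]
        fig, ax = plt.subplots()
        ax.violinplot(samples, showmeans=True)
        ax.set_xticks(range(1, len(techniques) + 1))
        ax.set_xticklabels(techniques)
        ax.set_ylabel(metric)
        ax.set_title(plot_title(metric))

        out_path = plot_path(config, metric)
        out_path.parent.mkdir(parents=True, exist_ok=True)
        fig.savefig(out_path)
        plt.close(fig)
```

Coverage is left out of the default metrics because a per-row 0/1 indicator does not make a useful violin; a bar of the coverage rate with an error bar, next to the desired_coverage line, may suit it better.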

Aspiire commented 3 months ago

The old code used two dictionaries because it parsed the primary_metrics dictionary and then created the graphs based off that. This was done so that I didn't have to label each output as to be plotted or not. With the new code, this should no longer be necessary.

Aspiire commented 3 months ago

Also, as a suggestion in this case, should there just be a new plotting.py file every time a new experiment is run? Plots should vary pretty heavily from experiment to experiment anyway, and it shouldn't be too hard to write new plotting functions, so long as we have a base/reference to refer to.

Aspiire commented 3 months ago

New dataframe structure code completed

nicholas-denis / ppi-testing

Suggested code changes #11
