byu-dml / d3m-experimenter

A distributed system for creating, running, and persisting many machine learning experiments.

For a few Datasets, Run All pipelines with Different Random Seeds #62

Closed bjschoenfeld closed 4 years ago

bjschoenfeld commented 5 years ago

Our pipelines' F1 scores may vary depending on the random seed used to initialize each pipeline. If that variance is large, the pipelines' F1 scores may not be significantly better or worse than one another.

We could randomly select a few datasets, run all the pipelines we have on them with 10 or 100 different random seeds, and analyze the resulting score distributions.
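A rough sketch of what that experiment loop could look like, in plain Python. Everything here is an assumption for illustration: `scores_over_seeds` and `fake_run` are hypothetical names, not part of d3m-experimenter, and `fake_run` just returns a made-up number in place of actually fitting and scoring a pipeline.

```python
import random
import statistics
from typing import Callable, Dict, Iterable, List, Tuple


def scores_over_seeds(
    run_fn: Callable[[str, str, int], float],
    pipeline_ids: Iterable[str],
    dataset_ids: Iterable[str],
    seeds: Iterable[int],
) -> Dict[Tuple[str, str], List[float]]:
    """Collect one score per seed for every (pipeline, dataset) pair."""
    seeds = list(seeds)
    dataset_ids = list(dataset_ids)
    results: Dict[Tuple[str, str], List[float]] = {}
    for pipeline_id in pipeline_ids:
        for dataset_id in dataset_ids:
            results[(pipeline_id, dataset_id)] = [
                run_fn(pipeline_id, dataset_id, seed) for seed in seeds
            ]
    return results


def fake_run(pipeline_id: str, dataset_id: str, seed: int) -> float:
    """Hypothetical stand-in for a real pipeline run.

    Returns a made-up score in [0, 1) that depends only on its arguments,
    so repeated calls with the same seed give the same value.
    """
    rng = random.Random(f"{pipeline_id}:{dataset_id}:{seed}")
    return rng.random()


# Placeholder ids; real runs would use actual pipeline and dataset ids.
all_scores = scores_over_seeds(
    fake_run, ["pipeline_a", "pipeline_b"], ["dataset_x"], range(10)
)
for (pipeline_id, dataset_id), scores in all_scores.items():
    print(pipeline_id, dataset_id, statistics.mean(scores), statistics.stdev(scores))
```

Swapping `fake_run` for a function that really runs a pipeline with the given seed would give the per-pair score lists whose distributions we want to compare.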

epeters3 commented 4 years ago

@bjschoenfeld has this been completed? :tada:

bjschoenfeld commented 4 years ago

Here is the MongoDB aggregation pipeline that gets the results.

[
   {
       '$match': {
           'run.phase': 'PRODUCE',
           'status.state': 'SUCCESS'
       }
   }, {
       '$project': {
           'f1': '$run.results.scores',
           'pipeline_id': '$pipeline.id',
           'datasets': '$datasets.id'
       }
   }, {
       '$project': {
           'f1': {
               '$arrayElemAt': [
                   '$f1', 0
               ]
           },
           'pipeline_id': '$pipeline_id',
           'dataset': {
               '$arrayElemAt': [
                   '$datasets', 0
               ]
           }
       }
   }, {
       '$group': {
           '_id': {
               'pipeline_id': '$pipeline_id',
               'dataset_id': '$dataset'
           },
           'mean': {
               '$avg': '$f1.value'
           },
           'std': {
               '$stdDevSamp': '$f1.value'
           },
           'count': {
               '$sum': 1
           },
           'group': {
               '$push': '$f1.value'
           }
       }
   }, {
       '$project': {
           '_id': 0,
           'pipeline_id': '$_id.pipeline_id',
           'dataset_id': '$_id.dataset_id',
           'f1_macro_mean_over_runs': '$mean',
           'f1_macro_std_dev_over_runs': '$std',
           'n_runs': '$count',
           'group': '$group'
       }
   }
]
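For anyone reading the pipeline above without a database handy, here is a rough pure-Python equivalent of the `$group` and final `$project` stages. It assumes the earlier stages have already flattened each run into a record with `pipeline_id`, `dataset_id`, and a numeric `f1` (the first entry of `run.results.scores`, which the pipeline treats as the F1 score). The function name `summarize_runs` and the example records are made up for illustration.

```python
import statistics
from collections import defaultdict
from typing import Any, Dict, List


def summarize_runs(runs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Group flattened run records by (pipeline_id, dataset_id) and summarize F1."""
    groups = defaultdict(list)
    for run in runs:
        groups[(run["pipeline_id"], run["dataset_id"])].append(run["f1"])

    summaries = []
    for (pipeline_id, dataset_id), scores in groups.items():
        summaries.append({
            "pipeline_id": pipeline_id,
            "dataset_id": dataset_id,
            "f1_macro_mean_over_runs": statistics.mean(scores),
            # Sample standard deviation, as $stdDevSamp computes.
            # Use None when there is only one run, since it is undefined there.
            "f1_macro_std_dev_over_runs": (
                statistics.stdev(scores) if len(scores) > 1 else None
            ),
            "n_runs": len(scores),
            "group": scores,
        })
    return summaries


# Made-up records, only to show the shape of the input and output.
example_runs = [
    {"pipeline_id": "p1", "dataset_id": "d1", "f1": 0.5},
    {"pipeline_id": "p1", "dataset_id": "d1", "f1": 0.7},
    {"pipeline_id": "p2", "dataset_id": "d1", "f1": 0.9},
]
summary = summarize_runs(example_runs)
for row in summary:
    print(row)
```

Keeping the raw per-run scores in `group` is what lets us look at the full distribution afterwards rather than just the mean and standard deviation.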