byu-dml / d3m-experimenter

A distributed system for creating, running, and persisting many machine learning experiments.
0 stars 0 forks source link

Add Random Experiment #67

Closed epeters3 closed 4 years ago

epeters3 commented 4 years ago

Closes #51.

When reviewing this PR, I would recommend looking at the changes for each commit one at a time.

codecov-io commented 4 years ago

Codecov Report

Merging #67 into develop will decrease coverage by 4.57%. The diff coverage is 84.61%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop      #67      +/-   ##
===========================================
- Coverage     70.3%   65.72%   -4.58%     
===========================================
  Files           22       27       +5     
  Lines         1266     1701     +435     
===========================================
+ Hits           890     1118     +228     
- Misses         376      583     +207
Impacted Files Coverage Δ
experimenter/experiments/straight.py 70.45% <ø> (ø) :arrow_up:
experimenter/experiments/ensemble.py 79.33% <ø> (ø) :arrow_up:
test/test_ensembling_pipelines_generation.py 96.49% <ø> (-0.53%) :arrow_down:
test/test_pipeline_builder.py 96.96% <100%> (-0.05%) :arrow_down:
experimenter/experiments/experiment.py 80% <100%> (ø) :arrow_up:
experimenter/experiments/stacked.py 75% <100%> (ø) :arrow_up:
experimenter/pipeline_builder.py 98.03% <100%> (+2.84%) :arrow_up:
test/test_random_pipelines.py 100% <100%> (ø)
experimenter/experiments/metafeatures.py 45.45% <100%> (ø) :arrow_up:
test/test_cli.py 100% <100%> (ø)
... and 10 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 4669e93...3aebaec. Read the comment docs.

epeters3 commented 4 years ago

It looks like the above Travis CI Branch build failed because of a test that fails sometimes with low probability, even though the test still works. So I think we can consider the builds passing for this PR.

epeters3 commented 4 years ago

I’m so glad it was readable for you, thank you so much for reviewing it, I’m sorry it was so large.

On Wed, Oct 2, 2019 at 5:01 PM orionw notifications@github.com wrote:

@orionw approved this pull request.

Very readable code! Looks great - as does the command line args. They're starting to get large so I think this is a good way to handle it.

In experimenter/experiments/random.py https://github.com/byu-dml/d3m-experimenter/pull/67#discussion_r330806645 :

  • num_inputs = random.randint(1, max_num_inputs)
  • There are only len(output_collection) possible inputs available.

  • num_inputs = min(len(output_collection), num_inputs)
  • input_refs: Set[str] = set(random.sample(output_collection, num_inputs))
  • Since the steps identified by input_refs are now being used

  • as inputs, we know they're not terminal nodes anymore.

  • terminal_node_data_refs -= input_refs
  • concat_result_ref = self._concatenate_inputs(structure, input_refs, concat_cache)
  • Add the do nothing primitive as a placeholder. self._sample_pipeline will fill

  • in the primitives later.

  • add_pipeline_step(
  • structure,
  • 'd3m.primitives.data_preprocessing.do_nothing.DSBOX',
  • concat_result_ref

I like the no_nothing primitive here!

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/byu-dml/d3m-experimenter/pull/67?email_source=notifications&email_token=AFBLEZB3AJGDFJ476FDIAWLQMUR4PA5CNFSM4I44HEBKYY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCGWVBLI#pullrequestreview-296571053, or mute the thread https://github.com/notifications/unsubscribe-auth/AFBLEZA7AZLZZDH2UJOC7BDQMUR4PANCNFSM4I44HEBA .

bjschoenfeld commented 4 years ago

It looks like the above Travis CI Branch build failed because of a test that fails sometimes with low probability, even though the test still works. So I think we can consider the builds passing for this PR.

Can we make the test deterministic, so that it passes every time?

orionw commented 4 years ago

@bjschoenfeld it was supposed to test randomness - we have other tests that do it without the slight probability of failure. I'm for just taking the test out - the other tests cover it already. This test was supposed to be non-deterministic, back when I made it a long time ago.